Research Guide

This guide is an aid for those performing basic academic research.

Controlled Vocabularies

Use a Controlled Vocabulary

Takes the Guesswork out of Searching
A controlled vocabulary makes a database easier to search. Because there are many different ways to describe the same concept, gathering all of those terms under a single word or phrase makes searching more efficient and eliminates guesswork. This efficiency depends on consistency: the people indexing the database must apply the same pre-determined terms every time.

A controlled vocabulary is a list of standardized terms used by librarians to describe the content of a source. Also known as subject headings, subject terms, thesaurus terms, or descriptors, these are the official indexing terms that catalogers and database indexers use to describe each concept, so that all items on the same topic share the same subject heading or descriptor.
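
If it helps to picture how this works behind the scenes, the short Python sketch below shows the basic idea: several everyday phrases all point to one preferred heading. The term list and the function name `to_subject_heading` are made up for illustration and are not taken from any real thesaurus or database.

```python
# Hypothetical mini-thesaurus: each variant phrase points to one preferred heading.
# These entries are illustrative only and do not come from a real vocabulary.
PREFERRED_TERM = {
    "soil contamination": "Soil Pollution",
    "polluted soil": "Soil Pollution",
    "contaminated land": "Soil Pollution",
    "soil pollution": "Soil Pollution",
}


def to_subject_heading(user_term):
    """Return the preferred heading for a term, or None if the term is not listed."""
    key = " ".join(user_term.lower().split())  # normalize case and extra spaces
    return PREFERRED_TERM.get(key)


for term in ["Soil contamination", "polluted  soil", "dirt"]:
    print(term, "->", to_subject_heading(term))
```

A term that is missing from the list (here, "dirt") returns nothing, which mirrors the experience of guessing a word the indexers did not use.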

The benefit of subject headings in our library databases is that they let you easily identify more sources on similar topics, just by searching on those terms.

Do all databases use controlled vocabularies?

While most academic libraries use subject headings in their catalogs (commonly the controlled vocabulary known as the Library of Congress Subject Headings), not all individual databases do. Those that do typically offer a subject field in the advanced search options. They may also link to their vocabulary list (look for a link to "subjects" or "thesaurus" within the database).

Why use a Controlled Vocabulary?

Searching a database that uses controlled vocabulary or indexing terms is efficient and precise. The biggest advantage of a controlled vocabulary is that once you find the correct term, most of the information you need is grouped together in one place, which saves you from searching under every synonym for that term.

Many academic databases are built around controlled vocabularies. This is because their materials are cataloged and indexed by librarians and professional indexers, who follow a standard so that materials can be searched efficiently.

 

Finding a Balance: Controlled Vocabulary or Free Text

It is difficult to say whether controlled vocabulary or free text systems give the best retrieval performance. Free text (or natural language) systems often return more results in less time because you are searching all the fields of a given database (the Google search engine is a form of free text search). Such searches work well for very specific topics. However, when a topic is older or broader in scope, you will likely retrieve many irrelevant hits. You may also miss relevant records because they use a different term from the one you searched. As with a web search, searching a database requires striking a balance between precision and generating enough hits to make the search successful.

In contrast to “free text” searching, controlled vocabularies organize the information in a database. You can click on a heading to see all the other items that have the same heading. You can also combine the terms in a new search to find items on that topic.

Your search results will be more focused and more relevant, since you will be searching directly in the subject or descriptor field.

Subject Terms:

  • Environment soil
  • Soil Pollution
  • Soil Microbiology
  • Soil erosion
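
To make the difference between the two approaches concrete, here is a minimal Python sketch that runs a free text search and a subject search over the same three records. The records, their subject assignments, and the function names are all invented for illustration; a real database is far more sophisticated.

```python
# Hypothetical records: titles, abstracts, and subject headings are invented.
RECORDS = [
    {
        "title": "Heavy metals in farmland",
        "abstract": "Measures lead and cadmium in topsoil.",
        "subjects": ["Soil Pollution"],
    },
    {
        "title": "Soil pollution policy review",
        "abstract": "Surveys national regulations.",
        "subjects": ["Environmental Policy"],
    },
    {
        "title": "Bacteria of the rhizosphere",
        "abstract": "Unrelated to soil pollution, the study profiles microbes.",
        "subjects": ["Soil Microbiology"],
    },
]


def free_text_search(records, phrase):
    """Match the phrase anywhere in the title, abstract, or subject fields."""
    phrase = phrase.lower()
    hits = []
    for record in records:
        all_text = " ".join([record["title"], record["abstract"], *record["subjects"]])
        if phrase in all_text.lower():
            hits.append(record["title"])
    return hits


def subject_search(records, heading):
    """Match only records whose subject field contains the exact heading."""
    return [record["title"] for record in records if heading in record["subjects"]]


print("Free text:", free_text_search(RECORDS, "soil pollution"))
print("Subject:  ", subject_search(RECORDS, "Soil Pollution"))
```

In this sketch the free text search returns every record that mentions the phrase anywhere, including one that only mentions it in passing, while the subject search returns only the record that an indexer assigned to that heading.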

 

Stop Words
Keep in mind that many online databases ignore certain words, called "stop words." Common stop words include 'the', 'a', 'an', 'this', and 'that'. Although stop words can carry useful meaning in natural language processing, most keyword-based search algorithms do not analyze the grammar of what you type, so these words add little to a search.
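
As a rough illustration, the Python sketch below drops stop words from a search before matching. The stop list contains only the five example words named above, and the function name `remove_stop_words` is hypothetical; each database keeps its own list, which is usually longer.

```python
# Stop list limited to the example words in this guide; real databases
# maintain their own, usually longer, lists.
STOP_WORDS = {"the", "a", "an", "this", "that"}


def remove_stop_words(query):
    """Lowercase the query, split it on spaces, and drop any stop words."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]


print(remove_stop_words("The erosion of a soil"))
```

Words that are not on the list, such as "of" in this example, pass through unchanged, so what gets ignored depends entirely on the list a given database uses.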