
Organization and Retrieval of Information: Training manual

UNIT TWENTY
INTRODUCTION TO INDEXING

OBJECTIVES
At the end of this unit, the trainee should be able to:
Define given terms.
Understand the purposes and uses of indexes.
Describe the indexing process.

20.1 DEFINITION OF TERMS


➢ Index: This term is a Latin word that means "he who, or that which,
points the way". Therefore, an index is a systematic guide (pointer) to the
items contained in a collection or to concepts derived from the
collection. The index entries are arranged in a known or searchable order
(commonly alphabetical arrangement). According to the British
Standards Institution (BSI), an index is "a systematic guide to the location of
words, concepts or other items in books, periodicals or other
publications". An index entry consists of two components, namely a
descriptor and location indicator(s), e.g. page number or bibliographic
citation of the indexed item.
➢ Indexing: This is "an operation intended to represent the results of
analysis of a document by means of an indexing language".
➢ Indexing language (index vocabulary): "The set of descriptors to be
used in indexing the contents of documents in an information storage
and retrieval system".
➢ Descriptor: A term, notation or string of symbols used to indicate
the subject content of a document.
➢ Keyword: Raw words that come from the documents indexed.
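
The two components of an index entry defined above (a descriptor and one or more location indicators) can be pictured with a minimal Python sketch. The class name `IndexEntry`, the function `build_index` and the sample postings are hypothetical and are used only for illustration; the sketch assumes page numbers as locators and a simple case-insensitive alphabetical arrangement.

```python
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    """One index entry: a descriptor plus its location indicators."""
    descriptor: str
    locators: list[str] = field(default_factory=list)  # e.g. page numbers


def build_index(postings):
    """Group (descriptor, locator) pairs into alphabetically ordered entries."""
    grouped = {}
    for descriptor, locator in postings:
        grouped.setdefault(descriptor, []).append(locator)
    # Alphabetical arrangement, ignoring case.
    ordered = sorted(grouped.items(), key=lambda item: item[0].lower())
    return [IndexEntry(descriptor, locators) for descriptor, locators in ordered]


# Hypothetical postings noted while reading a document.
postings = [
    ("Thesauri", "45"),
    ("Abstracts", "12"),
    ("Thesauri", "51"),
    ("Classification", "30"),
]
index = build_index(postings)
for entry in index:
    print(f"{entry.descriptor}, {', '.join(entry.locators)}")
```

Each printed line pairs a descriptor with its locators, which is the shape of a typical back-of-the-book entry.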

© T. I. Okinda 2001



20.2 TYPES OF INDEXES


The following types of indexes can be identified:
(i) Author index: This is an index that lists alphabetically the authors of
documents. The authors in this case could be personal or corporate
authors.
(ii) Title index: It is an index that lists alphabetically the titles of works
with an indication of their bibliographic details and their location, e.g.
Key Word In Context (KWIC) and Key Word Out of Context (KWOC) indexes.
A KWIC index consists of alphabetically arranged keywords highlighted in
the title. A KWOC index consists of alphabetically arranged keywords
repeated out of context of the title.
(iii) Alphabetical subject index: It is an index that lists subject entries
alphabetically.
(iv) Word/concordance index: It is an index that lists alphabetically
descriptors or terms as they are used by the author in the document, e.g.
the index to the Bible, biography index, patent number index, etc. In
word indexing, natural indexing language is used (that is, the words used
by the author are selected exactly the way they are to be index terms).
No cross-references are established in such indexes. Word indexes are
less intellectually involving to prepare and may be computer generated.
(v) Book index: This is an index that appears at the back of the book. It
consists of an alphabetical list of words with page references to where
the subject, concept or name is found in the book. A book index is
usually compiled by one person in order for the indexing to be
consistent. The index is compiled over a short period of time and is
meant for the specific book indexed. Whenever a book is revised, a new
index is prepared.
(vi) Periodical index: This index covers every issue of the periodical.
Therefore, it assists the user to locate information contained in the
periodicals, e.g. Index Medicus, Catholic Periodical and Literature
Index, New York Times Index, Readers' Guide to Periodical Literature,
Index to South African Periodicals, etc. Periodical indexes contain
bibliographic citations of every article indexed and are more detailed.
Periodical indexes are prepared as long as the periodical is still being
published, hence can be said to be prepared over a longer time frame with
different indexers. Periodical indexes can focus on one subject area (such an
index is called special, e.g. Index Medicus focuses on
periodicals in the medical field) or may cover many subject areas (such an
index is called general, e.g. New York Times Index).
(vii) Classified index: This index is arranged logically according to the
notation derived from a classification scheme. The classified index can
take two forms: it can have its index entries arranged using specific class
notations, e.g. Chemical Abstracts, Library and Information Science
Abstracts (LISA), or the index entries may be arranged under broad subject
categories, e.g. Biological Abstracts.
(viii) Coordinate indexes: These indexes are prepared by combining two or
more single index terms to form the index entry, e.g. pre-coordinate and
post-coordinate indexes. Pre-coordinate indexes arrange their
descriptors in one sequence established at the indexing stage, e.g. book
indexes. Post-coordinate indexes consist of simple concepts that are
combined at the searching stage, e.g. computer-searchable indexes.
(ix) Citation index: It is a list of works that have been cited in a given year,
and underneath each cited item are arranged the works that have cited the
original work, e.g. Social Sciences Citation Index (SSCI).
(x) Chain indexes: These are indexes constructed based on the class
notation of an item following the chain indexing procedure.
(xi) Permuted term index: In this index "each significant element in the
title is made a filing word in turn".

20.3 PURPOSES OF AN INDEX
Mulvany (1994: 5-6) states the purposes of an index as follows:
(i) To identify and locate relevant information within documents being
indexed.
(ii) To discriminate between information on a subject and passing
mention of a subject.


(iii) To exclude passing mention of subjects that offer nothing
significant to the user.
(iv) To analyze concepts treated in a work so as to produce a series of
headings based on its terminology.
(v) To indicate relationships between concepts or items.
(vi) To group together information on subjects which is scattered by the
arrangement of the document or library collection.
(vii) To synthesize headings and subheadings.
(viii) To direct users seeking information under terms not chosen for
the index entries to those chosen in the index through cross-
references.
(ix) To arrange entries into a systematic and helpful order.

20.4 FUNCTIONS AND USES OF INDEXES
(i) Indexes guide the users to the concepts or items contained in a
document or library collection respectively.
(ii) They reveal to the user whether a document or library collection
contains information or items on the topic he/she desires to seek
information and information materials on.
(iii) Based on the above point, indexes save the user's time in retrieval of
information.
(iv) The author indexes collocate authors' documents and therefore facilitate
the author approach to retrieval.
(v) Indexes guide users to concepts or items they wish to recall or
do not know exist.
(vi) The cross references in indexes assist users in retrieving other related
materials.
(vii) Indexes show the strengths and weaknesses of a document or
documents or library collection.
(viii) Nomenclature, terminology and spelling are often helpfully provided
by indexes and their introductions.

20.5 INDEXING PROCESS
The following factors should be considered before indexing information
materials in a library:
(i) The amount of indexing to be done. The indexer should
decide on the amount of detail to index. This will be influenced by
the document indexed, the time available for indexing, etc.
(ii) The level of generality and specificity at which concepts are to be
represented. Specificity refers to "the extent to which the system
permits the indexer to be precise when specifying the subject of an
entity". Generality or exhaustivity relates to the number of index
terms used.
(iii) Ensure that consistency will be achieved in indexing.
(iv) Ensure that the indexing is done to meet the users' inquiries.
(v) Think about how to [illegible] the index.
(vi) Decide on the structure of the index entries.

Steps involved in the indexing process
The process of indexing information materials in a library will usually
involve the following steps:

(i) Deciding on the indexable matter
The indexer should be able to determine the users' information needs so as to
decide on what is relevant in the document and also if the document is worth
indexing. Besides this, the indexer decides on which parts of the item to index or
not index.

(ii) Familiarization (technical reading) of the document to be indexed
This involves examining well the various parts of the text in order to gain an
understanding of the subject matter of the document. For a printed book,
the indexable matter will be determined by examining the title, table of
contents, abstract or summary if provided, the preface, the introduction,
opening paragraphs of chapters, the conclusion and the captions of
illustrations and tables.

(iii) Identifying the indexable concepts/analysis of the document
The indexer identifies the above concepts or information from the
document indexed as he/she does familiarization. The location of the concepts
should also be noted. The location can be in terms of page number or
bibliographic details. During analysis, the indexer should verify the relevancy of
terms to include in the index.

(iv) Translating the indexable concepts into the indexing language used, e.g.
consult or use a list of subject headings or thesaurus in order to choose
the accepted index terms.

(v) Recording the index terms/concepts onto a searchable medium, e.g.
5x3 cards or magnetic tapes so as to facilitate filing of entries.

(vi) Arranging the index entries into a logical order, usually alphabetical.

(vii) Editing the index ready for printing.

Editing the index
To prepare an index into a format ready for printing the following steps
should be or should have been undertaken:
• Alphabetization of the main headings in order to correct
any misfiling errors. This can be done manually by checking
through the index entries or by use of a computer in computer-
assisted indexing.
• Alphabetization of subheadings.
• Eliminating synonymous headings or subheadings by adopting
the preferred term(s).
• Elimination of specific headings or subheadings not greatly
covered in the work and which are not likely to be sought.
• Resolution of too many locators for a given concept. Do not
use more than seven locators for a given heading. If there are more,
prefer preparing subheadings for such a heading.
• Check headings for ambiguity.
• Check cross-references to ensure that they connect the right
terms.
• Correct spelling errors.
• Check for consistency in punctuation.
• Check the typefaces and layout of the index entries.
• Edit long index entries by further subdividing them.
• Add any index entries that may have been left out.
• Prepare and write an introduction to assist users of the index.
• Prepare the index into a form appropriate for the printer.
• Final checking of the index to ensure that it is in the proper
format.
• Prepare the printer's instructions.
• Pruning or reducing the first printed index to the desired
length.

20.6 THE INDEXER'S ROLE IN INDEXING
The indexer performs the following tasks in the process of indexing
information materials:
(i) Identifies documents to be indexed.
(ii) Identifies and locates concepts in the document that are worth
indexing.
(iii) Analyses the concepts that have been identified for indexing.
(iv) Translates the indexable concepts into the indexing language used
by the information center.
(v) Indicates relationships between the indexable concepts.
(vi) Arranges the index entries into a logical manner, usually
alphabetical.
(vii) Prepares 'see' and 'see also' references.
(viii) Combines headings and subheadings into coherent entries.
(ix) Collocates indexable concepts that are scattered in the text indexed.
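
Some of the editing checks listed in section 20.5 lend themselves to computer-assisted indexing. The following is a minimal Python sketch, not a complete editing tool: the function `check_index`, its data layout (a dictionary of headings and a dictionary of cross-references) and the sample draft are all assumptions made for illustration. It covers three of the checks only: alphabetization of main headings, the limit of seven locators per heading, and cross-references that connect to existing terms.

```python
MAX_LOCATORS = 7  # limit given in the editing steps: no more than seven per heading


def check_index(entries, cross_refs):
    """Return a list of editing problems found in a draft index.

    entries:    dict mapping heading -> list of locators (in draft order)
    cross_refs: dict mapping non-preferred heading -> preferred heading
    """
    problems = []
    headings = list(entries)

    # Alphabetization of the main headings (case-insensitive).
    if headings != sorted(headings, key=str.lower):
        problems.append("main headings are not in alphabetical order")

    # Too many locators for one heading: subheadings are preferred.
    for heading, locators in entries.items():
        if len(locators) > MAX_LOCATORS:
            problems.append(f"'{heading}' has {len(locators)} locators; add subheadings")

    # Cross-references must connect to a heading that actually exists.
    for source, target in cross_refs.items():
        if target not in entries:
            problems.append(f"'{source}' refers to missing heading '{target}'")

    return problems


# Hypothetical draft index with three deliberate faults.
draft = {
    "Indexing": ["3", "8", "14", "21", "30", "42", "57", "63"],
    "Abstracts": ["12"],
}
refs = {"Footpaths": "Trails"}

problems = check_index(draft, refs)
for problem in problems:
    print(problem)
```

The remaining checks in the list (ambiguity of headings, typefaces, layout) call for the indexer's judgment and are not attempted here.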


QUESTIONS
1. Highlight SIX factors that should be taken into consideration by an
indexer when deciding on essential indexable matter for a printed book.
(12 marks)
2. What items may be considered for inclusion in a newspaper index? (8
marks)
3. Highlight FOUR purposes of indexing in an information center. (8
marks)
4. With respect to retrieval of information, highlight the steps involved in
the indexing process. (8 marks)
5. Explain FOUR indexing mistakes an indexer is likely to make during
the indexing process. (8 marks)

UNIT TWENTY ONE
PRE-COORDINATE AND POST-COORDINATE INDEXING
SYSTEMS

OBJECTIVES
At the end of this unit, the trainee should be able to:
Define given terms.
Identify the advantages and disadvantages of pre- and post-
coordinate indexing.
Search a pre- and post-coordinate index.
Differentiate between pre- and post-coordinate indexing.

21.1 INTRODUCTION
Coordinate indexing involves combining two or more single index terms to
form an index entry. There are two types of coordinate indexing systems,
namely the pre-coordinate indexing system and the post-coordinate indexing
system.

21.2 PRE-COORDINATE INDEXING

Pre-coordinate indexing is an indexing system or operation in which the
coordination/combination of terms is done at the indexing stage in
anticipation of the users' approach. In pre-coordinate indexes all the
constituent elements or concepts of a subject are given at one point. Pre-
coordinate indexing involves the indexer establishing a citation order for
composite concepts or subjects. Therefore, the indexer should establish a
citation order that is helpful to users or one that users are likely to search
concepts or items under.

An article entitled "A thesaurus for periodicals indexing in
Africa" will be indexed under Periodicals-Indexing-Africa. A
user who requires information on this article should search at
Periodicals-Indexing-Africa. A user who searches under
Indexing or Africa will not get the article. However, to improve
access in the index, cross references are prepared to guide users to
the accepted terms. In this case, under the terms Indexing and
Africa a reference would be provided to guide the user to the index
entry Periodicals-Indexing-Africa.

Examples of pre-coordinate indexes
• Classification schemes.
• Printed lists of subject headings, book indexes and printed
thesauri.

Advantages of pre-coordinate indexing
(i) It is easier to search pre-coordinate indexes because terms are already
combined at the time of indexing.
(ii) Indexing and searching are consistent because the index/search terms are
already combined by a qualified indexer.
(iii) It saves time in retrieval as search terms are already combined and also
specific terms are collected and brought together to form compound
subjects.
(iv) It minimizes false drops in retrieval (retrieval of irrelevant items) as
terms are already coordinated.
(v) It is less intellectually involving to the searcher as coordination of terms
is already done by the indexer.
(vi) A pre-coordinate index is likely to contain few entries, thus saving
time in retrieval.

Disadvantages of pre-coordinate indexing
(i) It is intellectually involving to the indexer and time consuming to
prepare. This is because the indexer has to combine concepts for
composite subjects.
(ii) The searcher does not have freedom to combine terms as he/she desires
as they are already combined by the indexer.
(iii) The user must know the citation order used in order to successfully
search the pre-coordinate index. The index has a fixed citation order.
(iv) It may not be exhaustive as a particular sequence of coordinated terms
may not meet the approaches of all users of the index.
(v) It does not facilitate specific searching as compound subjects are already
built in the index by the indexer.
(vi) It is difficult to update a pre-coordinate index as the index entries must
be integrated logically with those already in existence.

21.3 POST-COORDINATE INDEXING
This is an indexing system/operation in which concepts are indexed
independently of each other and desired concepts are combined at the time of
searching. In post-coordinate indexes, Boolean search logic is used to combine
index/search terms.

Examples of post-coordinate indexes
• Computerized indexes.
• Peek-a-boo/optical coincidence cards.
• Edge notched cards.
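
The contrast between the two systems can be pictured with a minimal Python sketch built on the article example from section 21.2. The function names `precoordinate_entry` and `postcoordinate_postings`, the document numbers and the second sample document are hypothetical; the sketch assumes that a pre-coordinate entry is a single heading in a fixed citation order with references from the other terms, and that a post-coordinate index stores each concept separately with the numbers of the documents indexed under it.

```python
def precoordinate_entry(concepts, locator):
    """Pre-coordinate: the indexer fixes the citation order at indexing time."""
    heading = "-".join(concepts)  # e.g. Periodicals-Indexing-Africa
    entries = {heading: [locator]}
    # Cross references from the other terms guide users to the heading.
    see_refs = {concept: heading for concept in concepts[1:]}
    return entries, see_refs


def postcoordinate_postings(documents):
    """Post-coordinate: each concept is recorded independently of the others."""
    postings = {}
    for doc_no, concepts in documents.items():
        for concept in concepts:
            postings.setdefault(concept, set()).add(doc_no)
    return postings


entries, see_refs = precoordinate_entry(["Periodicals", "Indexing", "Africa"], "doc 1")
print(entries)             # {'Periodicals-Indexing-Africa': ['doc 1']}
print(see_refs["Africa"])  # Periodicals-Indexing-Africa

postings = postcoordinate_postings({
    1: ["Periodicals", "Indexing", "Africa"],
    2: ["Indexing", "Books"],
})
# Coordination is left to the searcher, much like holding two
# optical coincidence cards together to see which numbers coincide.
print(postings["Indexing"] & postings["Africa"])  # {1}
```

In the first case a search under Africa only succeeds through the reference; in the second, any term can be combined with any other at search time.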

Boolean search logic
Boolean logic is used to search/coordinate terms in most post-coordinate
indexes. It is used to link or combine terms present in the statement of
search. Examples of search logic are: AND, OR, NOT.

Examples of Boolean logic searches
• If a user wants to search information on documents covering
"Recruitment of staff", then the logical product (conjunctive
search type Boolean logic) AND will be used. The search
formula would then be Recruitment AND Staff.
• If a user wants to search information on documents covering
either "Physics or Chemistry", then the logical sum (additive
search type Boolean logic) OR will be used. The search
formula would then be Physics OR Chemistry.
• If a user wants to search information on documents covering
Physics and Chemistry but not Organic Chemistry,
then the logical difference (subtractive search type Boolean
logic) NOT will be used. The search formula would then be
Physics AND Chemistry NOT Organic Chemistry.

Advantages of post-coordinate indexing
(i) It provides detailed information retrieval for documents of multi-
complex concepts or subjects. The searcher can combine terms as he/she
wishes in order to specify the search.
(ii) It is flexible as the search terms are combined as desired at the time of
searching.
(iii) It is highly suitable for indexing subjects with hard vocabulary, e.g.
physical and biological sciences, picture and film collections.
(iv) It is likely to be exhaustive as terms may be combined by the searchers
as desired. Thus, many combinations are possible, resulting in
exhaustivity.
(v) It speeds up the processes of indexing, filing and searching.
(vi) Updating the index is easier as the concepts will be added without
reference to those already existing.

Disadvantages of post-coordinate indexing
(i) It is suitable only for small collections.
(ii) It does not work well with subjects with soft vocabulary, e.g. social
sciences.
(iii) There is a likelihood of irrelevant documents being retrieved (false
drops) due to illogical coordination of terms by the user.
(iv) A user is likely to scan several entries if he/she does not precisely
coordinate the terms to meet his/her inquiry.
(v) It is intellectually involving to the searcher who has to coordinate the
search terms at the searching stage.
(vi) To effectively utilize a post-coordinate index, basic instructions to the
user are essential.
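
The three Boolean search formulas from section 21.3 correspond to set operations on the documents posted under each term. The sketch below is illustrative only: the postings (index term to set of document numbers) are invented, and the helper `docs` is a hypothetical name. It assumes that AND is intersection, OR is union and NOT is set difference.

```python
# Hypothetical postings: index term -> numbers of the documents indexed under it.
postings = {
    "Recruitment": {1, 2, 5},
    "Staff": {2, 5, 7},
    "Physics": {3, 4, 8},
    "Chemistry": {4, 6, 8},
    "Organic Chemistry": {6, 8},
}


def docs(term):
    """Return the documents posted under a term (empty set if none)."""
    return postings.get(term, set())


# Logical product: Recruitment AND Staff
recruitment_and_staff = docs("Recruitment") & docs("Staff")

# Logical sum: Physics OR Chemistry
physics_or_chemistry = docs("Physics") | docs("Chemistry")

# Logical difference: Physics AND Chemistry NOT Organic Chemistry
physics_chemistry_not_organic = (docs("Physics") & docs("Chemistry")) - docs("Organic Chemistry")

print(sorted(recruitment_and_staff))          # [2, 5]
print(sorted(physics_or_chemistry))           # [3, 4, 6, 8]
print(sorted(physics_chemistry_not_organic))  # [4]
```

A badly chosen combination, for example OR where AND was meant, widens the result and produces the false drops mentioned among the disadvantages.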


21.4 DIFFERENCES BETWEEN PRE- AND POST-COORDINATE
INDEXING

PRE-COORDINATE INDEXING                  POST-COORDINATE INDEXING
1. Intellectually involving to the       1. Intellectually involving to the
   indexer                                  searcher
2. Terms combined at the indexing        2. Terms combined at the searching
   stage                                    stage
3. Has a fixed citation order            3. Has a flexible citation order
4. Not good for specific searching of    4. Good for specific searching of
   composite subjects                       composite subjects
5. Low occurrence of false drops         5. High occurrence of false drops
6. Likely to achieve consistent          6. May not achieve consistent
   indexing and searching                   indexing and searching
7. Not easy to update                    7. Easy to update
8. Easy to use                           8. Slightly difficult to search
9. Good for subjects with soft           9. Good for subjects with hard
   vocabulary                               vocabulary

QUESTIONS
1. Explain FOUR uses of Boolean search logic in post-coordinate
indexing. (8 marks)
2. Highlight SIX features of a computerized post-coordinate retrieval
system. (12 marks)
3. Explain SIX disadvantages of post-coordinate indexing with respect to
retrieval of information. (12 marks)
4. Highlight SIX features of pre-coordinate indexing systems. (12 marks)
5. Distinguish between pre- and post-coordinate indexing. (12 marks)

UNIT TWENTY TWO
INDEXING LANGUAGES

OBJECTIVES
At the end of this unit, the trainee should be able to:
Define given terms.
Describe the features of indexing languages.
Identify the advantages and disadvantages of natural and artificial
indexing languages.
Distinguish between natural and artificial indexing languages.

22.1 INTRODUCTION
The three major steps involved in indexing are: familiarization, analysis
and translation. Translation involves the use of indexing languages. The
term indexing or documentary language refers to "the conventional
language used by the information unit to describe the contents of documents
with a view to the storage and retrieval of information". The indexing
language is used for the intellectual processing of documents. An indexing
language consists of:
• Words used to describe the subject contents of documents or
which can be used in searching in an information storage and
retrieval system.
• Rules for the application or use of the indexing language.

22.2 CRITERIA FOR DISTINGUISHING INDEXING LANGUAGES
Indexing languages are distinguished by several criteria, namely:
(i) The ordering or construction principle: An indexing language
may have a predetermined order (e.g. a classification scheme), may
be arranged according to frequency and use of descriptors (e.g. lists
of subject headings) or may be an authority list of terms based on one or
more points of view (e.g. faceted languages).


(ii) The size of the subject field covered: The indexing language may
have a general subject coverage (e.g. encyclopedic languages), may
cover a specific subject field or given aspects of a subject field (e.g.
special classification schemes) or may cover a small part of a
subject field (e.g. a mini thesaurus).
(iii) The types of words used: The words may be single terms (e.g.
uniterm indexing languages), single and compound words (e.g.
lists of subject headings) or direct and inverted forms of headings (e.g.
lists of subject headings).
(iv) Type of arrangement: This may be systematic (e.g. classification
schemes) or alphabetical (e.g. lists of subject headings).

22.3 COMPONENTS OF AN INDEXING LANGUAGE
An indexing language consists of the following components:
(i) Descriptors/preferred or accepted terms: These are words used
to describe information or concepts. These terms are accepted for
use in the indexing system.
(ii) Non-preferred terms/non-descriptors: These are terms
contained in the indexing language but are not accepted for use.
They are linked to the descriptors through cross-references.
(iii) Syndetic devices: These are devices used to indicate relations
between descriptors.
(iv) Relations between descriptors: In this case the descriptors are
logically grouped into sets (main classes) and subsets (subclasses).
(v) Notations: The notations may be numeric, alphanumeric or
symbolic. These notations are used to identify descriptors.
(vi) General or scope notes: They state for given descriptors the
context in which the descriptor should be used.
(vii) Graphic displays showing descriptors and their relations.

22.4 THE LAYOUT OF AN INDEXING LANGUAGE
Indexing languages can be in printed or machine format, and will usually
include:
(i) An introduction that provides guidance as to the contents,
organization and use of the indexing language.
(ii) Lists of descriptors alphabetically arranged or otherwise logically
arranged.
(iii) Graphic displays showing the relations between terms.

22.5 TYPES OF INDEXING LANGUAGES
There are three main types of indexing languages, namely:
• Natural indexing language
• Artificial indexing language and
• Free text (hybrid) indexing language.

22.5.1 NATURAL INDEXING LANGUAGE
Natural/spoken/uncontrolled vocabulary indexing language is an indexing
language in which all index terms are taken from the document and are then
entered alphabetically. When using this indexing language we take words as
they appear in the document and as expressed by the author. This language
is used to prepare book indexes, title indexes, concordances, etc.

Advantages of natural indexing language
(i) It is less intellectually involving to the indexer as terms are directly
selected from the documents indexed.
(ii) There is no wrong assignment of index terms as misinterpretation of the
author's words and meaning is avoided by adopting the terms the way
they appear in the document indexed.
(iii) It has low input costs as no vocabulary control tools are used.
(iv) It facilitates easy exchange of information between databases as
language incompatibility is removed.


(v) Faster speed of searching is possible as the searcher is not required to
look under preferred terms. This is because it does not apply cross
references.
(vi) It is exhaustive as many index terms may be selected to describe a
document's contents. This helps improve recall ratio. Recall ratio is
"the ratio between the number of relevant works retrieved in a literature
search and the number contained in the bibliographic sources used in
the search."
(vii) It is up to date as the terms are selected from the document at hand as
they appear there. This enhances up-to-date indexing and searching.
(viii) It is likely to employ terms commonly used by searchers.
(ix) It helps achieve specificity and provides the possibility for retrieving
specific terms, e.g. names of persons, corporate bodies, etc.

Disadvantages of natural indexing languages
(i) Due to its exhaustivity, low precision ratio is likely to occur. Precision
ratio shows the relationship between relevant items retrieved and the total
number of items retrieved in a literature search.
(ii) There is a problem in choosing terms with many synonyms, e.g.
footpaths, walkways, paths, trails, routes.
(iii) It is intellectually involving to the searcher as he/she should know terms
used by the author.
(iv) False drops are likely to be experienced as homographs are not
qualified. Homographs are words with the same spelling but different
meanings, e.g. mouse, pitch, etc.
(v) There is a problem of correct spellings and abbreviations, e.g. use of
American and British English, e.g. color and colour, catalogue and
catalog, and abbreviations such as IT and I.T.

22.5.2 ARTIFICIAL INDEXING LANGUAGE
Artificial/controlled vocabulary indexing language is an indexing language in
which the indexer and searcher select from a limited set of terms
(vocabulary) to assign index or search terms.

Examples of artificial indexing language tools
• Thesauri.
• Classification schemes.
• Lists of subject headings, etc.

Vocabulary control in an indexing language
Vocabulary control in artificial indexing language is exercised by using
syndetic devices as described here.
(i) Control of synonyms or variant word forms is exercised
through the SEE reference or USE instructions.

Examples
• Footpaths SEE Trails
• Labor USE

In the above examples, the terms that follow SEE or USE (underlined in
the printed manual) are descriptors; SEE and USE (in bold) are syndetic
devices or cross references, while the terms that precede them (not
underlined) are non-descriptors. This means that the underlined
words are the ones accepted for use as index terms or search terms.

(ii) Homographs are controlled by use of qualifiers enclosed in
parentheses.

Examples
• Pitch (Bitumen)
• Pitch (Football)
• Pitch (Music)
• Pitch (Slope)

The purpose of qualifying homographs is to show the context in
which they are applied.


(iii) For related terms scattered in the index, SEE ALSO reference
or Broader Term (BT), Narrower Term (NT) and Related Term
(RT) are used.

Examples
• Disasters SEE ALSO Accidents, Earthquakes,
Fires
• Disabled
NT: Blinds
• Blinds
BT: Disabled

The BT, NT and RT are syndetic devices that are used to enhance
searching in the index terms.

(iv) For all rejected terms, Used For (UF) instruction or SEE
reference is used.
Example
Frogs
UF: Rana

This means that the accepted index or search term is Frogs.

Advantages of artificial indexing language
(i) It promotes consistency in indexing and searching as terms are chosen
from a controlled vocabulary.
(ii) It eases the burden of searching under different terms as the index terms
are controlled.
(iii) Through the use of syndetic devices, it displays related terms, therefore
promoting recall.
(iv) It overcomes false drops by qualifying homographs.
(v) It can help in searching multi-lingual indexing systems as preferred
terms in each indexing language will be provided.
(vi) It enables retrieval of documents whose topics are not represented by
terms in the text but are implied, e.g. a search on interlending will
retrieve items on document delivery or interlibrary loan.

Disadvantages of artificial indexing language
(i) It lacks specificity, for the controlled terms may not specifically
describe a given document.
(ii) It is intellectually involving to the indexer who has to translate the
author's words into the controlled vocabulary.
(iii) Because of point (ii) above, wrong assignment of index terms due to
misinterpretation may arise.
(iv) The controlled vocabulary tools are usually not immediately up to date.
(v) The searcher must learn different terms (controlled vocabulary) that may
not be familiar to him/her.
(vi) If different vocabulary control lists are used, it hampers exchange of
information amongst information centers due to their incompatibility.
(vii) It lacks exhaustivity as some terms may be omitted by the controlled
vocabulary.

22.5.3 FREE TEXT INDEXING LANGUAGE
Free text/hybrid indexing language is one in which the indexer assigns index
terms that best suit the subject matter of the item indexed either from
natural or artificial indexing languages. This language is applied in free text
searching of full text computer databases.
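
The effect of synonym control on retrieval can be pictured with a minimal Python sketch that uses the recall and precision ratios defined in section 22.5.1. Everything in it is assumed for illustration: the four-document collection, the set of relevant documents, the SEE references and the function names. It shows only the synonym problem (footpaths, walkways, trails) and makes no claim about the overall performance of either kind of language.

```python
# Hypothetical collection: document number -> words the author actually used.
documents = {
    1: {"footpaths", "maintenance"},
    2: {"trails", "erosion"},
    3: {"walkways", "design"},
    4: {"pitch", "football"},
}
relevant = {1, 2, 3}  # assumed: the documents that really deal with trails

# SEE references: non-descriptor -> descriptor (the accepted term).
see_refs = {"footpaths": "trails", "walkways": "trails", "paths": "trails"}


def natural_search(word):
    """Natural language: only the author's own words are matched."""
    return {number for number, words in documents.items() if word in words}


def controlled_search(word):
    """Controlled vocabulary: indexing and searching use the preferred term."""
    preferred = see_refs.get(word, word)
    return {
        number
        for number, words in documents.items()
        if any(see_refs.get(w, w) == preferred for w in words)
    }


def recall(retrieved):
    """Relevant items retrieved / relevant items in the collection."""
    return len(retrieved & relevant) / len(relevant)


def precision(retrieved):
    """Relevant items retrieved / total items retrieved."""
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


for label, search in (("natural", natural_search), ("controlled", controlled_search)):
    found = search("footpaths")
    print(label, sorted(found), round(recall(found), 2), round(precision(found), 2))
```

With the natural language search the user would have to repeat the search under every synonym to reach the same recall, which is the burden the SEE reference removes.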

22.6 COMPARISON BETWEEN NATURAL AND ARTIFICIAL
INDEXING LANGUAGES

1. Assignment of terms
   Natural indexing language: Done by the computer
   Artificial indexing language: Done by the indexer
2. Level of false drops
   Natural indexing language: High
   Artificial indexing language: Low
3. Source of terms
   Natural indexing language: The document indexed
   Artificial indexing language: Controlled vocabulary
4. Level of consistency in indexing
   Natural indexing language: Low
   Artificial indexing language: High
5. Syndetic devices
   Natural indexing language: Not used
   Artificial indexing language: Used
6. Searching synonyms
   Natural indexing language: There is need to type all synonyms and
   variant word forms during the search
   Artificial indexing language: There is no need for synonyms and
   variant word forms to be typed in the search
7. Level of retrieval precision
   Natural indexing language: Low
   Artificial indexing language: High
8. Intellectual effort
   Natural indexing language: Placed on searcher
   Artificial indexing language: Placed on indexer
9. Application
   Natural indexing language: Full text databases
   Artificial indexing language: Traditional bibliographic databases and
   catalogues
10. Number of search terms under which searching is done
   Natural indexing language: Many
   Artificial indexing language: Few/limited
11. Up-to-datedness
   Natural indexing language: Up-to-date
   Artificial indexing language: Not up-to-date

22.7 QUALITIES OF A GOOD INDEXING LANGUAGE
A good indexing language should possess the following qualities:
(i) It must be able to represent the document within the user's interest.
(ii) It must be able to keep up to date.
(iii) It must be clearly and adequately described to ensure consistency in
its application.
(iv) It must provide the degree of detail, accuracy and
comprehensiveness needed.

22.8 FACTORS CONSIDERED IN THE CHOICE OF AN INDEXING
LANGUAGE
In the choice of an indexing language, an information center should consider the
following factors:
(i) The users of the information unit in terms of their information needs
and inquiries.
(ii) The information unit's retrieval tools and information services
offered.
(iii) The staff available in terms of their number, level of training and
experience.
(iv) Financial resources available.
(v) The kind and quality of information or documents to be processed.
(vi) The language used by other information units of the same kind.

QUESTIONS
1. Highlight SIX qualities of a good indexing language. (12 marks)
2. Highlight SIX differences between natural and artificial indexing
language. (12 marks)
3. Explain FOUR uses of vocabulary control in an indexing system. (8
marks)
4. Explain SIX advantages of free text searching. (12 marks)
5. Under what FIVE circumstances is natural indexing language suitable?
(10 marks)


UNIT TWENTY THREE
COMPUTER APPLICATIONS IN INDEXING

OBJECTIVES
At the end of this unit, the trainee should be able to:
Define given terms.
Establish the uses of computers in indexing.
Describe computer-aided and computerized indexing.
Describe keyword indexing.
Explain the advantages and disadvantages of keyword indexing.
Differentiate between computer-aided and computerized indexing.

23.1 COMPUTER-ASSISTED (AIDED) INDEXING
Computers can be used to produce indexes in a variety of physical forms, e.g. computer typeset, online print-out, Computer Output on Microfilm (COM), etc. In computer-assisted indexing, the indexer works through a typescript, a printed proof or an already completed book selecting index terms. These terms are then input directly into the computer. The computer automatically carries out the necessary manipulation, merging and sorting to produce the final index. The computer software available for computer-assisted indexing includes MACREX, INDEX AID, etc.

Computers assist in the indexing process in the following ways:
(i) Alphabetizing the main headings.
(ii) Alphabetizing the subheadings.
(iii) Correcting misspelt terms and typographical errors (editing).
(iv) Preparing the index for the printer by indicating typefaces, special symbols and indention to the index.
(v) Producing the final index in such forms as printed book index, COM, etc.
(vi) Storing the index entries.

23.2 COMPUTERIZED (AUTOMATIC) INDEXING
This type of indexing involves the use of a computer to derive suitable indexing terms from a text or document in a machine-readable form. The computer is fed with either the entire text or at least the title and author's abstract. The computer automatically indexes by identifying the index terms from the machine-readable text, noting their location in the text and formatting the index entries ready for printing or use. The selection of terms is done against a Stop list and a Go-list/Thesaurus. The Stop list consists of words that do not necessarily reflect the subject matter of items (e.g. articles, prepositions, conjunctions, etc.) while a Go-list includes all terms that would be useful as index terms in the subject area being indexed.

Identification of index terms by a computer
The computer selects index terms from the machine-readable document in two ways:
(i) Statistical approach/word-frequency approach (statistical analysis)
The computer counts the number of times a word appears in the document. If the word appears more than a specified number of times, or within a given range, then it is selected as the index term and its location recorded. The counting of words is done against a stop list. This approach is based on the hypothesis that "the more times a word is used in a document, the more likely it is to indicate its subject matter".
(ii) Linguistic approach/semantic relationships between words
The computer discerns the meaning of the word as used by the author in the document and the subject covered. The index terms are selected based on their meaning rather than the number of times they appear in the document. The computer relies on expert/knowledge-based systems to discern the meaning of words.
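The statistical (word-frequency) approach described in this unit can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the stop list, the threshold `min_count` and the sample text are invented for the example and do not come from any particular indexing package.

```python
import re
from collections import Counter

# Hypothetical stop list; a working system would use a much longer one.
STOP_LIST = {"a", "an", "and", "are", "by", "for", "in", "is", "of", "on", "the", "to"}

def select_index_terms(text, min_count=2, stop_list=STOP_LIST):
    """Return {term: [word positions]} for words used at least min_count times."""
    words = re.findall(r"[a-z]+", text.lower())
    # Count every word that is not on the stop list.
    counts = Counter(w for w in words if w not in stop_list)
    selected = {}
    for position, word in enumerate(words):
        # A word becomes an index term once it reaches the threshold;
        # its locations in the text are recorded alongside it.
        if word in counts and counts[word] >= min_count:
            selected.setdefault(word, []).append(position)
    return selected

sample = ("Diamonds are hard. The surface energy of diamonds explains "
          "why diamonds cut glass and glass breaks.")
terms = select_index_terms(sample)
print(terms)
```

Raising `min_count` makes the index more selective, while a longer stop list keeps common words from being counted at all.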


23.3 TITLE (KEYWORD) INDEXING
Title indexing is a form of derived indexing in which the words for indexing are extracted from the title of the work. There are various computer programs used in automatic indexing based on titles of documents. There are two main types of automatic/title indexing techniques:
• Keyword In Context (KWIC) indexing.
• Keyword Out of Context (KWOC) indexing.

KWIC Indexing
The technique was originally initiated by Andrea Crestadoro as early as 1864 under the name KWIT (Keyword in Title). KWIC itself was initiated by an IBM engineer, Hans Peter Luhn, in 1958. KWIC indexing is based on the assumption that titles in scientific and technical works use words that indicate the subject matter or important concepts dealt with in the works. In KWIC indexes, the lead terms are picked from the title of documents automatically using a computer. Once the index terms are generated, the KWIC software formats and sorts the index entries and outputs them on a suitable medium (commonly COM). Each keyword appearing in the title becomes an entry point and is highlighted in some way.

Example of KWIC index

                 Imperfections in DIAMONDS
                Surface energy of DIAMONDS and other stones
                  Selling genuine DIAMONDS, textbook on

Example of KWIC index

ASITES/NEW CONCEPTS IN FOODBOURNE BACTERIA VIRUS FUN          105266
      ESSED CONDUCTION HEATED FOODS/INST COMPUTER DETERMINATION 0   105546
      ID AND SUCCINIC-ACID IN FOODS BY INST GAS CHROMATOGRAPHY BEE  102081
      ERMINATION OF DULCIN IN FOODS INST DIALYSIS/STUDIES ON DETE   102210
      ON THE FLAVOURS USED IN FOODS TYOPIIAGUS-DIMIDIATIS YEAST IN  106332
      ERMINATION OF DULCIN IN FOODS 2 METHOD OF DETERMINATION OF D  102212
       FROM INVESTIGATIONS ON FOODSTUFFS AND ARTICLES OF CONSUMPTION 102080
         RECEPTORS IN THE CATS FOOT/THE NATURE AND LOCATION OF CER  103193

Source: Rowley, Jennifer (1988): Abstracting and indexing. - 2nd ed. - London: Bingley. P. 92.

KWOC INDEXING
The same procedure of indexing KWIC indexes is adopted in KWOC indexes, only that the lead term is on the left hand margin.

Examples of KWOC index

DIAMONDS    Imperfections in *
DIAMONDS    Surface energy of * and other stones
DIAMONDS    Selling genuine *, textbook on

Examples of KWOC index

ADULTS
    A DESCRIPTIVE STUDY OF BLACK AMERICAN * [...] ACHIEVEMENT FROM AN [...] FRAMEWORK 0971161
    A STUDY OF THE VISUAL LANGUAGE PROCESSING ABILITIES OF BRAIN INJURED * [...] VERSUS [...] SUBJECTS
    VISUAL SEQUENTIAL RECALL OF ASSOCIATIVE AND NON-ASSOCIATIVE STIMULI IN [...] BRAIN-DAMAGED AND NORMAL *
ADVANCED
    THE * EVOLUTION OF GLOBULAR STARS 045879
ADVANTAGE
    THE RIGHT EAR * FOR THE PROCESSING OF LINGUISTIC STIMULI

Source: Rowley, Jennifer (1988): Abstracting and indexing. - 2nd ed. - London: Bingley. Pp. 92, 95.

23.4 FEATURES OF KWIC AND KWOC INDEXES
(i) The indexes are based upon keywords in titles of items indexed. Therefore, they use the natural indexing language. The keywords are selected by a computer against a pre-selected stop list or stop word list. Index entries are prepared in respect of the selected keywords.
(ii) The index entries are alphabetically arranged by keywords.
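The way KWIC and KWOC entries are derived from titles can be sketched in Python as shown below. This is a minimal sketch and not the program used to produce any published index: the stop list, the column width and the function names (`keywords`, `kwic_entries`, `kwoc_entries`) are assumptions made for the illustration.

```python
# Hypothetical stop list of words that should not become entry points.
STOP_LIST = {"a", "an", "and", "in", "of", "on", "other", "the"}

def keywords(title, stop_list=STOP_LIST):
    """Yield (position, word) for each title word that is not on the stop list."""
    for i, word in enumerate(title.split()):
        if word.strip(",.").lower() not in stop_list:
            yield i, word

def kwic_entries(titles, width=30):
    """KWIC: the keyword stays in its context, aligned in a centre column."""
    entries = []
    for title in titles:
        words = title.split()
        for i, word in keywords(title):
            key = word.strip(",.").upper()
            left = " ".join(words[:i])[-width:]       # long contexts are truncated
            right = " ".join(words[i + 1:])[:width]
            entries.append((key, f"{left:>{width}} {key} {right}"))
    # Entries are arranged alphabetically by keyword.
    return [line for _, line in sorted(entries, key=lambda e: e[0])]

def kwoc_entries(titles):
    """KWOC: the keyword moves to the left margin; '*' marks its place in the title."""
    entries = []
    for title in titles:
        words = title.split()
        for i, word in keywords(title):
            key = word.strip(",.").upper()
            entries.append((key, " ".join(words[:i] + ["*"] + words[i + 1:])))
    return sorted(entries)

titles = ["Imperfections in diamonds",
          "Surface energy of diamonds and other stones"]
kwic = kwic_entries(titles)
kwoc = kwoc_entries(titles)
for line in kwic:
    print(line)
for key, rest in kwoc:
    print(f"{key:<15}{rest}")
```

Truncating the context to a fixed width keeps the keyword column aligned, which is also why long titles lose part of their context in a KWIC index.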


Advantages of keyword indexing
(i) It is less intellectually involving as it can be done by a computer and even if manually done the index terms are picked as they appear in the title.
(ii) Because of the above point, it is less time consuming in indexing.
(iii) It is suitable for subjects with hard vocabulary, i.e. scientific and technical subjects.
(iv) It facilitates title approach to retrieval, which is commonly used by users.
(v) The absence of interpretation of words in the text enhances consistent indexing.
(vi) The use of natural indexing language implies that the index uses up-to-date terminology.
(vii) The problem of uncontrolled vocabulary inherent in highly specialized subject fields is reduced.

Disadvantages of keyword indexing
(i) It is likely not to collocate related items.
(ii) Incomplete context of titles results as long titles are truncated.
(iii) The indexes cannot accommodate titles of different languages unless translated.
(iv) For exhaustive approach to retrieval another index should be consulted, as not all bibliographic details relating to an item are included in the keyword index.
(v) Some titles are not indicative, thus they will not reflect the subject matter of the item.
(vi) The KWIC indexes that exist are unattractive and tedious to scan due to their physical form and typeface (usually such indexes are on computer continuous sheet in small typefaces).

23.5 DIFFERENCES BETWEEN COMPUTER-AIDED AND COMPUTERIZED INDEXING

VARIABLE                        COMPUTER-ASSISTED INDEXING          COMPUTERIZED INDEXING
1. Source of index terms        Derived from a printed format       Derived from machine-readable
                                document                            document (document already in
                                                                    the computer)
2. Choice of index terms        Chosen by a human indexer           Chosen by the computer
3. Consistency in indexing      Slightly low                        Automatically high
4. Indexing languages used      Natural, artificial or free text    Mainly natural indexing
                                indexing languages                  language
5. Key role of the computer     Assists in preparing the index      Used for choosing index terms
                                once the index terms are chosen     and preparing the index
                                by the indexer
6. Speed of indexing            Low                                 High


REVIEW QUESTIONS
1. Explain SIX advantages of computerized over manual indexing. (12 marks)
2. The reliance of automatic indexing on titles alone poses dangers to the index. Explain SIX ways in which keyword indexes may be enhanced to facilitate the choice of index terms. (12 marks)
3. Explain SIX advantages of using titles directly for indexing. (12 marks)
4. Highlight SIX ways in which a computer can assist in indexing a given text. (12 marks)
5. Describe the structure of KWIC and KWOC indexes. (12 marks)

UNIT TWENTY FOUR
CITATION INDEXING

OBJECTIVES
At the end of this unit, the trainee should be able to:
Describe citation indexing.
Describe the structure of a printed citation index.
Explain the advantages and disadvantages of citation indexes.
Use a printed citation index.

24.1 INTRODUCTION
Citation indexing is the process adopted in compiling a citation index. A citation index is a list of works that have been cited in a given year, and underneath each cited item are arranged the works that have cited that original work.

Examples of citation indexes
• Science Citation Index.
• Social Sciences Citation Index.
• Arts and Humanities Citation Index.
• Shepard's Citation Index.
• Genetics Citation Index.
• Cumulative Index to the Journal of the American Statistical Association.

Citation indexing is based on the following assumptions:
(i) If two documents cite the same work, then they are related to each other.
(ii) A document giving citation of a previously published document indicates subject relationships between the current and old document.


A citation index consists of three parts, namely:
(i) The citation index: This is the list that gives the names of authors whose works have been cited by other authors. In this part every published article appears alphabetically according to its author, followed by the titles of the articles.
(ii) The source index: It is the list that gives the citing authors and the works in which these citations have occurred. In this part, the bibliographic details of every item quoted in the citation index are given.
(iii) The permuted subject index: It is the part in which articles are indexed according to significant words in their titles.

Example to illustrate a citation index

AUTHOR'S NAME          TITLE                               VOL.   PAGES   DATE
SMITHE, A. F.          Journal of Experimental Genetics    49     1000    1964
  Raxton, J. M.        Experimental Research               21     51      1970
  Clementine, J. M.    Genetics Review                     41     600     1909
  Dairy, D. J.         Journal of Science                  51     71      1968

In the above example, the cited article is by SMITHE, A. F., whose work appeared in "Journal of Experimental Genetics", vol. 49, 1964, starting from page 1000. This article was quoted by Raxton, J. M. in "Experimental Research", vol. 21, in a work beginning from page 51. The other two authors also quoted this work.

24.2 STEPS INVOLVED IN CITATION INDEXING
Citation indexing involves the following steps:
(i) Decide on the coverage of the index in terms of the subjects, period, documents and citations to be covered in the index.
(ii) Scan the source documents in order to identify the relevant articles or set of articles. The articles identified are checked, and the citations included in them are also checked. If the citations are incomplete, then the original cited works should be consulted for further information.
(iii) Prepare the index entries on standard catalogue cards or paper slips for ease of filing. Each entry consists of the item(s) cited and those citing the document. To this entry, the nature of the document (whether abstract, review, book, etc.) is added.
(iv) Sort the index entries according to cited documents to yield a bunch of entries for each cited work.
(v) Consolidate the index entries by arranging details of the citing work alphabetically by author and placing them alongside details of the cited documents.
(vi) Arrange the consolidated entries alphabetically by the authors of the cited works.
(vii) If the index is to be printed, then typeset or affix the necessary instructions for the printer.
(viii) To facilitate use and users' convenience, an index of citing or source documents and an index of topics covered by the source or citing documents may be compiled.
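Steps (iv) to (vi) above amount to grouping, sorting and arranging entries, which can be sketched in Python as follows. This is an illustrative sketch only: the function name `compile_citation_index` is an assumption, the SMITHE entries follow the example given in this unit, and the ADAMS entry is invented purely to show the final alphabetical arrangement.

```python
from collections import defaultdict

# Each slip pairs a cited work with one work that cites it (step iii).
slips = [
    ("SMITHE, A. F., Journal of Experimental Genetics 49", "Raxton, J. M., Experimental Research 21"),
    ("SMITHE, A. F., Journal of Experimental Genetics 49", "Dairy, D. J., Journal of Science 51"),
    ("ADAMS, B., Hypothetical Journal 1", "Raxton, J. M., Experimental Research 21"),  # invented entry
    ("SMITHE, A. F., Journal of Experimental Genetics 49", "Clementine, J. M., Genetics Review 41"),
]

def compile_citation_index(slips):
    """Return [(cited work, [citing works])] arranged as in steps (iv) to (vi)."""
    bunches = defaultdict(list)
    for cited, citing in slips:
        bunches[cited].append(citing)            # step (iv): one bunch per cited work
    return [(cited, sorted(bunches[cited]))      # step (v): citing works by author
            for cited in sorted(bunches)]        # step (vi): cited works by author

index = compile_citation_index(slips)
for cited, citing_works in index:
    print(cited)
    for work in citing_works:
        print("    " + work)
```

Because each entry begins with the author's surname, a plain alphabetical sort of the strings is enough to arrange both the cited and the citing works by author.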


24.3 ADVANTAGES AND DISADVANTAGES OF CITATION INDEXING

Advantages of citation indexes and indexing
(i) There is no intellectual effort in indexing as no subject terms are assigned.
(ii) Because of the above point, a citation index does not face the problems inherent in the use of controlled vocabulary as in subject indexes.
(iii) It has extensive coverage, as the inclusion of a key journal will result in other journals being retrieved via the citing ones.
(iv) Citation indexes cover different works (in terms of language, subjects and period) as any document citing a given work is included.
(v) Searching in these indexes is precise and direct.
(vi) There are no limits on length of bibliographies. Therefore, there is no limit on the depth or exhaustivity of indexing.
(vii) Citations can be used across documents irrespective of the language of the source or cited document.
(viii) Citation indexes enable users to find core publications, especially periodicals in a subject.
(ix) It reveals the obsoleteness of published works by showing the year the citing author cited the work.

Disadvantages of citation indexes and indexing
(i) The indexes do not state the relationship between the citing and cited documents.
(ii) Not all relevant documents for a given subject/work are cited. If this is the case, then the index suffers from lack of comprehensiveness.
(iii) It retrieves only related documents but does not provide their location and contents.
(iv) The user must know at least one reference of a document in order to use the index.
(v) It is not as logical as other subject indexes because it does not collocate related subject entries.
(vi) Inconsistencies and inaccuracies in citation practices lead to a limitation when preparing the index.

REVIEW QUESTIONS
1. With respect to information retrieval, explain FIVE advantages of citation indexes. (10 marks)
2. Describe the procedure for using a printed citation index. (12 marks)


UNIT TWENTY FIVE
THESAURUS AND THESAURUS CONSTRUCTION

OBJECTIVES
At the end of this unit, the trainee should be able to:
Define given terms.
Explain the purposes of a thesaurus.
Describe the structure of a printed thesaurus.
Describe the procedure for constructing a printed thesaurus.
Construct a simple printed thesaurus.
Explain the uses of a thesaurus.
Compare and contrast a thesaurus and a list of subject headings.

25.1 DEFINITION OF THESAURUS
Various definitions have been advanced for the term "thesaurus":
➢ A thesaurus is a controlled and dynamic indexing language containing semantically and generically related terms, which comprehensively covers a specific domain of knowledge.
➢ A thesaurus is a limited vocabulary of terms in alphabetical order that can be used in indexing and searching. It provides control over synonyms and homographs and brings related terms [...]
➢ [...] vocabulary in which synonymous, [...] homographic relationships among [...] by standardized [...]

Based on the definitions above, a thesaurus is an indexing language [...] indexing of items and which facilitates searching by linking entry terms with descriptors. Therefore, a thesaurus indicates:
(i) Which terms to use in retrieval of information.
(ii) Which terms are not supposed to be used in indexing and information retrieval, and refers the user to the terms accepted for use.
(iii) Additional terms that may be used in information retrieval.
(iv) Relationships between terms.

Examples of thesauri
• POPLINE Thesaurus for use for POPLINE Database.
• MACRO-THESAURUS for use with the United Nations Database.
• Thesaurus on Youth.
• UNESCO Education Thesaurus.
• INSPEC Thesaurus (this is an engineering thesaurus).
• Education Resources Information Center (ERIC) Thesaurus.
• Macro thesaurus for information processing in the field of economic and social development.
• Thesaurus of Psychological Index Terms.
• Thesaurus of Sociological research terms.
• Thesaurus of Computing terms.
• ILO Thesaurus.
• The UNESCO: IBE Education Thesaurus.
• Political Science Thesaurus.
• The Urban Information Thesaurus: a vocabulary for social documentation.


25.2 PURPOSE OF A THESAURUS
The purposes of a thesaurus are:
(i) To indicate the relationships between terms, concepts or ideas about concepts in a given domain of knowledge or subject field. This helps indexers or searchers understand the structure of the subject field.
(ii) To provide a standard vocabulary for a given subject field so as to achieve consistency amongst different indexers when assigning index terms in an information storage and retrieval system.
(iii) To provide cross references between terms or concepts so as to control synonyms or unaccepted index terms, qualify homographs and guide users to other related terms in an information storage and retrieval system. In a thesaurus, only one synonym will be chosen as the index term. For those synonyms not accepted as index terms a reference is prepared from them to the synonym accepted for use. Homographs are enriched by qualifying them so as to show the context of their application. Syndetic devices or other auxiliary devices are used to show relationships between related terms.
(iv) To guide users of the information storage and retrieval system to choose the correct search terms for the subject field of inquiry.
(v) To locate new concepts in a scheme of relationships with existing concepts in a way which makes sense to users of the information storage and retrieval system.
(vi) To provide syndetic devices or classified hierarchies so that a user of the information storage and retrieval system can narrow or broaden his/her search if too many or too few items are retrieved respectively.
(vii) To some extent, a thesaurus provides a means by which the use of terms in a given subject field may be standardized.

25.3 TYPES OF TERMS IN A THESAURUS
(i) Entry terms: These are all the terms that provide entry into the thesaurus.
(ii) Descriptors or preferred terms: These are terms that describe a concept. They are terms which are accepted for use in a thesaurus and which an indexer assigns to a document to describe its subject contents.
(iii) Non-descriptors or non-preferred terms: These are terms that exist in a thesaurus but are not accepted for use.
(iv) Specifiers: These are terms that uniquely identify or specify a given document's class and distinguish it from other classes, e.g. tag welding, Small industry.
(v) Identifiers: These refer to terms that represent corporate bodies, government agencies, societies, institutions, firms, geographical names, etc.

25.4 BASIC STRUCTURE OF THESAURUS

Example of alphabetical part of thesaurus

FOREIGN TRADE                                        09.05.01
COMMERCE EXTERIEUR / COMERCIO EXTERIOR
    SN: FROM THE POINT OF VIEW OF A SPECIFIC COUNTRY OR REGION
    UF: Trade relations
    BT: INTERNATIONAL TRADE
        INTERNATIONAL RELATIONS
        TRADE
    NT: EXPORTS
        IMPORTS
    RT: BALANCE OF TRADE
        FOREIGN TRADE POLICY
        TARIFFS

Trade relations
    USE: FOREIGN TRADE

Source: Macro thesaurus for information processing in the field of economic and social development. Paris: United Nations Organization for Economic Cooperation and Development, 1991. P. 12.


From the Macro thesaurus example given, we note that entries in a thesaurus can take various forms:
• Descriptor: FOREIGN TRADE
• Non-descriptor: Trade relations
• Category/class notation: 09.05.01
• Language equivalents: COMMERCE EXTERIEUR / COMERCIO EXTERIOR
• Scope Notes (SN): FROM THE POINT OF VIEW OF A SPECIFIC COUNTRY OR REGION.
  SN are used to explain a descriptor so as to indicate the context of application.
• Used For (UF): FOREIGN TRADE
  UF: Trade relations
  UF appears under the preferred term/descriptor and shows the non-preferred term(s) it is used for.
• USE instruction: Trade relations
  USE: FOREIGN TRADE
  USE instruction refers a user from a term that is not used to the preferred term.
• Broader Term (BT): BT: INTERNATIONAL TRADE
  INTERNATIONAL RELATIONS
  TRADE
  BT refers a user to the more comprehensive term(s).
• Narrower Term (NT): NT: EXPORTS
  IMPORTS
  NT refers a user from a comprehensive term to a narrower term.
• Related Term (RT): RT: BALANCE OF TRADE
  FOREIGN TRADE POLICY
  TARIFFS
  RT refers a user to other terms that have related meanings.

Functions of syndetic devices in a thesaurus
(i) USE: It indicates which term is correct for use, e.g. in the example on Macro thesaurus, the term that is correct when assigning index terms to a document on Trade Relations is Foreign Trade.
(ii) Used For (UF): It indicates which term has been used earlier in designating the same particular field, e.g. Foreign Trade is used for Trade Relations.
(iii) Broader Terms (BT) and Narrower Terms (NT): They link terms that are hierarchically related, hence useful in broadening or narrowing down a search.
(iv) Related Term (RT): It links terms that are related otherwise but not hierarchically, as shown in the example on Macro thesaurus.
(v) Scope Notes (SN): They indicate the context in which a given descriptor/index term is applied.

Format of a thesaurus
A thesaurus is basically divided into three main parts:
(i) Main part (alphabetical thesaurus): This is the main part of a thesaurus where all descriptors and non-descriptors (entry terms) are shown together with their relationships, notes on their use, subject category to which they belong, etc.
(ii) Auxiliary parts: They seek to improve access to a thesaurus by providing other alternative search approaches, e.g. permuted index, hierarchical index, graphic displays of relationships, etc.
(iii) Classified part: It is a listing of descriptors categorized on the basis of subject relationships into broad fields or groups.
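The syndetic devices of the FOREIGN TRADE entry can be modelled in Python to show how USE and UF pair up and how BT and NT support broadening or narrowing a search. This is a sketch under assumptions: the dictionary layout and the function names (`build_use_references`, `preferred_term`, `expand`) are hypothetical and are not taken from any thesaurus software.

```python
# The Macro thesaurus entry for FOREIGN TRADE, held as a dictionary.
THESAURUS = {
    "FOREIGN TRADE": {
        "SN": "FROM THE POINT OF VIEW OF A SPECIFIC COUNTRY OR REGION",
        "UF": ["TRADE RELATIONS"],
        "BT": ["INTERNATIONAL TRADE", "INTERNATIONAL RELATIONS", "TRADE"],
        "NT": ["EXPORTS", "IMPORTS"],
        "RT": ["BALANCE OF TRADE", "FOREIGN TRADE POLICY", "TARIFFS"],
    },
}

def build_use_references(thesaurus):
    """Derive the USE references (non-descriptor -> descriptor) from the UF lists."""
    return {non_descriptor: descriptor
            for descriptor, record in thesaurus.items()
            for non_descriptor in record.get("UF", [])}

def preferred_term(term, thesaurus, use_refs):
    """Return the descriptor for an entry term, or None if the term is unknown."""
    key = term.upper()
    if key in thesaurus:
        return key
    return use_refs.get(key)

def expand(term, device, thesaurus, use_refs):
    """Follow BT, NT or RT from the preferred form of a term."""
    descriptor = preferred_term(term, thesaurus, use_refs)
    if descriptor is None:
        return []
    return thesaurus[descriptor].get(device, [])

use_refs = build_use_references(THESAURUS)
print(preferred_term("Trade relations", THESAURUS, use_refs))   # the accepted descriptor
print(expand("Trade relations", "NT", THESAURUS, use_refs))     # narrower terms
print(expand("Foreign trade", "BT", THESAURUS, use_refs))       # broader terms
```

Deriving the USE references from the UF lists keeps the two devices reciprocal, so every non-descriptor leads back to exactly one descriptor.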


25.5 RELATIONSHIP OF TERMS IN A THESAURUS
There are three main relationships between terms in a thesaurus as noted by Rowley (1988):
(i) Preferential relationships
These relationships show the preferred terms or descriptors in a thesaurus. Thus, they specify which terms are to be used and which ones are not to be used. Such relationships are displayed by the use of UF and USE devices.
(ii) Hierarchical relationships
These relationships link together hierarchically related terms in a thesaurus by using BT and NT devices.
(iii) Affinitive relationships
These relationships link together terms that are related but not in any hierarchy. They are shown by the RT.

Display of relationships in a thesaurus
Some of the devices used to display relationships in a thesaurus are:
• Hierarchical displays: They are commonly a separate section of a thesaurus and display alphabetically hierarchically related terms.
• Categorized displays: They cover particular subject fields. In such displays terms are entered under a series of category terms as can be seen in Medical Subject Headings (MeSH).
• Permuted index of compound terms (KWIC listing of terms): Just as the KWIC index, this index brings every word in a multi-word term in turn into the access position.
• Graphic displays/two-dimensional displays: Displays a set of terms and their interrelationship, e.g. the family tree display shown below:

                    LAND TENURE
                   /           \
        FARM TENANCY            LAND OWNERSHIP

• Use of classification schemes: The terms are arranged logically according to notations derived from a classification scheme and assigned to the descriptor.

25.6 METHODOLOGY FOR THESAURUS CONSTRUCTION
Thesaurus construction refers to the construction of a controlled vocabulary to be used as the indexing language. The two approaches used simultaneously in thesaurus construction are:
(i) Analytical approach: This involves analysis of the subject content(s) of existing literature and selection of preferred terms.
(ii) Gestalt approach: In this approach, subject experts analyze candidate terms from secondary sources of information and make a selection of preferred terms to be included in a thesaurus.

In the construction of a thesaurus, ISO 2788: 1985 (1986): Guidelines for the establishment and development of a multilingual thesaurus, and BS 5723: 1979 (1984): Guidelines for the establishment and development of monolingual thesaurus, may be used in determining the contents, layout, and methods of construction and maintenance of a thesaurus.
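The permuted index of compound terms mentioned in this unit can be sketched in a few lines of Python. The function name `permuted_index` is an assumption made for the illustration; the sample terms are drawn from the examples used in this unit.

```python
def permuted_index(terms):
    """Return (access word, full term) pairs, one for each word of each term."""
    rows = []
    for term in terms:
        for word in term.split():
            # Every word of a multi-word term takes the access position in turn.
            rows.append((word, term))
    return sorted(rows)

terms = ["FOREIGN TRADE", "FOREIGN TRADE POLICY", "LAND TENURE", "LAND OWNERSHIP"]
rows = permuted_index(terms)
for access_word, term in rows:
    print(f"{access_word:<12}{term}")
```

A user who only remembers the word TRADE can therefore still reach both FOREIGN TRADE and FOREIGN TRADE POLICY.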


Some of the guidelines or principles that may be used for thesaurus construction are:
(i) All the thesaurus entries must be arranged logically (that is, alphabetically or systematically) in order to facilitate access to the thesaurus.
(ii) To control various word forms:
• Prefer plurals to singulars, where the noun can be counted (e.g. ANIMALS, LIBRARIES, BOOKS), but not SEA, SHEEP, LOVE, as non-countable nouns or abstract concepts are singular.
• Prefer direct to indirect forms of terms (e.g. INVESTMENT BANKS, not BANKS, INVESTMENT; UNIVERSITY EDUCATION, not EDUCATION, UNIVERSITY; PUBLIC LIBRARIES, not LIBRARIES, PUBLIC). However, prepare references from indirect to direct forms of headings (e.g. BANKS, INVESTMENT USE: INVESTMENT BANKS; INVESTMENT BANKS UF: BANKS, INVESTMENT).
• Use nouns and noun phrases: adverbs and verbs should not be used on their own but in conjunction with nouns (e.g. ELECTRONIC CIRCUITS, not ELECTRONIC on its own).
• Abbreviations, initials and acronyms should be written in full unless they are internationally recognized.
• Homographs and homonyms should be clarified by qualifiers in parentheses after the term concerned (e.g. PITCH (Music)).
• Use standard spelling, e.g. in Kenya, British English is used.
• As far as possible avoid punctuation marks. Use punctuation marks in a few cases where it is unavoidable (e.g. PARENT-CHILD RELATIONSHIPS).
• Prefer to alphabetize the entry terms word-by-word to letter-by-letter.
• Some of the words accepted for use in a thesaurus include: single words (e.g. SHEEP, PITCH), phrases of two or three words comprising a noun and an adjective (e.g. ELECTRONIC CIRCUITS), two words linked by 'and' (e.g. FIXTURES AND FITTINGS), compound phrases (e.g. GREENWICH MERIDIAN TIME), names of persons, corporate bodies, geographical places, etc. (e.g. KENYA, KENYA LIBRARY ASSOCIATION).
• Terms should represent simple or unitary concepts as far as possible, and compound/composite terms or phrases should be factored into simpler elements except when this is likely to affect a user's understanding (e.g. AIRCRAFT ENGINES is factored into AIRCRAFT and ENGINES).

Factors considered before starting to construct a thesaurus
The following factors/issues should be considered before starting to construct a given thesaurus:
(i) The subject field to be covered should be clearly defined.
(ii) The terms to be selected for inclusion in the thesaurus and relationships between them.
(iii) Display of relationships amongst terms.
(iv) Procedure for updating the thesaurus.
(v) Construction techniques.
(vi) Auxiliary precision devices to be incorporated in the thesaurus.
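The difference between word-by-word and letter-by-letter alphabetization of entry terms, mentioned in the guidelines above, can be seen in the Python sketch below. The sample terms are invented for the illustration and the two function names are assumptions.

```python
def word_by_word(terms):
    """File terms one word at a time, so a shorter first word files first."""
    return sorted(terms, key=lambda term: term.upper().split())

def letter_by_letter(terms):
    """File terms as if spaces and punctuation were not there."""
    return sorted(terms, key=lambda term: "".join(ch for ch in term.upper() if ch.isalnum()))

# Hypothetical entry terms chosen because the two methods disagree on them.
terms = ["NEWSPAPERS", "NEW YORK", "NEWS AGENCIES", "NEW ZEALAND"]
by_word = word_by_word(terms)
by_letter = letter_by_letter(terms)
print(by_word)
print(by_letter)
```

Word-by-word filing keeps all terms that begin with the same word together, which is one reason it may be easier for users to scan.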


Steps involved in thesaurus construction
Rowley (1988) gives three main steps involved in thesaurus construction:

STEP 1: Establishing the purpose and functions of the thesaurus
The purpose of a thesaurus will be dictated by several factors:
(i) Subject field to be covered in terms of topical coverage and extent to which aspects of the subject should be covered. This is important because the core or main subject will be covered in depth as opposed to marginal subjects.
(ii) Type of literature to be indexed using the thesaurus, e.g. books demand less specific indexing than periodicals. Therefore, a thesaurus for periodicals' indexing will be more detailed than one for books' indexing.
(iii) Quantity of literature to be indexed using the thesaurus. If more items are to be indexed using the thesaurus, then it should be detailed to enhance specificity in indexing.
(iv) Type of information storage system in use (that is, whether pre- or post-coordinate indexing system, manual or online searching system).
(v) Aids and resources available for thesaurus construction (that is, other thesauri, staff and funds).
(vi) Users of the thesaurus and information retrieval system in respect of their nature, type and frequency of their questions.
(vii) Type of thesaurus to be compiled, whether monolingual or multilingual.

STEP 2: Decide on characteristics of the thesaurus
The characteristics of a thesaurus will be ascertained by considering the following factors:
(i) Type of indexing language to use in the thesaurus (that is, natural, controlled, or free text indexing language).
(ii) Specificity of the indexing language.
(iii) Exhaustivity of the indexing language.
(iv) Level of pre-coordination of terms.
(v) Structure of the thesaurus.

STEP 3: Compilation procedure
The compilation of a printed thesaurus involves several key steps as outlined here:
(I) Defining the subject field to be covered by identifying the main subject area and related/marginal subjects. The main subject area should be dealt with in depth as opposed to marginal subjects.
(II) Establishing the basic structural divisions of the subject field by breaking it into major facets or groups or finer divisions where necessary.
(III) Selection of terms to be included in the thesaurus including synonyms, related terms and variant word forms. The selection of terms can be done intellectually by a human being or by using a computer. Selection of terms to be included in a thesaurus involves:
(i) Collection of terms: The terms can be collected by:
• Manually searching in existing thesauri, classification schemes, name authority files, treatises on subject field terminology, encyclopedias and lexicons, dictionaries, indexes of journals and abstract journals, other published indexes, handbooks, textbooks, catalogues, nomenclatures of single disciplines, etc.
• Conducting literature scanning for terms by examining titles, conclusions of documents.
• Searching users' questions and search records.
• Consulting subject experts.


• Relying on the individual knowledge and experience of the person compiling the thesaurus.
(ii) Verification of the authenticity of selected terms/descriptors by consulting dictionaries, standardized vocabulary, current use of terms in literature and the opinion of subject experts.
(iii) Evaluation of the utility of the candidate descriptor by considering:
• The frequency of the descriptor in current literature.
• The effectiveness of the descriptor in connoting a given concept.
• Authenticity of the descriptor as current terminology in the discipline.
• Relationship of the descriptor to those descriptors already accepted for use in the thesaurus.
• Anticipated use of that descriptor in retrieval inquiries.
(iv) Choice of terms: Descriptors should be selected for inclusion in the thesaurus on the basis of their effectiveness in information retrieval and indexing.

… the computer or searchable cards. … with synonyms, near synonyms, … of term, SN. To record on … follow the following steps: … included in a … from …
• Build notations (syndetic devices) under each term affected by each decision.

(V) Check relationships to be included for each term by examining each of the recorded entries so as to develop cross references and auxiliary devices.

(VI) Finalizing the thesaurus by undertaking the following:
(a) Checking and reviewing:
• Terms for consistency and pre-coordination level, word form and specificity level.
• Classificatory indicators.
• Links between displays.
• Converting all listings in their final form ready for typing.
(b) Writing the thesaurus introduction, which should state its features, reference structure, filing order, use procedure, updating methods, etc.
(c) Editing the thesaurus by checking for notations, related terms, spelling, alphabetization, indention and spacing, underlining, etc.
(d) Testing the thesaurus by using it to index at least 1000 queries or items.
(e) Production of the thesaurus (that is, printing and binding).

(VII) Updating the thesaurus (thesaurus maintenance): This is done through various mechanisms:
(a) Periodic verification of the frequency by which descriptors are utilized in indexing and retrieval. It also entails checking that the descriptors chosen do not duplicate one another and that the relationships amongst them are sound. If not, the descriptors and relationships should be eliminated.


(b) Elimination of descriptors.
(c) Choice of a new descriptor: This commonly occurs when, during indexing or retrieval, it is discovered that concepts or their relationships have not been established precisely in the thesaurus.
(d) Sub-division of an existing descriptor: It occurs if too many documents are indexed at the same descriptor. This implies that the specificity of the descriptor has been lost, hence the need to subdivide it.
(e) Changes in definitions and use of descriptors.
(f) Addition or deletion of a hierarchical relationship or syndetic device.

25.7 THE COMPUTER AND THESAURUS
Computers assist in thesaurus construction by:
(i) Alphabetizing the entry terms and sub-entries.
(ii) Correcting misspelt terms and typographical errors.
(iii) Preparing the thesaurus for the printer by indicating typefaces, special symbols and indention to the index.
(iv) Producing the final thesaurus in such forms as printed book index, COM, etc.
(v) Storing the thesaurus entries.
(vi) Preparing cross references.

The role of a computer in thesaurus construction, maintenance and use can be noted from Lancaster's (1977) functions of a computer-held or machine-readable thesaurus:
(i) Checking for the consistency and acceptability of terms.
(ii) Maintaining statistics on the use of terms.
(iii) Maintaining tracings (that is, deletion of a term with its associated records).
(iv) Maintaining the term's history.
(v) Automatic optimization of a searching strategy.
(vi) Facilitating the conduct of generic searches.
(vii) Automatic generation of cross-references for printed indexes.

25.8 USES OF A THESAURUS IN ORGANIZATION AND RETRIEVAL OF INFORMATION
A thesaurus is an important tool in the following areas of information organization and retrieval:

(I) Information searching and retrieval
A thesaurus facilitates information retrieval by:
(i) Assisting the user to select the most suitable terms to use for searching the information storage and retrieval system. A thesaurus is a guide for users as it enables them to choose the correct term for a subject search.
(ii) Mapping the user's vocabulary into the controlled vocabulary of the information storage and retrieval system.
(iii) Assisting the user to narrow down or broaden his/her search strategy by use of the cross references.
(iv) Facilitating consistency in searching through vocabulary control mechanisms.
(v) Controlling the vocabulary of the information storage and retrieval system, thereby reducing the size of the system. Thus, the searcher has fewer entries to go through in the system.
(vi) Assisting the user to understand the structure of a subject field as it shows relationships between concepts or ideas about concepts.
(vii) Indicating additional terms that may be used in information retrieval.
(viii) Saving the user's time in information retrieval as the user is referred to the accepted search terms from the thesaurus.
(ix) Reducing false drops in information retrieval as homographs are qualified.
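Some of the functions listed above (mapping the user's vocabulary into the controlled vocabulary, generic searches, and broadening a search through cross references) can be illustrated with a small sketch. It is only an illustration under stated assumptions: the dictionaries USE and NT and the function names are hypothetical, and the terms are borrowed from the MICROCOMPUTERS sample entry in this unit. It is not a description of any particular retrieval system.

```python
# Hypothetical miniature thesaurus held in memory (names are assumptions).
USE = {  # non-preferred term -> preferred descriptor (the USE device)
    "PERSONAL COMPUTERS": "MICROCOMPUTERS",
    "HOME COMPUTERS": "MICROCOMPUTERS",
}
NT = {  # descriptor -> its narrower terms (the NT device)
    "COMPUTERS": ["MICROCOMPUTERS"],
    "MICROCOMPUTERS": ["LAPTOP COMPUTERS"],
}


def preferred(term):
    """Map a user's word to the controlled vocabulary."""
    key = term.strip().upper()
    return USE.get(key, key)


def generic_search_terms(term):
    """Return the descriptor plus all of its narrower terms, at every level."""
    start = preferred(term)
    found, queue = [start], [start]
    while queue:
        current = queue.pop(0)
        for narrower in NT.get(current, []):
            if narrower not in found:
                found.append(narrower)
                queue.append(narrower)
    return found


def broader_terms(term):
    """Return the descriptors that list the given term as a narrower term."""
    target = preferred(term)
    return [bt for bt, children in NT.items() if target in children]
```

A searcher who types "home computers" would be led to MICROCOMPUTERS by preferred(), could broaden the search with broader_terms(), or could run a generic search on COMPUTERS that also picks up every narrower descriptor.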


(II) Subject indexing and cataloguing of information materials
A thesaurus facilitates subject indexing and cataloguing by:
(i) Prescribing what index term(s) or subject heading(s) should be assigned to a document through the USE device.
(ii) Suggesting other index term(s) or subject heading(s) to be considered in subject indexing or cataloguing through the use of RT, BT and NT devices.
(iii) Assisting the indexers and cataloguers to prepare cross references in the subject index and catalogue.
(iv) Facilitating consistency in subject indexing and cataloguing amongst different indexers and searchers.
(v) Controlling the vocabulary of the subject index and catalogue, thereby reducing the size of the index and catalogue.
(vi) Assisting the indexer and cataloguer to understand the structure of a subject field as it shows relationships between concepts or ideas about concepts.
(vii) Saving the indexer's and cataloguer's time in subject indexing and cataloguing by making readily available the source or vocabulary from which to derive index or catalogue terms.
(viii) Providing the vocabulary needed for constructing subject authority files and lists of subject headings.
(ix) Enriching the process of keyword indexing as it can be used by the computer to select index terms from the machine-readable item.

(III) Library classification
A thesaurus facilitates library classification by:
… that can be used for arranging the … will have a class notation attached …
… through the use of BT and NT relationships.
(iii) Assisting the classifier to understand the contextual application of terms through the SN.
(iv) Assisting the classifiers and classificationists to understand the structure of a subject field as it shows relationships between concepts or ideas about concepts.
(v) Acting as an index to the classification scheme.
(vi) Permitting access to the classification scheme via a specific term.

25.9 FACTORS CONSIDERED IN THE CHOICE OF A THESAURUS
There are many published thesauri for use in indexing information materials. Therefore, the information professional should consider the following factors:
(i) The authority of the thesaurus, by establishing the authority of the publishers and compilers of the thesaurus.
(ii) Practical experience and extent of use of the thesaurus: The thesaurus should be well established (that is, used for a long time) and should come in both printed and computerized formats.
(iii) Revision: The thesaurus should be continuously updated or revised after given stated intervals.
(iv) Format: The typographical layout and relationship indicators should be clear.
(v) The type of items to be indexed using the thesaurus.
(vi) The subject field to be indexed.

25.9 QUALITIES OF A GOOD THESAURUS
A good thesaurus should:
(i) Be continuously updated or revised after given stated intervals.
(ii) Have clear typographical layout, relationship indicators, etc.
(iii) Adequately show the relationships between related terms.
(iv) Be exhaustive in its coverage.
(v) Be easy to use (that is, it should have adequate instructions for use and should be well guided).


(vi) Provide for alternative search approaches/auxiliary devices/recall and precision devices.
(vii) Have a systematic arrangement of entries.
(viii) Accurately and precisely indicate and display relationships between terms.

25.9 CONTRAST AND COMPARISON BETWEEN A THESAURUS AND LIST OF SUBJECT HEADINGS
A thesaurus and list of subject headings are similar in two ways:
(i) They control the use and form of index terms.
(ii) They indicate the relationships between terms in an indexing language.

The differences between a thesaurus and a list of subject headings are noted by Rowley (1988) as follows:
(i) A thesaurus contains more specific terms than a list of subject headings.
(ii) A thesaurus uses direct form of headings (e.g. ORGANIC CHEMISTRY; ACADEMIC LIBRARIES) while a list of subject headings uses inverted form of headings (e.g. CHEMISTRY, ORGANIC; LIBRARIES, ACADEMIC).
(iii) A thesaurus does not subdivide headings while a list of subject headings allows for subdivision of headings according to various criteria (e.g. EDUCATION-ENCYCLOPEDIAS).
(iv) A thesaurus has extensive indication and display of relationships between terms compared to a list of subject headings.
(v) A thesaurus uses the syndetic devices BT, NT, RT as opposed to some lists of subject headings that commonly use see and see also references.
(vi) The cross references (syndetic devices) in a thesaurus cannot be directly transferred in the retrieval tools as is the case with the see and see also references that can be transferred to the retrieval tools.
(vii) Additional displays of terms (e.g. categorized displays, permuted index of terms, etc.) are used in a thesaurus as opposed to a list of subject headings.

SAMPLE THESAURUS ENTRIES

TEACHING
SN The art and method of teaching
UF Instruction
BT EDUCATION
   SCHOOL ADMINISTRATION
RT TEACHERS
NT LECTURES AND LECTURING
   PROJECT METHOD OF TEACHING
   STUDY SKILLS
   TEACHING TEAMS

MICROCOMPUTERS
UF Personal Computers
   Home Computers
BT COMPUTERS
NT LAPTOP COMPUTERS
RT SOFTWARE

COUNSELORS
UF Guidance Counselors
   Guidance Workers
NT ELEMENTARY SCHOOL COUNSELLORS
   READJUSTMENT COUNSELLING
   SECONDARY SCHOOL COUNSELLORS
   SPECIAL EDUCATION COUNSELLORS
BT GUIDANCE PERSONNEL
RT ADVENT COUNSELLING
   COUNSELLING
   COUNSELLOR ACCEPTANCE
   COUNSELLOR CHARACTERS
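For readers who keep a thesaurus in machine-readable form, the sample entries above can be pictured as simple records. The sketch below is one possible representation, not a prescribed format: the class name Entry, its field names and the two helper functions are hypothetical. The reciprocity check (every NT link should be answered by a BT link in the narrower term's own entry) reflects a common editing convention and is an assumption rather than a rule stated in this unit.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One thesaurus entry with its scope note and syndetic devices."""
    descriptor: str
    sn: str = ""                            # scope note (SN)
    uf: list = field(default_factory=list)  # non-preferred terms (UF)
    bt: list = field(default_factory=list)  # broader terms (BT)
    nt: list = field(default_factory=list)  # narrower terms (NT)
    rt: list = field(default_factory=list)  # related terms (RT)


def display(entry):
    """Lay out an entry in the style of the printed sample entries."""
    lines = [entry.descriptor]
    if entry.sn:
        lines.append(f"SN {entry.sn}")
    groups = (("UF", entry.uf), ("BT", entry.bt), ("NT", entry.nt), ("RT", entry.rt))
    for label, terms in groups:
        for position, term in enumerate(terms):
            prefix = label if position == 0 else "  "
            lines.append(f"{prefix} {term}")
    return "\n".join(lines)


def missing_reciprocals(entries):
    """List (broader, narrower) pairs whose NT link has no matching BT link."""
    by_name = {e.descriptor: e for e in entries}
    problems = []
    for e in entries:
        for narrower in e.nt:
            other = by_name.get(narrower)
            if other is not None and e.descriptor not in other.bt:
                problems.append((e.descriptor, narrower))
    return problems
```

Calling display() on an Entry built from the MICROCOMPUTERS sample reproduces its printed layout, while missing_reciprocals() is the kind of check an editor might run when reviewing related terms before the thesaurus is finalized.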


Permuted Index of Compound Terms (KWOC INDEX)

LIFE
LIFE - 15.02.02
LIFE EXPECTANCY - 14.06.00
LIFE INSURANCE - 11.02.03
LIFE SCIENCES - 15.01.01
LIFE STYLES - 05.03.01
LIFE TABLES - 14.06.00
PARTICIPATION IN CULTURAL LIFE - 05.02.02
PRODUCT LIFE - 12.08.02
QUALITY OF LIFE - 05.03.01
WORKING LIFE - 13.03.03

1. Highlight SIX uses of a thesaurus in facilitating vocabulary control in an indexing system. (12 marks)
2. Describe the procedure for using a printed thesaurus in retrieval of information in a library. (12 marks)
3. Highlight FOUR reasons for updating a thesaurus. (12 marks)
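The permuted (KWOC) index shown above gathers every compound term under each of its significant words, so that QUALITY OF LIFE can be found under LIFE as well as under QUALITY. A minimal sketch of how a computer might generate such a display follows. The function name kwoc_index and the small stop list are assumptions made for illustration; the terms and notations used to try it out would come from the thesaurus itself.

```python
def kwoc_index(terms):
    """Build a keyword-out-of-context index.

    `terms` maps each compound term to its notation. The result maps each
    significant word to the sorted list of (term, notation) pairs that
    contain that word.
    """
    stop_words = {"IN", "OF", "AND", "THE"}  # assumed stop list
    index = {}
    for term, notation in terms.items():
        for word in term.split():
            if word in stop_words:
                continue
            index.setdefault(word, []).append((term, notation))
    # Sort the keywords, and the entries filed under each keyword
    return {word: sorted(entries) for word, entries in sorted(index.items())}
```

Given LIFE EXPECTANCY, PRODUCT LIFE and QUALITY OF LIFE with their notations, the keyword LIFE would collect all three terms in alphabetical order, much as in the printed display.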
