4.0 Dictionary making, General nature:
The work on the compilation of a dictionary from the beginning to
the final printing may be divided into the following three phases,
each phase having different steps:
(1) Preparation,
(2) Editing,
(3) Preparation of the Press copy.
(1) Preparation: this phase includes the planning of the dictionary, the
collection of the material and the selection of entries for the dictionary.
(2) Editing: this phase involves the setting of entries. The work includes
fixing the head word, its pronunciation and grammatical characteristics,
and selecting and fixing the definitions etc. of the head word.
(3) Preparation of the press copy: this phase involves the arrangement
of entries, the use of notations and the preparation of an introduction
to the dictionary, which includes the general features of the dictionary,
a guide to pronunciation etc.
But these phases are not strict divisions of work. They are not exclusive
of each other. As a matter of fact, the lexicographer is faced with
problems relating to all the phases, except the final one, at all
phases. There is a lot of repetition. Let us take the second step
of the first phase, viz. the collection of data. This is the
beginning of the work and, logically, once it is over, work on the other
steps and phases should begin and the collection should stop. But this
cannot happen in actual practice [1]. During the course of the preparation of
the dictionary, which takes years, many new texts may appear in the
language and many new words may be added to the lexical stock of the
language. Many lexical units may acquire new shades of meanings. It
may also happen that while scrutinizing the data some new information
hitherto not available is found. Some of the lexical units
which occurred as only occasional and ephemeral at the
beginning of the work might have stabilized in the meantime. Some
of the meanings appearing as mere nuances might become quite regular
and be systematized in the language. In order to make the dictionary
up-to-date all these facts must be taken note of. Thus the collection
of data does not stop at the first phase.
The setting of entries is closely linked with the selection of entries.
While writing the entries the different types of lexical units to
be included are scrutinized on the basis of their form and meaning
and then a final selection is made.
The notations, although they come in the third phase, are required in
the second phase also, because while editing the entries the lexicographer
has to use them for separation of meanings and submeanings. As a matter
of fact, the use of notations and the format should be decided tentatively
at the stage of planning itself, because they provide guidelines for
the other stages also.
4.1 Planning: Dictionary making is
a long, complex and time-consuming activity. The preparation of dictionaries
takes several years. The following table shows how lengthy
the process of dictionary making is:
Name of the dictionary         Year of beginning    Year of completion
Oxford English Dictionary      1888                 1928
Tamil Lexicon                  1913                 1938
Malayalam Lexicon              1953                 4 volumes have appeared so far
Sanskrit Dictionary (Poona)    1952                 2 volumes have appeared
The re-edition of Webster's III took 757 editorial years and cost 3.5 million dollars.
As the work involved is stupendous, it is necessary that detailed planning
be done before the work begins. Some of the basic issues crucial for
planning the work on a dictionary are discussed below:
The first point to be considered is the type of the dictionary.
The work on a dictionary differs according to the type of dictionary.
The list of words in a reference dictionary is different from the
one in a learner's dictionary. Dialect dictionaries contain a different
type of word list from an academic or normative dictionary. The word
list in a special dictionary is governed by the special purpose or
restrictedness of the dictionary.
The word list in a concise dictionary is much smaller than the word list
of an unabridged dictionary.
Next, the lexicographer should decide about the language of the dictionary.
As a matter of fact, the method of collection differs from language
to language. For this the variations in the language are to be considered.
If there are many dialectal variations, the dictionary maker has to
decide whether all the dialectal forms are to be included in the dictionary
or only a few of them. For example, for a dictionary of Hindi, it
is to be decided whether the lexical units from Braj, Awadhi
etc. are to be entered in the dictionary or not.
Another point to be considered is whether the dictionary is to be based
on purely contemporary material of the language or is to incorporate
earlier literature also. Idioms and proverbs represent an earlier
and older stage of the language. Ordinary speakers quote from the
old texts. So would it be desirable for a dictionary of Hindi to include
lexical units from earlier writers like Kabir, Tulasi, Bihari etc.?
Although they do not form part of the lexical stock of the contemporary
language, they are at times needed by some general readers, especially
by students, for comprehending texts in the language.
The lexicographer has to consider whether the language has a diglossic
situation, as e.g. Tamil and Bengali have. If there is diglossia, which
variety is to be included in the dictionary, or is the lexicographer going
to include both the varieties?
The social and stylistic variations of the language are also considered
by the lexicographer. He has to decide whether the dictionary aims at
presenting all the professional registers, all the slang, jargonisms,
vulgarisms etc. The inclusion of all this is very difficult, if not impossible.
So the lexicographer has to decide how much of it is to be given
in the dictionary.
These decisions should be taken before starting the actual work on the
dictionary and be strictly adhered to. There is practically no scope for
making any large-scale changes in the basic format of the dictionary at a
later stage when the work has made some progress. Suppose it is decided
in the beginning not to include slang and vulgarisms, but later on
the decision is revised: the lexicographer would have to go back
and start the work almost afresh.
All these decisions must be recorded so that when new hands join the project
there is no difficulty in following the line of work. The instructions
must be complete to the minutest detail. To this end it would
be useful to prepare a blueprint for the project. This blueprint
may contain descriptions and instructions regarding the following:
Collection of material (the sources may be mentioned), preparation and
filling up of cards (with sample cards), the compilation of the word list,
the structure of the dictionary entry, description and definition of meaning
(their order etc.), labels, phraseology, illustrations, grammatical
characteristics of words, script and pronunciation etc. For all this,
actual examples could be given. This blueprint or project might
also contain a few sample entries for the dictionary. The entries
may be subject to certain modifications, but they will provide basic
guidelines for those working in the project. Again, these entries
should be as varied as possible so that different types of lexical
items are represented in them.
Besides these details, the project or blueprint should state the scope
of the dictionary, its purpose and readership, the range of coverage,
etc. The preparation of such a blueprint will not only serve as a guide
book for the compilers but can also be used to prepare the introduction
of the dictionary.
4.2 Collection of Material: the collection
of data differs for different types of dictionaries. For languages
which have written literature, the material is collected from written
texts. For unwritten languages the word list is to be collected by
field method from the spoken form.
Collection of data for languages with written literature: the work of
the collection of data for such languages has two basic components which
should be kept in view by the lexicographer:
(1) The source from which the material is collected.
(2) The method of collection.
The nature of the source material differs for different types of dictionaries.
4.2.1 Historical dictionaries: For a historical dictionary the collection
is done from the available representative texts of the language from
the earliest period of their availability to the present time. The
lexicographer should examine the data for a historical dictionary keeping
in view 'the evidential value of data' from the following (Kelkar
1973):
(1) ancestral language, (2) cognate languages, (3) descendant languages,
(4) donor or recipient languages, (5) substratum and superstratum
languages.
The sources for a dictionary of frequency counts may be determined by
any criterion. There may be frequency counts of journals and newspapers
only, or of general literature or of general scientific texts.
The source material for a learner's dictionary may be based on frequency
dictionaries. Words may also be collected from contemporary literature
and available dictionaries of basic words.
The material for children's dictionaries is collected from the text books.
Writings, answer scripts, note books and compositions etc. of the
students should be studied and the words used by children can be tabulated
with the frequencies of the use of words. These frequencies can be
used in addition to the general basic vocabularies.
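The tabulation of word frequencies described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name tabulate_frequencies and the two sample sentences are hypothetical, and the tokenizer handles only unaccented Latin-script text, so it would have to be replaced for other scripts.

```python
from collections import Counter
import re

def tabulate_frequencies(texts):
    """Count how often each word form occurs across a list of texts."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and keep runs of letters or apostrophes as words.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts

# Hypothetical stand-ins for transcribed compositions or answer scripts.
sample_texts = [
    "The cat sat on the mat.",
    "The dog sat on the cat.",
]

freq = tabulate_frequencies(sample_texts)
for word, n in freq.most_common(3):
    print(word, n)
```

A table of this kind gives only raw counts; which words finally enter the dictionary remains the compiler's decision.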
4.2.2 For dictionaries of written languages: The source material for
a normative dictionary may be different from that of a reference dictionary.
For a bilingual dictionary generally an existing monolingual dictionary
is taken as the source material.
For a normative dictionary the material may be extracted from the following
sources:
The first is the material used for giving the actual context, with
reference to the place of occurrence, to authenticate the meaning and
usage of the lexical unit. This includes creative works of literary
writers, texts on technical and scientific subjects, and also works from
other branches of human knowledge like history, philosophy, logic etc.
The reference sources consist of texts of a different nature, like rule
books, orders, notices, manuals etc. These help in finding more varied
usages of lexical units.
Besides these, articles, sketches, and other types of texts from journals
and newspapers can be used as source material. They provide specimens
of the contemporary language. They are good sources for extracting
words and phrases newly introduced in the language or words or phrases
used in new senses. Special terms introduced in the language are sometimes
found in such sources. The language of mass media like Radio, Television
etc. may also be utilized for collection of the material for a normative
dictionary.
For a reference dictionary the sources are a little more varied. Since the
focus of this dictionary is not only the standard language but also
the regional and social variations of it, such dictionaries may also
use oral literature, which has a very marginal role in a normative dictionary.
The extraction is done from different types of oral literature to
add variety to the lexical units used in a dictionary. So different
types of discourses, e.g. narrations, eye witness accounts, conversations,
arguments, dialogues etc. can be used as source material for these
dictionaries.
Besides the above source materials, some dictionaries, for example the
Malayalam Lexicon, also utilize some other recorded materials like inscriptions
and manuscripts for collection of material. Collection of data from
such sources involves the problem of textual criticism, decipherment
of older scripts and inscriptions. Abstraction of lexical units poses
the problem of segmentation from the recorded continuum of graphemes.
If there are some dictionaries in the language these could also be used
for collection of lexical items. In many cases the dictionaries can
provide some additional senses of lexical items which are not otherwise
available in the corpus of the dictionary. But here the lexicographer
should be very careful lest he give many words and meanings in his
dictionary which are not used in the language. For example, ta has
15 meanings in a Hindi dictionary and go has 18. Again, many
lexical units in a particular dictionary might have gone out of
use. A lexicographer should be careful in examining such cases.
But the collection of data from all the above sources may not be enough
for a dictionary. For bigger dictionaries there are usually advisory
boards consisting of experts in different branches of human knowledge.
These experts not only provide terms special to their discipline but
also help at a later stage to give definitions of these terms. Even
common people could be associated in providing material for dictionaries.
An appeal by the Fowler brothers for the COD received quite a commendable
response. Many of the illustrative quotations in the OED, numbering nearly
two million, were supplied by an army of more than thirteen hundred
contributors (Whitaker 1966, 31).
A reference dictionary which aims at presenting regional or other variations
should include in its staff or advisory board persons who could
provide material for all such variations.
The lexicographer, if he is a native speaker of the language, could
himself provide a lot of information. He may construct his own examples
in order to disambiguate the polysemy of certain lexical items.
Another type of text which could be extensively used by all dictionaries,
especially the bilingual ones, is translations from different languages.
These translations provide new technical terms and other types of
words related to the life and culture of the people of the source
language.
The collection of data for a dictionary is done by the method of extraction.
A single lexical unit is extracted on one card with its full context,
which should be adequate to express the meaning of the lexical unit
clearly and unambiguously. The two basic qualities of a good context
are that it should be short and that it should be clear and unambiguous.
The shortness of the context is conditioned by the practical problem of
space in a dictionary. A concise dictionary can ill afford to provide space
for very lengthy contexts. But the shortness of the context should
not be achieved at the cost of clarity.
As for the cards, they are prepared for this purpose keeping in view
the volume of the information to be given with each lexical unit.
Space is marked, sometimes printed also, for each type of information.
A typical card for a Kannada-Kannada-English Dictionary is given below:
___________________________________________________________
Spelling in Kannada Script        Meaning in English
______________________            ____________________________
______________________            ____________________________
Pronunciation in IPA or           Meaning in Kannada
Roman Transliteration
______________________            ____________________________
______________________            ____________________________
Grammatical Category              Reference
___________________________________________________________
This is a sample card. But the space on the card and the kinds of information
on it vary from dictionary to dictionary. Some dictionaries may provide
space for etymology, synonyms and antonyms also. Sometimes a dictionary
project may use cards of different colours for recording different
types of information. The Etymological Dictionary of Telugu uses cards
of different colours for recording words from different sources e.g.
Sanskrit, Dravidian, Desi etc.
It will be useful if the cards are numbered. This will help in knowing
the number of extractions made.
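For a project that keeps its cards in electronic form, the sample card and the running number could be modelled roughly as below. This is a minimal sketch, not the layout of any actual project: the class name Card, its field names and the numbering scheme are assumptions, and the sample values simply reuse the Hindi word ghar with a placeholder reference.

```python
from dataclasses import dataclass, field
from itertools import count

# Running serial number shared by all cards (hypothetical numbering scheme).
_serial = count(1)

@dataclass
class Card:
    """One extraction card: a single lexical unit in a single context."""
    spelling: str               # spelling in the native script
    pronunciation: str          # IPA or Roman transliteration
    grammatical_category: str
    meaning_native: str         # meaning in the language of the dictionary
    meaning_english: str
    reference: str              # place of occurrence of the context
    number: int = field(default_factory=lambda: next(_serial))

# Placeholder values for illustration only.
cards = [
    Card("घर", "ghar", "noun", "", "house", "(source text, page)"),
    Card("घर", "ghar", "noun", "", "family", "(source text, page)"),
]

# The highest card number tells how many extractions have been made.
print(len(cards), cards[-1].number)
```

Extra fields such as etymology or synonyms, or a colour code for the source language, could be added in the same way if the project's card provides space for them.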
The collection of data on cards takes several years. The information collected
is very useful from different points of view. Its utility is not
lost as soon as the work on the dictionary is over. The cards should
be preserved not only till the final printing is over, as some of
the cards may have to be referred to even at the stage of printing,
but for further work also. Such collections of
cards, called lexicographical archives or scriptoria, are valuable
assets of any language. Bigger dictionaries have a very large number
of cards. The Malayalam Lexicon has twenty-eight lakh slips. Nancy (France)
has four hundred million slips. The scriptorium of the Sanskrit Dictionary
(Poona) is not merely the source of the dictionary; it is a store
house of information on Indian life. It is a repository of various
branches of knowledge. Dictionaries of different types, e.g. dictionaries
of phrases and idioms, collegiate dictionaries, dictionaries of synonyms
and of antonyms, can be prepared from these cards. Besides providing
material for the preparation of dictionaries, the cards can also be used
as sources for much cultural information.
The context of the lexical units, called lexicographical context, may
vary according to the nature of the lexical unit and its usage. Sometimes
even very short contexts may be adequate to give the meaning of the
lexical unit. But sometimes it may be a full stanza. From this point
of view the contexts may be of the following types:
(1) It may be a word or a single lexical unit, e.g. Hindi acchaa 'so',
'yes'; Sanskrit gaccha 'go' (Imperative second person singular).
(2) It may be a phrase or a sentence, e.g. Hindi Isliye log usase ghṛṇaa
karne lage. 'So people started hating him'; a new hired hand, the head of
the firm etc.
(3) It may be a full stanza or even a collection of sentences e.g. Sanskrit
akṛta. Adj. 1A. VIII. 'not done or prepared (specifically somebody)'.
akṛta kaaritaam bhikṣaam manasaanaanumoditam gṛhyataam vidhinaa
yuktaam tapaḥ puṣyati yoginaam. Padm. P. (Ra.) 92.48.
akṣatayoni. adj. '(a woman) who is not deflowered'
saa cedakṣatayoniḥ syaad gataa pratyaagataapi vaa paunarbhavena bhartraa
saa punaḥ saṃskaaramarhati. Maha. XIII. 314. 3
Hindi yakiin karnaa: 'to be convinced'
muniim ne niicaa sir kiye hue kahaa, paaNc ser duudh aur paav bhar jalebii
kii rasiid. samiti vale yakiin nahiiN kareNge ki vakiil
pakkaa savaa paaNc ser ḍaal gayc sak kareNge ki muniim pacaa gayaa
(DabepaaNv. 5) (quoted from Bahl 1974, 117)
Sometimes, it may take a full paragraph to give a full context especially when
the lexical unit has some cultural significance.
As we can see, the extraction is done for full collocations which give
clear and unambiguous meanings. A word is extracted in all its possible
contexts. The occurrence of a word in different contexts in the same
sense should not deter the lexicographer from collecting more
extracts for the lexical unit. It is likely that a new meaning will
turn up in a further extraction. Again, even if many cards have
the same contexts, i.e. give the same meaning, these should be preserved.
It is quite possible that a particular context may at the last moment
bring forth the meaning more clearly.
The lexicographer usually makes two inferences on the basis of the
lexicographic cards.
(1) An inference is made regarding the contextual sense of the word. A
meaning is tentatively fixed for the word from the first extract.
It is later verified against the meanings drawn from all
the possible contexts. For example, from the sentence
vah ghar meN rehtaa hE. 'He lives in the house'
a tentative meaning 'house' for ghar is arrived at. Then the following
contexts are examined:
(1) vah baṛe ghar kii beṭii hE. 'She is a daughter (or girl) of a high
family'.
(2) Mere kurte meN baṭan ke liye ghar banaanaa hE. 'A hole is to be made
for buttons in my kurta'.
(3) Is makaan meN caar baṛe aur do choṭe ghar hEN. 'There are four big
and two small rooms in this house'.
From sentence (1) the meaning 'family' is determined, from sentence (2)
the meaning 'hole' and from (3) 'room'. The lexicographer puts all
these meanings for ghar in his dictionary. As the meanings are related
they are treated as the multiple meanings of the same word. We may
take another example. From the following contexts the lexicographer
finds out the different meanings of chair:
(1) he sat on a chair,
(2) the chair of philosophy,
(3) he will chair the meeting,
(4) he was condemned to the chair,
as
(1) separate movable seat for one person,
(2) position of Professor,
(3) to preside,
(4) electric chair for death.
But let us compare the following contexts in which the word aam occurs:
(1) Aam cunaav meN kaangres kii jiit ho gaii. 'Congress won in the
general elections'.
(2) Banaaras kaa aam bahut miiṭhaa hotaa hE. 'The mango from Banaras is
very sweet'.
Here the meanings (1) 'general' and (2) 'mango' are not related. Wherever
the meanings do not appear to be related the lexicographer treats
them as separate words.
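The decision just described, one entry with several meanings for related senses and separate entries for unrelated ones, can be sketched as a simple grouping step. The judgement of relatedness itself is the lexicographer's and is supplied here by hand as a group number; the function name build_entries and the group numbers are hypothetical.

```python
from collections import defaultdict

# (word form, gloss, relatedness group assigned by the lexicographer).
# Senses of one form that share a group number are judged to be related.
judged_senses = [
    ("ghar", "house", 1),
    ("ghar", "family", 1),
    ("ghar", "hole", 1),
    ("ghar", "room", 1),
    ("aam", "general", 1),
    ("aam", "mango", 2),
]

def build_entries(senses):
    """Return one (head word, [glosses]) entry per form and relatedness group."""
    groups = defaultdict(list)
    for form, gloss, group in senses:
        groups[(form, group)].append(gloss)
    # Sorting by (form, group) keeps separate entries of one form together.
    return [(form, glosses) for (form, _), glosses in sorted(groups.items())]

entries = build_entries(judged_senses)
for head_word, glosses in entries:
    print(head_word, glosses)
```

Here ghar yields a single entry with four meanings, while aam yields two separate entries.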
(2) The lexicographer can make some abstractions of the canonical form
which is to be set up as the head word.
While making the extractions the following points must be kept in view:
The lexicographer should be careful about depleted, incomplete and ambiguous
contexts. Such contexts do not give the full and clear meaning of the
lexical unit. Contexts like you cannot live on moon are no good. Here
the meaning of the phrase live on is ambiguous. It can be interpreted
both as (1) 'reside' and (2) 'sustain'. Similarly ring in He gave
me a ring may mean both 'a metal ring' and 'a telephone call'. In
Hindi siitaa gaanewaalii hE the meaning of gaanewaalii is not clear.
It may mean both 'Sita is to sing' and 'Sita is a singer'. Similarly
acchaa nahiiN lagtaa in Raamko kurtaa acchaa nahiiN lagtaa has two
meanings, 'does not like' and 'does not suit'. The lexicographer has to
examine such contexts and collect only those which are self-sufficient
to determine the meaning of the lexical unit.
For function words, attempt should be made to collect extracts to the
maximum. Many of the complex and diverse uses of such lexical units
may not be easily available if the extracts are few.
Closely related to this is the question of the nature of extractions. What
type of extraction should be done from different sources? Although
an ideal situation would be to extract data from all works, there
is the practical difficulty of dealing with an enormous amount of data.
So the extraction has to be selective at some stage. From this point
of view, extractions can be of two types:
(1) General and (2) Special.
(1) General Extraction: when lexical units of a general nature are extracted
it is general extraction. Lexical units, although belonging to some
definite thematic groups, are extracted for their general meanings.
For example, from stories and articles on the theme of hunting one may
collect lexical units related to the field of hunting. From a general
work on sky and sea, words related to sky, water, climate etc. may
be extracted without any specification of the technical meaning or
explanation of the word. For example, from a book on hunting, words
like H. machaan 'a raised platform (for shooting wild animals)' and kheddaa
'Khedda' can be extracted. Similarly words like water, tide, wave
etc. may be extracted from a book on the sea.
(2) Special Extraction: this is done for finding out the special technical
meanings of words belonging to any subject field. For example, from
a book on general linguistics, one may get a detailed list of linguistic
terms in their special meanings. Similarly, from a book like 'The
Language of Kabir', words for philosophical terms used by Kabir may
be found. From a book on botany the words for flora in their special
meanings may be found. Textbooks in different subjects provide details
of technical terms related to the particular branch of knowledge.
On the basis of its quantity, the extraction can be of two types:
(1) Concordance or total extraction and (2) Selected or partial extraction.
(1) Concordance or total extraction: this is done from all the general
works of a language. All the words in a text are extracted in all
the contexts of their occurrence. In the beginning the extraction
is of the nature of a Thesaurus, i.e. a collection of all the occurrences
of a word with actual citations. But after some time, the extractor
finds that the word is being repeated in some senses. Collection of
such repeated information is discontinued at a later stage. A useful
way to do this is to have concordance type extraction for the beginning
portions of every work. But after some time, if some words are
found again and again without adding any new sense, the extraction
is to be stopped. But before deciding to stop collection of extracts
wherein the meaning of a lexical unit is repeated, the lexicographer
should make a thorough comparison of contexts to find out the similarity
and difference in the components of the meaning of the word in its
two or more occurrences. If there is any difference the context should
be extracted, because it would give a further sense.
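For a text available in machine-readable form, total extraction amounts to building a concordance: every occurrence of every word is listed with a little surrounding context. The sketch below is a minimal illustration and not a description of any project's actual procedure; the function name concordance, the window width and the sample text are assumptions, the tokenizer handles only unaccented Latin-script text, and deciding when a sense is merely being repeated is left to the lexicographer.

```python
import re

def concordance(text, width=3):
    """Map each word to its contexts, with `width` words on either side."""
    tokens = re.findall(r"[a-z']+", text.lower())
    result = {}
    for i, tok in enumerate(tokens):
        left = tokens[max(0, i - width):i]
        right = tokens[i + 1:i + 1 + width]
        # The key word is upper-cased so that it stands out in its context.
        # Note: this simple window ignores sentence boundaries.
        context = " ".join(left + [tok.upper()] + right)
        result.setdefault(tok, []).append(context)
    return result

# Sample text built from the 'chair' contexts quoted earlier.
text = "He sat on a chair. He will chair the meeting."
conc = concordance(text, width=2)
for line in conc["chair"]:
    print(line)
```

Each list in the result corresponds to the pile of cards for one word; the comparison of contexts described above would then be carried out on these lists.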
(2) Selected or partial Extraction: selected extraction is done for collecting
such lexical units as have not been covered by general extraction.
This situation arises when special types of lexical units are found
in a text dealing with the life of people of some profession
or social group. For example, when one reads Amrut Santan, an Oriya
novel by Gopinath Mahanti, one finds a large number of lexical items
and expressions related to the tribal life of Orissa. From such a
text words relating to tribal life may be extracted. A story or
novel with regional and local colour in it, e.g. the stories and novels
of Phaṇishwarnath Renu in Hindi, may be selected for extracting lexical
units particular to a region of Bihar in India [4]. From Nisi Kuṭumba,
a Bengali novel by Manoj Basu, a large number of words related to
stealing and house-breaking may be extracted.
All these extractions can go side by side. But after the extractions have
been done it is necessary to have a checking of the source material.
It may be considered essential at some stage to include more words
from some other type of works.
4.2.3 Collection of data for unwritten languages: for unwritten languages
the data is collected by field method with the help of informants.
The criteria for the selection of informants, their age, sex, cultural
and psychological qualities like intelligence, memory, alertness,
patience, honesty, dependability, cheerfulness etc. have been discussed
in works on field linguistics (Samarin 1967; Nida 1947, 138-146).
Also discussed therein are the ways in which a field worker should
approach and deal with the informants to elicit as much data as
desirable without causing either annoyance or antagonism
in the informant. This would ensure faithful and proper elicitation of
the language data.
The number of informants to be employed depends on the scope and the type
of the dictionary. If all the regional varieties of the language are
to be included informants should be selected from all the regions.
In order to ensure optimum data it is advisable to select more than
one informant for every variety. This would also help in checking
and rechecking the data.
In order to elicit lexical units of as many varied types as possible
it would be advisable to select informants from all the following
groups [5]:
(a) from both sexes,
(b) from persons of all ages,
(c) from persons belonging to different economic and social groups.
Many a typical lexical item common among women may not be elicited
from male informants; equally, it is not unlikely that, at times, only
males may be able to provide some items. Many unwritten languages, especially
a large number of tribal languages, are fast coming under the influence
of their neighbouring languages. As a result, new lexical items are
introduced, replacing the older stock of the language. The younger
generation is adopting the new lexical items in place of the older
ones, leading to a gradual loss of older vocables in the language.
Only the older generation knows many of the lexical units that are now
dying out. Therefore, informants of older age would be useful for providing
a larger number of lexical units of this type. Informants of the younger
generation would be equally useful for providing new words
introduced in the language. The representation of different social
groups will ensure the inclusion of words from those groups.
The method of the collection of data for a dictionary would be slightly
different from that involved in writing a linguistic description
of a language. As Samarin observed, "the compilation of a dictionary
is a goal very much different from that of a language description,
especially when the dictionary has a strong ethnographic bias"
(Samarin 1967, 46). It might not take much time for an investigator
to identify the phonemes and the grammatical classes of a language.
All this may also not require as large an amount of data as is needed for
a dictionary, because "the collection of a mountain of texts,
whether he can translate them or not is insufficient corpus for such
a project, for it has been adequately demonstrated that long texts
do not necessarily show-up new words" (Lawton 1963, 139, quoted
from Samarin 1967, 46). For the preparation of a dictionary of an unwritten
language the lexicographer should have a knowledge of the life and
culture of the people. For eliciting words in an unwritten language
a word list, especially of a neighbouring language, and other elicitation
instruments might be utilized. But in making use of a word list the
following limitations must be taken into consideration.
(a) The lists may contain a good number of lexical units which are quite
unknown, and not infrequently even irrelevant, to the native language
situation. Words like gulf, sex, hyena, hernia, ophthalmia, sapphire,
niche, sash, ivory, asafetida and several others were not known to
many informants in Jaintia, a dialect of Khasi. The general response
to queries about such items is either 'I don't know' or 'there is
no word like this'. In such situations, asked repeatedly, the informant
either gives generic words for specific objects or tries to coin what
we may call emasculated equivalents. In Jaintia, there is only one
word, khlor, for stars, planets and all heavenly bodies. When asked
to give an equivalent for Jupiter, Venus or Neptune all that the informant
gives is khlor. In the same way admiral gets its equivalent as
wahE? chipaaii (lit. big soldier), gun powder bam suloi (lit. food
of the gun), picnic bam khana (lit. eat food), breakfast jastep (lit.
morning meal), lunch ja sngi (lit. day meal) and dinner jammed (lit.
night meal).
(b) Some of the newly coined lexical items are very artificial. Their
artificiality can be tested by getting them checked with other speakers
of the language. In some cases the native speaker confesses total
ignorance of a lexical item, in some there is agreement with reservation
and in others quite different words are cited.
(c) There is a total unawareness of local environment and objects in such
lists. Malto has the following words for 'mushroom':
naqlo, dule kora, teele kuttto, tupo, taakno, jibra, kuta pura gejo,
edroosdu, peetqo pot´lo, mookrooosdu and some others.
No word list, whether of Hindi, Bengali or English, is likely to contain
these words. Every speech community has its own lexicon. The flora
and fauna differ from place to place and so do the customs and the
rituals. The richness of flora and fauna, the varied uses of the flora
and the different lexical units to signify them are difficult to find
in any word list. For example, bamboo has great significance in
the life of the tribals of India, especially those of the North Eastern
part, as can be seen from the following lexical units in Angami:
kerie 'bamboo' (generic)
khopri 'a common type of bamboo used for the construction of houses etc.'
vu_prie 'a kind of bamboo used for making ropes, baskets etc.'
vu_ni 'the biggest type of bamboo, used for walling purposes'
kuccierie 'a kind of bamboo used for making posts etc.'
riinyu~/rutsu 'a kind of bamboo used for constructing granaries for rice etc.'
luouu_ 'a kind of bamboo used for making flutes'
This can be compared to the different names associated with the products
and uses of bamboo, illustrated here by the following lexical units in
Kokborok, a language spoken in Tripura:
wa (n) 'bamboo'
wakkitor 'crooked bamboo'
wajar 'a variety of bamboo'
wakhum 'bamboo earring'
wakolok 'a long bamboo lamp'
watuy 'a ring of bamboo'
wathuy 'a variety of bamboo'
wathop 'bamboo decoration'
wamlang 'a variety of bamboo'
wamlik 'a variety of bamboo'
wasun 'a name given to bamboo'
Elicitation
of such lexical items becomes very difficult in these languages. In
elicitation of data from such lists as noted earlier, many objects
which are an integral part of the life and culture of a people are
likely to be missed by a lexicographer.
In some cases the initial glosses may not give sufficient clue to
identify the contrasting semantic features of the lexical unit. The
possibility of interchange of words for animals and human beings is
not ruled out, e.g. pregnant in Malto has two equivalents, qabnii and
kocitaanii, the former used for animals and the latter for human beings.
An initial gloss 'those' may be inadequate to bring out the contrast
between the following words in Jaintia:
kitu 'those there (near)'
kitay 'those there (at a distance)'
kita 'those there (not seen)'
kitey 'those there (up there)'
kiti 'those there (down there)'
Angami has the following words related to 'wine':
Zhu 'Angami wine made of rice, rice beer'
Zhutlo 'Angami rice beer'
ruohi 'a kind of Angami wine'
Khe 'a kind of Angami wine'
Zhuhaelu_ 'a kind of Angami wine'
The initial gloss 'wine' may not be sufficient for eliciting all these
words.
How
can a lexicographer ensure maximum elicitation for a dictionary of
such languages? A source book of encyclopaedic nature, a book on basketry,
a book on flora and fauna and any other book with pictures may serve
the purpose well. If the words in the list are grouped in grammatical
classes and semantic domains the lexicographer may find it easy to
elicit the lexical items he is looking for.
A
list of basic words belonging to different semantic domains and grammatical
classes, some of which are listed below, may be tentatively prepared
for elicitation of data for unwritten languages.
(1) Nature- earth, water, sky, events, geographical and astronomical
items, directions, winds, weather, seasons, etc.
(2) Mankind - sex, family, relationship, body parts, bodily functions
and conditions, diseases and cures,
(3) Clothing and personal adornments,
(4) Food and drink - methods of preparation,
(5) Dwelling - part of the house, furniture etc.,
(6) Cooking utensils, tools, weapons, etc.,
(7) Flora and fauna - (including parts of animal anatomy, diseases,
cures etc.),
(8) Occupations and professions - equipment, rituals and customs
connected with them,
(9) Road and transport,
(10) Sense perception,
(11) Emotions, temperamental, moral and aesthetic (includes insults,
curses etc.).
(12) Government, war, law,
(13) Religion,
(14) Education,
(15) Games and amusement, entertainment, music, dance, drama,
(16) Metals,
(17) Numerals and system of enumeration,
(18) Measurement of time, space, volume, weight, quantity,
(19) Function words including classifiers,
(20) Fairs, festivals, customs, beliefs etc.
(21) Verbs:
(a) Physical activity
(b) Instrument verbs
(c) Verbs of fighting
(d) Music verbs
(e) Motion verbs
(f) Occupational verbs
(g) Culinary verbs
(h) Cosmetic verbs
(i) Communicative verbs
(j) Stationary verbs
(k) Cognitive verbs
(l) Sensory verbs
(m) Emotive verbs
(n) Other verbs.
This
list is by no means exhaustive6. It might be treated as a sort of
reference point and the related words in the semantic domain might
be elicited on the basis of this list. For example while collecting
words about agriculture, words about the different agricultural products,
the sowing and harvesting time, rituals and ceremonies connected with
them, names of the different parts at different times of growth of
these products and the verbs connected with different actions connected
with them may be elicited. We may take another example. While eliciting
words for teeth the informant may be asked to give words for different
types of teeth, diseases of teeth and cures for them.
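The use of the list as a reference point can be sketched in code. What
follows is only a minimal sketch and not part of the method described
here: the names FOLLOW_UPS, elicitation_prompts and pending_prompts are
hypothetical, and the follow-up prompts are taken solely from the two
examples given above (agriculture and teeth).

```python
# Hypothetical sketch: a basic word serves as a reference point, and related
# words in the same semantic domain are elicited through follow-up prompts.
FOLLOW_UPS = {
    "agriculture": [
        "agricultural products",
        "sowing and harvesting time",
        "rituals and ceremonies connected with the products",
        "names of parts at different times of growth",
        "verbs for actions connected with the products",
    ],
    "teeth": [
        "types of teeth",
        "diseases of teeth",
        "cures for diseases of teeth",
    ],
}


def elicitation_prompts(topic):
    """Return the topic itself followed by its follow-up prompts, if any."""
    return [topic] + FOLLOW_UPS.get(topic, [])


def pending_prompts(topic, answered):
    """Return the prompts for a topic that the informant has not yet answered."""
    return [p for p in elicitation_prompts(topic) if p not in answered]


if __name__ == "__main__":
    # The informant has given the basic word; the related items remain.
    for prompt in pending_prompts("teeth", answered={"teeth"}):
        print(prompt)
```

A topic with no recorded follow-ups simply yields itself, which reflects
the point that the list is a starting point and by no means exhaustive.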
It should not be understood from the foregoing statements that elicitation
through word lists is a complete and perfect method. The role of
collection and elucidation of the lexical units from different types
of texts providing greater contextual possibilities should in no way
be undermined. The data collected by word lists might not be adequate,
especially from the semantic point of view. The different meanings
of a lexical unit cannot be determined and demonstrated if only isolated
lexical items are collected, and the dictionary may suffer from the
shortcomings pointed out by Samarin: "The chief failure of a field
dictionary is that it indicates not so much the meaning of words, but
the fact that they exist. They do not define, they document" (Samarin
1967, 208).
The data for a dictionary collected from the word list as noted above
should be supplemented by data from different types of discourses,
some of which are listed below (Samarin 1967, 208).
Narrations: eye witness accounts, reminiscences, instructions on how
to perform certain tasks or how to get to a certain destination,
Conversations: arguments, dialogues over 'where have you been',
Songs: lullabies, dirges, dance songs,
Folk tales: legends, how things come to be, amusing stories,
Proverbs and riddles,
Names: personal, topographic, village,
Pseudo-onomatopoetic calls of animals or birds.
The collection of data in an unwritten language has many problems,
some of which are given below.
One of the most vital problems of collection of lexical units is the
segmentation and identification of a word from the phonetic continuum
of the texts. In written languages which have some tradition of grammar,
there are certain devices and fixed criteria to identify a word. A
word in written languages is generally identified as a meaningful unit,
a cluster of sounds or letters written between spaces, or with potential
pauses. In unwritten languages there is no such device. The lexicographer
has to analyse the data, make a grammatical analysis of the language
and fix the word and the lexicographic unit.
The determination of the lexicographic word is a more ticklish problem
for languages of the isolating-agglutinating type, e.g. Khasi, an
Austric language, and many Tibeto-Burman languages in India. In these
languages the whole grammatical process involves prefixation and
suffixation of morphemes. Any number of words may be derived by mere
juxtaposition of various morphemes to the root or the stem. The
grammatical system is very simple. For a lexicographer, what items
should be included, and in what way, has always to be viewed carefully.
Many lexical units formed in this way might be treated under both the
main root and the prefix: the former is necessary from the semantic
point of view, the latter from the point of view of alphabetical
arrangement, e.g. Khasi nong thaaw aayn 'legislature' might be treated
both under nong 'agentive marker' and aayn 'law'.
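The double treatment of such a unit can be pictured as an index in which
the same unit is listed under more than one headword. The following is
a minimal sketch under that assumption; the function name add_unit and
the layout of the index are hypothetical, and only the Khasi example
given above is used as data.

```python
from collections import defaultdict


def add_unit(index, unit, gloss, components):
    """List a lexical unit under every component chosen as a headword."""
    for component in components:
        index[component].append((unit, gloss))


# Headword -> list of (lexical unit, gloss) pairs filed under it.
index = defaultdict(list)

# The unit is filed under the prefix (for alphabetical arrangement)
# and under the root (for the semantic point of view).
add_unit(index, "nong thaaw aayn", "legislature", ["nong", "aayn"])

if __name__ == "__main__":
    for headword in sorted(index):
        print(headword, index[headword])
```

Which components count as headwords is the lexicographer's decision;
the sketch only records that decision, it does not make it.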
The problem of collection and selection of set-combinations and compounds
is no less intriguing for a field lexicographer. For written languages,
besides getting some clues about compounds from solid and hyphenated
spelling, the lexicographer has at his disposal enough data wherein
he comes across many occurrences of such units. This helps him in
determining set expressions. For unwritten languages the field
lexicographer has to collect different contexts with varied collocations
of words to find out compounds and set expressions.
In elicitation of words from glosses there is always a possibility
of not getting an appropriate word. For example, with a gloss 'blow'
one gets a word p?ut in Jaintia for 'blowing flute', but there are
other words denoting the meaning of 'blow', e.g.
slu 'blow (mouth)'
be? 'blow (wind)'
s?er 'blow (nose)'
Similarly the following words in Jaintia with shades of differentiation
might not be elicited by the simple gloss 'break'.
pnkhan 'break (stick)'
pya? 'break (bottle)'
tkuc 'break (rope)'
A great difficulty in elicitation of data for unwritten languages is
presented by anisomorphisms7. Two or more words for one object may
be found in the source language for which there is only one gloss.
Some examples are given in the preceding section. We may examine some
more of them.
In Shina there are two words, hankal and kutsur, for the upper and
lower parts of a latch, both expressed by latch in English. English
razor has two equivalents in Jaintia, yuukhi and rati?, and frying
pan has kharai and talai in the same language. The words basket and
spear have several equivalents in many languages.
Even when some object is in the sight of the informant, he might not
identify it with the gloss and may give another related word. When
curry is pointed to and he is asked to give the word for it, the Jaintia
informant may provide mnchit 'curry juice' and not yute?, the word
for 'curry'.
Identification and determination of meaning in respect of flora and
fauna and culture bound words is a problematic area for the lexicographer
of an unwritten language8. For culture bound words a mere one word
gloss may not be adequate. The whole cultural information related to
the word is to be provided. The dictionary of an unwritten language,
especially a tribal language, is not merely a linguistic dictionary.
It is more of an ethnographic dictionary with a considerable amount
of encyclopaedic information in it. So for items with cultural
significance data should be collected about the culture also. For Khasi
s?iem the one word equivalent 'king' may not give the cultural
significance. It should be accompanied by a full cultural description.
For
Angami keciesu a one word gloss 'a ritual' would not give the desired
information. It should be accompanied by the following cultural information:
'Practice
of the son of the deceased dragging a boulder in his father's remembrance
if his father has died after four Sha's'.
For a good dictionary it is not only the number of words collected
that matters. All the different meanings in different contexts should
also be given. For this the lexicographer should record as far as
possible all the linguistic and physical contexts9, and the data should
be thoroughly and systematically collated. The excerpts should be
compared and similarities and dissimilarities in usage of lexical units
noted to mark the different meanings of the word. The informant may
be asked to produce examples of the collocational possibilities of
a particular lexical unit. He may also be asked to provide synonyms
and antonyms which will give additional help in the determination
of meaning.
Function words should be given special treatment in the collection
of data. Because of their frequency of use they have diverse and varied
meanings. A larger number of contexts needs to be scrutinized for getting
their varied meanings and uses.
4.3. Selection of Entries:
vrhaspatirindraaya divyam vars??a sahasram praatipadoktaanaam sabdaanaam
sabdapaaraavan?am provaaca naantam jagaama
'vr?haspati taught Indra vocabulary of oral recitation for one thousand
divine years, but it was endless'. (Mahaabhaas?ya-Paspasaahniika, p. 43)
This
may sound poetical and highly improbable yet it is significant in
that it points to the unlimited number of words of a language or the
potentiality of the language to create new words. As the number of
words is always increasing no dictionary, however voluminous it may
be, can claim to include all the lexical units of a language with
all their meanings, sub-meanings and collocational possibilities.
A lexicographer has to make selection of entries for his dictionary.
The general nature of the lexicographic word has been discussed elsewhere10.
The
selection of entries is determined by various factors viz., size,
type and purpose of the dictionary, the status and formal variation
of words and the different local and social variations in the language.
The lexicographer has to consider specially the following types of
lexical units besides others.
(1)
Neologism: the lexical stock of a language does not remain static.
New objects and concepts are introduced in the speech community. These
objects and concepts are expressed in the language in different ways:
(1) new words and expressions are coined, (2) new meanings are given
to the existing words, and (3) the words are borrowed from other languages11.
The new words and expressions are coined for different themes ranging
from day to day fashion to nuclear warfare. "Any word or set expression
formed according to the productive structured patterns or borrowed
from another language and felt by the speakers as something new is
a neologism" (Arnold 1973, 232). Some of the neologisms are eventual
and ephemeral. They are born today and die tomorrow. Such neologisms
arise under certain conditions, and as soon as these conditions disappear
the words also lose their existence. But generally they have a longer
life. From individual and occasional usages they become social and
frequent. Gradually they are stabilized and become part of the language.
The lexicographer's problem is whether he should enter all such lexical
units in the dictionary or not.
It
depends on the size and type of the dictionary. Bigger reference dictionaries
may include all such neologisms. Smaller and abridged dictionaries
may not have such a scope.
As a matter of fact, till the words become a part of the language their
inclusion may be doubted. But when they are in a transitory stage,
i.e. they are on their way to gaining their place in the language,
what should be done with them? For example, after 1967 the phrase
aayaaraam gayaaraam, used for persons who changed their political
loyalties frequently, came into usage. Should a dictionary include
this phrase? A word rephujii lataa was introduced in Tripura with the
coming of refugees from erstwhile East Pakistan (now Bangladesh) after
1947. This is a creeper unknown earlier but now very popular (because
of its luxuriant growth). Should such a word find a place in the
dictionary? Such words may be included, but suitable labels (like
frequent, rare, etc.) should be employed with them to mark their currency.
(2)
Obsolete and Archaic words: As new words are born in a language, some
words die also, although their number is smaller than the new words.
Some concepts and objects become outdated. Words and expressions for
them are dropped out of the language in course of time. Such words
are called obsolete words. Similar to these words is another class
of words, which are no longer in general use, but have not become
completely obsolete. Such words are archaic or pracaaralupta. How
many of such words should find place in a general purpose dictionary?
In a dictionary based on purely contemporary language there may not
be scope for the inclusion of such words. But a general purpose dictionary
whose aim is also to help in understanding texts of the earlier language,
viz. Kabir and Tulasi in Hindi, Namdev in Marathi, Shakespeare in
English etc., should include such words, because
some of the words and phrases used by the writers are commonly used
by people. Many idioms and proverbs contain words which have become
obsolete and archaic in the general language. Should such words be
included in a dictionary or not?
(3)
Technical Terms: It is a debatable point whether all the scientific
and technical terms can find place in a general purpose dictionary.
The influx of technical terms in a language is quite considerable.
Every day either a new technical term is coined or new meanings are
attached to old words. Merriam and Company has nearly four lakh terms
for Chemistry alone. How can all these terms be included in a dictionary?
Webster's III has in all only four lakh and fifty thousand words.
A problem related to technical terminology is the commonness or
uncommonness of some terms. Some terms are very artificial and ambiguous.
Should these terms be included in a dictionary and be given preference
over their common counterparts? For example, almost all the Indo-Aryan
languages have a word bansii 'fishing hook'. A technical term
aakhet?adand?a has been coined for this object. This latter term is
very uncommon and also ambiguous. It may even mean 'hunting stick'
to some speakers. Sometimes more than one term is used for an object
or concept. Which one should be included in a dictionary?
(4)
Proper names: A lexicographer is faced with the problem of the inclusion
of proper names in his dictionary. Proper names do not form part of
the language system. Their inclusion in monolingual dictionaries has
been questioned by many scholars. But some proper names in course
of the history of a language attain a special significance. They form
an integral part of the cultural life of the people. Any information
about them is a help in the interpretation of the cultural life to
the outside world. Let us take the word gangaa. Ganga is a river
intimately connected with the culture of India; rather, it is a very
vital part of Indian culture. The name appears frequently in the
literature of Indian languages. So this name should be included in
any dictionary which presents information on Indian themes. The national
heroes, mythological characters etc. should find a place in the
dictionary. Christ, Mohammed, Mecca, Bethlehem and Vatican, although
proper names, convey something more than their mere referential meaning.
They should find place in a dictionary.
Many a time, proper names used in a special sense develop into general
words and become a part of the lexical stock of the language. We may
examine the following entries in a Hindi dictionary with the word
gangaa, which has attained certain special meanings of 'any river'
or 'white colour'.
gangabaraar 'the land formed by the shifting of the current of a river
(gangaa = river)'
gangasikasta 'the land eroded by any river'
gangaajamnii 'mixed, white and black (cf. the meanings white and black
of Ganga and Yamuna respectively, based on the colour of their water)'
cf. Bengali gangaa yamunaa- (2) 'of white and black colours', 'mixed
with gold and silver'
Gangaala 'a pot for storing water' (ganga = water)
Cf. also Banaarasii, an adjective from Banaras, which means 'a sari',
and Magahii, from Magah (< Magadha), which means 'betels'.
When proper names are related to common nouns they can be given in
a dictionary. Shcherba gives the example of khlestakov, a character
of Gogol's Revizor, as an impudent liar and a flap. The word has lost
its specificity and has a derivative xlestakov-scina (Srivastava 1968,
122; Zgusta 1971, 245). In Hindi we have kumbhakarn?a for a man who
sleeps too much, with its derivative kumbhakarn?ii nidraa 'the sleep
of the type of kumbhakarn?a'. In Bhagiirathaprayatna the name bhagiiratha
has lost its specificity and is used for 'great perseverance'.
Another point to be noted about proper names is that if some derivatives
from proper names are included in the dictionary, the proper names
themselves must also be included, e.g. Webster's Seventh New Collegiate
has the entry
Muhammadan adj. of or relating to Muhammad or Islam.
But Muhammad is not an entry. The meaning of Muhammadan is derived from
Muhammad. In such cases it is desirable to include such proper names
even though the general policy may be against their inclusion in a
dictionary.
(5)
Empty words: Some words occur in certain constructions only; they are
not used independently, e.g. Hindi
aas used only in aas paas 'nearby'
ar?os used only in ar?os par?os 'neighbourhood'
ajaayab (pl. of ajab 'strange') used only in ajaayabghar 'museum' and
ajaayabkhaana 'a curio collection centre'
aamne used only in aamne saamne 'face to face'.
This type also includes words which are used in some collocations only,
e.g. fro, used only in to and fro. Such words should be included in
the dictionary with suitable cross references and a specific indication
of their peculiarities of occurrence, e.g.
Hindi aas see aas paas
Eng fro see to and fro
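The cross-reference treatment of such words can be sketched as two small
tables, one for full entries and one for 'see' references. This is a
minimal sketch only; the names ENTRIES, CROSS_REFS, look_up and resolve
are hypothetical, and the data are limited to the Hindi examples given
above for which a gloss is supplied.

```python
# Full entries: the collocation in which the empty word occurs, with its gloss.
ENTRIES = {
    "aas paas": "nearby",
    "ar?os par?os": "neighbourhood",
    "aamne saamne": "face to face",
}

# Cross references: the empty word points to the only collocation it occurs in.
CROSS_REFS = {
    "aas": "aas paas",
    "ar?os": "ar?os par?os",
    "aamne": "aamne saamne",
}


def look_up(word):
    """Return the text of the entry as it would appear, or None if absent."""
    if word in ENTRIES:
        return f"{word} '{ENTRIES[word]}'"
    if word in CROSS_REFS:
        return f"{word} see {CROSS_REFS[word]}"
    return None


def resolve(word):
    """Follow a cross reference, if any, and return the gloss found there."""
    return ENTRIES.get(CROSS_REFS.get(word, word))


if __name__ == "__main__":
    print(look_up("aas"))
    print(resolve("aas"))
```

The empty word thus gets a place in the alphabetical order without being
given a meaning of its own.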
(6)
Affixes: Should the dictionary enter all the prefixes and suffixes
or only a few of them? As a matter of fact, all the productive prefixes
and suffixes should find place in a dictionary, e.g.
English anti-, mal-; Skt. prati-, anu-; Hindi su-, ku-, -pan; Khasi
nong- 'agentive marker', jing- 'prefix for forming abstract noun and
noun of action'.
(7)
Function words: As the function words have no referent or denotatum
they have no lexical meaning. They have only functional or relational
meaning. So it may be questioned whether they should be included in
a dictionary or not. Their inclusion in a dictionary can be pleaded
on the following grounds:
(1)
The function of relation or function words, especially in all their
collocations, cannot be predicted.
(2) A word may have lexical meaning in one context and grammatical
meaning in another, e.g. Eng. does in
He does not like to hear this argument.
He does his work in time.
And Hindi laayaa in
Vah kitaab nahiiN laayaa 'he did not bring the book'.
Vah bazaar se sar?ii sabjii ut?haa laayaa 'he brought rotten vegetables
from the market'.
(3)
The function words have greater occurrence in the language than content
words. Because of the frequency of their use they have larger variety
of functions and greater collocational possibilities. Hence there
is a greater scope of sense discrimination in them12.
(8)
Compounds: Compounds differ from language to language in the process
of their formation and in their types. Compounding involves joining
of more than one stem or affix, either free or bound forms. In compounds
the components are integrated and function as a single lexical unit
in a sentence. The chief characteristics of compounds are their
indivisibility (no word can be inserted between their elements) and
the specific order of the components, so rigidly fixed that no element
can be reversed. Semantically, the meaning of a compound generally,
though not always, cannot be derived from the sum total of the meanings
of the components. In a book of this type
there is no scope for going into details of the formation and types
of compounds. Our concern is to consider whether all the compounds
can be entered in a dictionary. While entering them in the dictionary
some peculiarities of compounds should be kept in view by the dictionary
maker. These peculiarities relate to the formal and semantic characteristics
of the compounds.
Formal
Characteristics: formally the components entering into a compound
are united phonetically or graphically or both. They are characterized
by unity of stress or intonation or solid spelling, hyphen13 etc.
(a) Some of the components do not undergo any morphophonemic change
while entering into a compound e.g.
Hindi
cir?iimaar 'fowler'
sahabaasii 'one who lives in a city'
niilkamal 'blue lotus'
Bengali
kaalaraatri 'the night on which death or some calamity occurs'.
balipus?t?a 'crow'
Skt. arun?anetra 'red eyed'
ghoraruupa 'of a frightful appearance'
Marathi pEsaapaavalii 'as cheap as dust'
Malayalam
anangakriid?aa 'amorous sport'
(b)
Some components undergo morphophonemic changes while forming a compound:
e.g.
Hindi
hathakar?ii 'hand cuffs'
pancakkii 'a water mill'
In some languages the morphophonemic change is so significant that
the components lose their formal identity. Such compounds are very
common in Khasi, e.g.
langbrot 'sheep' < blang 'goat'
?erlangthari 'whirlwind' < l?er 'wind'
theysotti 'virgin' < knthey 'female', 'woman'
Since
the compounds function as lexical units they are naturally candidates
for dictionary entry. But longer compounds with many components as
found in Sanskrit cannot be entered in a dictionary. The Sanskrit
Dictionary (Poona) includes compounds with two or three components.
Longer compounds are avoided.
As
for the first group of compounds the lexicographer does not face any
problem. The second group has to be carefully scrutinised because
of the change in their shape.
Semantic characteristics: the compounds have the following two semantic
features which a lexicographer should keep in view:
(a)
The meaning of some compounds is not derivable from the combined meaning
of their components. The meaning of the whole is not a mere sum of
the meanings of its components. One or both of the components usually
lose their meaning partially. The components collectively refer to
another word, e.g.
English chatterbox 'person who talks a great deal', hot-house 'a building
for growing plants'.
Hindi dasaanana 'one who has ten faces', i.e. Raavan?a
ganesa 'the chief or lord of the people', 'name of a God'.
Bengali balipus?t?a 'crow'
Basantaduuto 'cuckoo'
The
whole group of Bahuvriihi compounds comes under this.
(b)
In some other cases none of the components loses its meaning and
the whole meaning is equivalent to the combined meaning of its elements,
e.g.
Hindi
cir?iimaar 'hunter' (lit. one who kills birds).
niilkamal 'blue lotus'.
raajaputra 'the son of the king'
Eng. oil rich 'rich in oil'.
Beng. Pallimangal 'the welfare of the village'.
For a lexicographer the compounds whose meanings can be predicted are
not so important as those whose meanings cannot be predicted from
the combined meanings of their components. The compounds whose meanings
can be predicted may be given as sub-entries under the entries for
the first component whereas those whose meanings cannot be predicted
are generally given separate entries. Selection of compounds as main
or subentry, at least in certain cases, is decided on extra-linguistic
considerations.
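The general placement rule for compounds can be sketched as follows.
This is only a minimal sketch: the function place_compounds is hypothetical,
the segmentation of the first components (niil, raaja) is assumed for
illustration, and the predictable/unpredictable flag is supplied by hand,
since the text notes that the decision may also rest on extra-linguistic
considerations.

```python
from collections import defaultdict


def place_compounds(compounds):
    """Split compounds into main entries and sub-entries.

    Each item is (compound, first_component, gloss, predictable).
    Predictable compounds go under their first component as sub-entries;
    unpredictable ones get separate main entries.
    """
    main_entries = {}
    sub_entries = defaultdict(dict)
    for compound, first_component, gloss, predictable in compounds:
        if predictable:
            sub_entries[first_component][compound] = gloss
        else:
            main_entries[compound] = gloss
    return main_entries, dict(sub_entries)


# Examples from the text; the first-component segmentation is an assumption.
COMPOUNDS = [
    ("niilkamal", "niil", "blue lotus", True),
    ("raajaputra", "raaja", "the son of the king", True),
    ("balipus?t?a", "bali", "crow", False),
]

main_entries, sub_entries = place_compounds(COMPOUNDS)

if __name__ == "__main__":
    print(main_entries)
    print(sub_entries)
```

The flag is kept as explicit input because predictability is a judgment
made by the lexicographer, not something the sketch can compute.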
(9)
Set expressions of words or multiword lexical units: Also called phraseological
units, set combinations of words or set expressions are word groups
consisting of two or more words whose combination is integrated as
a unit with or without specialized meanings of the whole e.g.
Hindi
nau do gyaarah honaa 'to make good one's escape'
aaNkh dikhaanaa 'to be angry'
Bengali
cokhe aangul diye dekhaano 'to make clear by use'.
cokherbaali (fig) 'a man who is sore to the eyes'
English
fall out, give in, cut no ice, bread and butter.
The
set expressions are to be distinguished from the free combinations.
In free combinations words are combined to express different meanings
in different ways. e.g.
Hindi
t?hand?aa paanii 'cold water' t?hand?aa mausam. 'cold weather'
garam duudh 'hot milk' garam hawaa 'hot wind'
tej churii 'sharp knife' tej dimaag 'sharp mind'
English
live happily, live miserably, live comfortably.
good man, good news, good book etc.,
Bengali
bhaalo chele 'good boy' bhaalo khobor 'good news'.
bhaalo byabohaar 'good behaviour'
The
free combinations are created as and when the speaker wants to communicate
such ideas. They are not stable in use. The maximum communicative
function of the language is performed by such expressions. But as
their meanings are predictable i.e. the meanings are the sum total
of the meanings of the constituents, the lexicographer does not enter
them in his dictionary14.
As
opposed to the free combinations, the set expressions are not created
on the need of communication. They are fairly stable in use.
Let
us compare the following examples:
Hindi
tej 'sharp' and churii 'knife' combine to form a phrase tej churii
'sharp knife'. In this tej may be substituted by any other word like
naii 'new', puraanii 'old', acchii 'good', kharaab 'bad' to form phrases
naii churii 'new knife' puraanii churii 'old knife', acchii churii
'good knife', kharaab churii 'bad knife'. There is no change in the
denotational meaning of churii. In the same way the word churii may
be replaced by any word like talwaar 'sword', kulhaar?ii 'axe' to
form combinations tej talwaar 'sharp sword', tej kulhaar?ii 'sharp
axe'. Here again, there is no change in the denotational meaning of
tej in combination with these words. Not only that, these two constituents
can also be substituted by words semantically similar to them, e.g.
pEnii 'sharp' (pEnii churii 'sharp knife') and caakuu 'knife' (pEnaa
caakuu 'sharp knife'), and there will be no change in the total meaning
of the combination. But in miit?hii churii 'a sweet spoken traitor',
'a cheat in friend's garb' neither miit?hii nor churii can be substituted
by any other word, whether synonymic or functionally similar, without
a change in the meaning of the phrase. Thus miit?hii churii is a set
expression.
In the same way the constituent red in red flower may be substituted
by blue, white or any other word denoting colour without in any way
changing the meaning of flower. But if the word red in red tape is
changed to blue or white it would mean 'a tape or ribbon of a certain
colour' and the total meaning of red tape 'bureaucratic method' would
be changed.
Another difference to be noted between the free combination and the
set expression concerns the semantic relationship between the constituents
of the phrase. In the former the relationship is additive. Each element
has greater semantic independence. In the latter the information or
meaning is not additive. The constituents are fused semantically.
Besides
these, the set expressions have some other characteristic features,
which are given below:
(a)
They are generally indivisible. Nothing can be inserted between their
constituents, e.g.
Eng. red tape, good morning
Hindi gudar?ii kaa laal 'a diamond in rags' (a precious thing in
most shabby quarters)
niir? kaa panchii 'home sick'.
niim hakiim 'a quack'
Bengali chelemaanusii 'senseless'
din duupure 'in the broad daylight'
(b)
They can be substituted, sometimes, by single lexical units, and thus
they can be called word-equivalents.
Eng.
be in a brown study 'be gloomy'
Hindi caaNd kaa t?ukr?aa - sunder 'a beauty'
cal basnaa - marnaa 'to die'.
caltaa purjaa - caalaak 'cunning'.
The
set expressions have the same function in a sentence as a lexical
unit. So they are included in the dictionary.
(10)
Proverbs15: Proverbs resemble set expressions in many respects. They
are traditional. Their constituents cannot be interchanged, nor can
any element usually be inserted in them. They are usually formed on
set expressions. But can all the proverbs be included in a dictionary?
The proverbs contain many words which are not found in the current
language. They provide information about the cultural milieu of the
speech community. A bigger dictionary may include them but not all
dictionaries. Proverbs are not lexical units in the same way as the
set expressions and the compounds. They are groups of lexical units.
So their inclusion depends on the scope of the dictionary.
(11)
Quotations, clichés etc.: Quotations are different from proverbs.
They are taken from literature but gradually, by constant use, they
become part and parcel of the language and their source is forgotten.
As a matter of fact the proverbs themselves have basically the character
of repeated quotations (Zgusta 1971, 153). Clichés are quotations
which have become 'hackneyed and stale' or stereotypes. The question
of their inclusion depends on the scope of the dictionary. A general
purpose dictionary may not include them.
(12)
Acronyms and Abbreviations: Some of the acronyms, abbreviations and
clippings become very much the part of the language. The usual practice
is to present them in appendices. But some abbreviations can be presented
in the main body.
The selection of lexical units on the basis of social variations depends
on the scope of the dictionary. The general purpose dictionaries may
include colloquialisms although normative dictionaries do not enter
them. In dictionaries using more of oral literature and unwritten texts
the possibility of inclusion of colloquialisms is greater than in
dictionaries which are mainly based on written literature16.
Similarly a dictionary with a normative character might not include
words pertaining to slang, taboo etc. Smaller dictionaries and dictionaries
for learners also do not have scope for their inclusion.
So much for the formal characteristics of the lexical entries. How is
the number or density of entries in a dictionary to be decided? What are
the criteria which help in the selection of entries? One very common
and widely accepted criterion is the frequency of the lexical items.
Frequency counts are especially used as a basis for the selection of
entries in a learner's dictionary, because they help to establish the
minimum vocabulary. This criterion, however, has many limitations.
(a)
There are not many frequency counts, especially in Indian languages.
Whatever frequency counts exist are based on a very limited
corpus. Many common words are not found in them. For example, in Phonemic
and Morphemic frequency in Hindi some basic words like akaṛnaa 'to
be stiff', akhaaṛaa 'a wrestling arena', pakauraa 'fried saltish vegetable
stuffed gram flour preparation' are not included. Everything depends on
the corpus on which the frequency count is based. Many words of daily
use may not be frequent in the corpus, or may not be found in it at all.
(b)
For larger dictionaries, frequency counts cannot be made the basis
for the selection of entries from a practical point of view. Doing so
would involve analysing a large and unwieldy body of data.
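
The frequency criterion and limitation (a) can be illustrated with a
minimal sketch. It is not a method taken from any dictionary project:
the three-sentence corpus, the threshold of 2 and the list of everyday
words are all invented for illustration, and the function names
frequency_count and select_entries are hypothetical.

```python
from collections import Counter
import re


def frequency_count(texts):
    """Count how often each lower-cased word form occurs in the corpus."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


def select_entries(counts, min_frequency):
    """Return, in alphabetical order, the words that reach the threshold."""
    return sorted(word for word, n in counts.items() if n >= min_frequency)


# Hypothetical corpus, invented for illustration only.
corpus = [
    "the boy went to the market",
    "the girl went to the school",
    "the boy and the girl went home",
]

counts = frequency_count(corpus)
selected = select_entries(counts, min_frequency=2)

# Everyday words the lexicographer would expect to enter (assumed list).
everyday_words = ["boy", "girl", "market", "kitchen"]
missing = [w for w in everyday_words if w not in selected]

print(selected)  # ['boy', 'girl', 'the', 'to', 'went']
print(missing)   # ['market', 'kitchen']
```

In this small corpus 'market' occurs only once and 'kitchen' not at
all, so both fall below the threshold although they are words of daily
use. The result depends entirely on the corpus on which the count is
based.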
Frequency count may be judged from another point of view. Some words are
used only once or twice by some eminent writer of a language, while some
others may be quite frequently used by lesser known writers, in cheap
periodicals, spy thrillers etc. What should be the criterion for
their selection in a dictionary? It depends on the scope and size
of the dictionary. A normative dictionary may not include all such
words of the second category. A reference dictionary may do so.
Moreover, if only certain styles are preferred over others, many words
and expressions may not find a place in the dictionary.
NOTES
1. For the dictionary of a dead language the collection of data may be
over at one stage and need not continue as it must for a living language,
in which the possibilities of the creation of new words and meanings are
not closed as they are in a dead language.
2. Only the academic points have been taken into consideration. The
financial and organizational matters are not discussed.
3. From Sanskrit Dictionary (Poona).
4. Collection of dialectal materials from novels and poems may sometimes
give a distorted idea of the dialectal form, since the novelist or poet
(by his untrained faculties) may create imitative forms which may
not exist in the dialects.
5. Samarin (1967, 61) gives the following factors which are correlated
with speech diversity, each of which should in some way be represented
in a good linguistic corpus: age, sex and social class or occupation
of the speaker, speaker's emotion, speed of utterance, topic, type
and style of discourse.
6. Nida (1975, 178-186) gives a very detailed list of semantic domains
grouped into 4 heads I Entities, II Events, III Abstracts and IV Relational
with a large number of sub heads.
7. Difference between two or more languages in their phonological,
grammatical and semantic structure.
8. For meaning and definition of flora and fauna see Chapter 5.
9. Malinowski emphasizes the role of three kinds of contexts: the
context of culture, the context of situation and the context of language
for the study of words (The problem of meaning in primitive languages
in Ogden and Richards 1952, 305).
10. Setting of entries 'head word'.
11. See Bull, William 'The use of vernacular languages in education'
in Dell Hymes ed. Language in Culture and Society 530.
12. See also 5.2.
13. See Introduction to Oxford English Dictionary for hyphenated compounds
p. XXXIII.
14. Some dictionaries especially those for learners give free combinations
also, but they also do not give all possible collocations.
15. A proverb is a short familiar epigrammatic saying expressing popular
wisdom, a truth or a moral lesson in a concise and imaginative way.
16. Sledd and Ebbitt (1962, 50 ff) have a number of articles on the
controversy over the inclusion of such colloquial words as 'ain't' in
Webster's III.