Language and Linguistics



Edited by

A. M. Ghatage

R. N. Dandekar

M. A. Mehendale


Ashok R. Kelkar





            The OED tag ‘On Historical Principles’ is understandably emulated without realizing its full implications.  A historian of human culture or of language has no control over the selection of his data.  The historical processes themselves which he presumably wants to understand have sieved the data for him – from his point of view this sievage may be felicitous or disastrous or indifferent; in any case the result is always going to be fragmentary.  I am of course speaking of his principal data – the so-called primary sources.  In relation to the history of language these are texts (or specimens of the use of language).  Besides being fragmentary, the data is also going to be ‘dead’ in the sense of being a residue of transitory events.  For the older stages of the language, for example, we may have texts available for a postmortem possibly with the help of an interpretative tradition, but no sample user of the language with his Sprachgefühl.  The historian, in order to make up for this double handicap, resorts to the principle of uniformity.


            Hutton, a practitioner of geology which is also a historical inquiry unlike physics or chemistry, formulates it in the following manner (as summarized in Labov 1971: 482): processes which operated to produce the geological record are essentially the same as the ones now taking place around us – weathering, sedimentation, volcanic activity, earthquakes, and so on.


            A student of human history unlike that of natural history has a second string to his bow – which can be as treacherous as it can be helpful.  The human beings whose history is being investigated are disposed to stand back from their own activities even as they are immersed in them and maintain a running commentary.  Indeed some of these comments may stem from contemporary or subsequent historians’ activity itself.  In other words, the so-called secondary sources of a historian may have been left behind by his forerunners.


            Focusing more narrowly on the history of language, we must at the outset make a distinction between, on the one hand, a synchronic analysis of a historical stage of a language to which we have access only through texts and not through live contact and, on the other, linguistic history proper, which diachronically connects various stages (including of course the contemporary stage, if any).  By a historical dictionary we shall mean a dictionary that is historical in this second sense.  Whether we have in mind the first or the second sense of the historical study of a language, we may say that the historian of language has before him as data the following:

(a)       All available texts – written texts or, in very recent times, mechanically recorded texts – duly authenticated, edited, dated where possible, and above all assigned to the correct corpus or subcorpus (in medieval North India, for example, it is not always easy to say whether the text is in Old Braj or Old Khari Boli or Old Braj influenced by Rajasthani or Old Rajasthani influenced by Braj and so on; in eighteenth-century Western Europe, it is important to say whether the text is a true narrative or a simulated narrative, colloquial or literary, aristocratic or simulated aristocratic or bourgeois and so on; the linguistic features in a manuscript text or letter may be assignable to the official author, the ghost-writer, the redactor, or the scribe).


(b)      All available secondary responses to language (thus, a parenthetical comment like ‘if I may be forgiven for using a vulgar expression’ helps us in locating the expression that follows along the stylistic scale; definitions of technical terms offered by writers in the field or annotations of earlier texts by later commentators are best regarded as belonging to this category).


(c)       All available previous linguistic descriptions (thus, the attestation of a vocable together with its gloss in an old lexicon or any etymological proposals are clearly on a different footing from the primary attestations).



            The dividing line between the last two is not a hard and fast one.  For example, the etymological proposals made by Sanskrit authors may not always be valid, i.e., may remain unusable under (c); but they may still throw light on the contemporary concepts behind the words so etymologized, i.e., may remain usable under (b).  We will call citations under (b) and (c) meta-citations to distinguish them from the citations proper under (a).  Meta-citations can be treacherous in that they are a fertile source of ghost senses and even ghost words.


            The qualification ‘all available’ used earlier is of course to be taken with several grains of salt.  Only in the really unfortunate case, like that of Gothic or Avestan, is this ideal easily attainable and therefore painfully desirable.  (For such languages, the historical dictionary may well turn out to be a concordance for the whole corpus.)  In the case of Sanskrit the problem is the opposite one – of principled selection from the embarrassing riches.  The selection will be at three points – selection of texts for extraction, selection of excerpts for filing, and selection from filed excerpts for citation in the published dictionary.  The discussion of the principles that will help us in grading the texts in order of relevance or in fixing the density of extraction for each selected text or in determining the size and composition of a fair sample is beyond the confines of this paper.  I may simply note in passing that a single rubric like ‘Sanskrit’ may conceal a variety in spite of the relative stabilization, even fossilization, of certain of its varieties over long stretches of time.  The coinage of rubrics like ‘Vedic’, ‘Classical’, ‘Neo’ is only the first approximation – we have to make finer chronological, geographical, social, and stylistic slices.


            While the shifts observed in the citations and metacitations in the given language will constitute the primary residue of historical processes, one must not lose sight of the evidential value of data from (a) ancestral languages (e.g. data from Pali for a historical lexicographer of Sinhalese); (b) cognate languages; (c) descendant languages; (d) donor and recipient languages; and (e) substratum and superstratum languages.  In the case of Sanskrit, evidence from Avestan, Old Persian, Middle Iranian, MIA (especially Pali in its triple capacity as descendant, recipient, and substratum), NIA (especially early NIA), Tibetan, Dravidian, Munda, Arabic, Old Javanese, etc. may be helpful in various ways – including the reconstruction of obscure senses.  There is of course no documentary evidence for the pre-Sanskrit stages.


            The long and arduous process of collation and interpretation that will follow, and that will in turn feed back into the activity of collecting and selecting the sources, should be envisaged as falling into two somewhat distinct passes corresponding to the vital distinction between chronicle and history proper.  A chronicle records the more accessible historical facts and documents them.  A history reconstructs or extrapolates the less accessible facts and interprets and theorizes about the processes that account for all the facts – whether accessible or extrapolated.  The existing historical dictionaries (when they are historical at all and not merely sketchy guides for the contemporary reader to the interpretation of the older texts of the language) are either chronicles or chronicles struggling to be history but not quite making it.  It is far from my intention to suggest that a chronicle is merely a poor cousin to history.  A good chronicle is an indispensable foundation for a good history; it is good precisely because it is conscious of its importance.  A bad chronicle is bad either because it is sham history or because it is not inspired by a sense of relevance but merely compiled.





            Now, it is perhaps true that given the present state of Sanskrit scholarship, given the vastness of the total corpus, and given the chronic uncertainties of the pre-Muslim chronology of India, we can only attempt a lexical chronicle for the family of languages known as Sanskrit or Old Indo-Aryan, postponing a lexical history to a rather remote future.  At the same time, in order to become a good chronicle this lexical chronicle should be inspired by a clear notion of what a good lexical history of Sanskrit would be like.  One may even go a step further at the practical level, and say that at least some of the entries in it should be preliminary sketches of properly historical entries.  More modestly one may require of this lexical chronicle a historical sophistication.


  (1)      A historian’s sense of evidence, scholarly conscience, and sense of fair play to his reader will lead him not only to the careful weighing of evidence and sifting of the well-established fact from the strong conjecture and the open question but also to putting the reader in possession of the evidence and the theories rejected.  A historical dictionary – whether of the chronicle variety or of the history one – will do well to be liberally sprinkled with question marks.  It will separate the editor’s conclusions and guesses from the evidence and give a “fair” sample of the evidence and the contending theories – so as to permit the reader to draw his own conclusions.  Where the editor is doubtful about the relevance of a piece of information he will include it with the label “potentially relevant”.


(2)           A clear distinction should be established between primary citations – our level (a) – and secondary meta-citations – those at level (b) or (c) or bordering on either.


  (3)      Where absolute chronology cannot be maintained, attempts should be made to evolve techniques for establishing and presenting to the reader a relative chronology – at least a relative slabwise chronology comparable to the geological stratum-labels like Upper and Lower Jurassic.


  (4)      One must not mistake the merely marginal or ad hoc (and therefore attractively curious or exotic) for the central core feature (and therefore often dully quotidian or run-of-the-mill).  A felicitous literary departure is grist to the linguistic historian’s mill chiefly because it may set a new norm for later writers to follow.  A sampling of mediocre writers is as much needed as taking the unavoidable classics.  A historical dictionary is not a chrestomathy of literary jewels, though of course it certainly need not deliberately avoid the jewels.  A locus classicus is important because it is very often a pace-setting event.  The other pitfall is that the commendable desire to be exhaustive may rob the editor (and therefore eventually the user of the dictionary) of the proper perspective.  It is more useful to give a larger number of citations for the central uses of a vocable than to give citations exhausting all its occasional or short-lived uses.  Marginal uses include instances of words coined ad hoc only to be discarded later and also insertions from another language not as a borrowing but only as a citation or a temporary switch from the main language.  While offering explicit statistical information or even intelligent guesses about the shifting ‘popularity’ of a word or a word-sense from period to period may not be feasible, a historical dictionary may find it possible to imply such information “in a rough and ready way, - by the proportions of references given under the different heads” (Aitken 1971).


 (5)       The true historian is not the one who is lost to the present, but rather the one who has a lively sense of (a) the past that lives in the present and (b) contemporary history that is being made around him.  It should not sound incongruous if I urge the prospective lexicographer of Sanskrit to look at contemporary chronicling of semantic change, the record of the lexical stock market (vocable A displacing vocable B in sense X after a tough competition), and the birth and death register of words – as seen, for example, in American Speech or Vie et langage.  (A convenient sampling from British English is available in Bhide 1948, 1970.) OIA vocables must have been subject to the same forces of weathering, sedimentation, and volcanic displacement to which NIA vocables are subject.  In talking about volcanic displacement I have alluded to the possibility of major discontinuities of usage.  Only an unawareness of this fact and of the earlier-mentioned distinction between central and marginal can explain but not justify the blitheness with which post-Dravidic or neo-Sanskrit innovations are often “supported” by Sanskrit scholars by authentic but irrelevant citations and, worse still, metacitations.  (The justification, if indeed it can be called such, is very often the Brahmanical reluctance to admit that the well of Sanskrit has been defiled by non-Indo-Aryan borrowings or substratum interferences.)

            The insights gained by an awareness of the living past and contemporary history and their application through historical imagination to the traditional scholarly exposure (what is known as “being steeped in Sanskrit”) should be made available to the team of editors and eventually to the readers by being codified in a series of notes on recurring features, problems, cultural domains, and processes.  Thus there could be a series of alphabetized entries entitled, say, astronomy and astrology, ayurveda, compounds, easternisms, euphemisms, gatis and other idioms, kosa literature, literary expressions, nyayas, plant names, Prakritisms, proverbs, verb paradigm, and so on.  The entry on compounds will, among other things, spell out the editorial decisions as to what compounds to include as part of the working capital of the language (Sprachgut) and what to exclude as ad hoc nonce-formations that, like most phrase and clause level formations, constitute merely the linguistic turnover.  The entry on euphemisms will tell us, for example, what the euphemizable items are, what the characteristic modes are, whether there have been any major changes, and so forth.  (One may thus report that English ladies stopped sweating in the 19th century or that Marathi speakers hide behind English borrowings in talking about unpleasant things like widowhood, impotence, or even death.)  The drafting of such Guide Entries at an early stage in the editing is a desideratum.  They will set up a comprehensive code of mutually consistent basic editorial decisions so that the specific decisions in a given entry will not be ad hoc decisions but applications of certain principles.  Of course these principles will also spell out how much ground is left for the exercise of editorial discretion.  It is only fair that the reader should be given a glimpse of the editorial kitchen.  (A selection of these has a claim for being included even in a sample fascicule.)





            Finally, one may turn to the vexed question of the arrangement of the various uses of a vocable within the entry in a dictionary.  On the face of it, the answer seems to be fairly straightforward: in a descriptive, synchronic dictionary the arrangement should be in an order of descending frequency as seen in the attestations of the period under consideration; in a historical, diachronic dictionary the arrangement should be in an order of increasing recency of the first attestation of that sense.  If this apparently straightforward answer does not work, it is not merely on account of practical exigencies like the reader’s convenience and the printer’s incompetence (this last factor certainly cannot be ignored in an underdeveloped country).  The reasons lie deeper.


   (1) The synchronic order is certainly not linear – sense 1, sense 2, etc., in a mechanical order of frequency will be a travesty of how the word actually behaves.  The uses at any given stage in the history are best thought of as a multiply spreading and branching network.  Ideally, a descriptive lexical entry will open with a diagram consisting of a set of variously interconnected nodes.  Since the order of printing is linear, the subdivisions will be arranged and numbered in a quasi-linear fashion; to take a sample:


            1, 1a, 1a1, 2, 2a, 2b, 2b1, 2b2, 3, …


            This is an alphanumeric notation which alternates figures and letters and which selects one node (numbered 1) as the “unmarked” or neutral meaning – often called Grundbedeutung.  As a rule but by no means invariably this basic meaning is the most frequent, the least specialized, the most general, the least figurative, the historically earliest, etc.  The critical test is whether the meaning is the one which will most readily suggest itself to the user if the vocable is mentioned out of context rather than used within a context.  In relation to a synchronic account of a language that is no longer current, one may want to recast this test slightly: the basic meaning is the one that an editor interpreting and annotating a text is most disposed to pick up so long as there are no clear contextual pointers in a different direction.  In the rare cases where a vocable may have more than one equally viable neutral sense (for example, “rank, class”, “sequence, arrangement”, and “mandate” in the case of the word order in English), one can start with a summary paragraph of main senses 1, 2, 3, the order of the subsequent paragraphs being based on some non-synchronic extrinsic criterion.  (Consider the Concise Oxford Dictionary entry for the word order.)  Note that a given sense under one vocable may bear a relation of perfect synonymy with a given sense under another vocable.  The lack of perfect synonymy normally refers to the vocables in their total ranges.  An incidental advantage of this notation is that it permits us to make use of the anuvṛtti principle – everything said under 1 is to be carried forward to 1a, 1a1; that under 1a to 1a1; and so on – unless the contrary is stated.
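            The carry-forward convention just described can be made concrete with a small sketch.  The Python fragment below is only an illustration under stated assumptions: the tag syntax (alternating runs of figures and lower-case letters, beginning with a figure), the function names, and the sample notes are all hypothetical and are not taken from any existing dictionary.  It also leaves out the proviso “unless the contrary is stated”, which would need an explicit override mechanism.

```python
import re

def split_tag(tag):
    """Split a sense tag into its alternating parts, e.g. '2b1' -> ['2', 'b', '1']."""
    parts = re.findall(r"\d+|[a-z]+", tag)
    # Reject anything that is not a clean figure/letter alternation starting with a figure.
    if not parts or "".join(parts) != tag or not parts[0].isdigit():
        raise ValueError(f"not a valid sense tag: {tag!r}")
    return parts

def ancestors(tag):
    """Return the tags whose statements carry forward to `tag`, outermost first."""
    parts = split_tag(tag)
    return ["".join(parts[:i]) for i in range(1, len(parts))]

def inherited_notes(tag, notes):
    """Collect the notes stated under the ancestors of `tag`, then its own note."""
    chain = ancestors(tag) + [tag]
    return [notes[t] for t in chain if t in notes]

# Hypothetical notes attached to three nested senses (placeholders, not real data).
sample_notes = {"1": "note stated under 1", "1a": "note stated under 1a",
                "1a1": "note stated under 1a1"}

print(ancestors("1a1"))
print(inherited_notes("1a1", sample_notes))
```

            Deriving the ancestors from the tag itself, rather than storing parent links, keeps the notation the single source of the hierarchy; whether that is desirable in a real editorial system is a design choice this sketch does not settle.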


(2)                The diachronic order is also more complicated.  Suppose there are five main historical stages (to be denoted by Roman numerals) and six senses (to be denoted by capital letters). The situation may well look like this in a chart that may be placed at the beginning of the entry.


                        A            I-IV

                        B            II-III, V

                        C            III

                        D            III-IV

                        E            III-IV

                        F            IV-V


            Note the eclipse of Sense B in stage IV and its revival in stage V.  It is not enough to hunt for the earliest citations for a given sense; one must hunt for other ‘critical’ citations – the latest, the earliest revival, the latest before a gap, the transitional citation linking two senses, etc.  Again, while the order between C, D or between C, E can be rationalized under the chronology principle, that between D, E cannot.  The actual dates of first available attestations are seldom the dates of first usage except in the special case of the innovative locus classicus.  The same goes for the dates of the last attestations or of gaps in attestations.  Resort will have to be taken, therefore, in such a tie (as the one between D, E) to some non-diachronic, extrinsic criterion.
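            The ‘critical’ points and the tie just mentioned can be read off mechanically once the chart is held as data.  The following Python sketch assumes, purely for illustration, that stages are numbered 1 to 5 and that each sense is recorded with the list of stages in which it is attested; the function names and the dictionary layout are hypothetical.

```python
from collections import defaultdict

def critical_points(stages):
    """Report first and last attested stage and any gaps (last stage before gap, stage of revival)."""
    stages = sorted(set(stages))
    gaps = [(prev, cur) for prev, cur in zip(stages, stages[1:]) if cur - prev > 1]
    return {"first": stages[0], "last": stages[-1], "gaps": gaps}

def tied_senses(attested):
    """Group senses that share both first and last stage; chronology alone cannot order them."""
    groups = defaultdict(list)
    for sense, stages in attested.items():
        points = critical_points(stages)
        groups[(points["first"], points["last"])].append(sense)
    return [sorted(group) for group in groups.values() if len(group) > 1]

# The schematic chart of six senses over five stages, transcribed as stage numbers.
attested = {
    "A": [1, 2, 3, 4],
    "B": [2, 3, 5],      # eclipsed in stage 4, revived in stage 5
    "C": [3],
    "D": [3, 4],
    "E": [3, 4],
    "F": [4, 5],
}

for sense, stages in attested.items():
    print(sense, critical_points(stages))
print("ties:", tied_senses(attested))
```

            The sketch only locates where critical citations should be sought; it says nothing about how a tie is then to be broken, which remains an editorial decision.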


(3)               Even assuming that the difficulties mentioned under (2) could be resolved and assuming that the dated record is ample enough to permit such clear periodwise assignment of the various senses, the fact remains that in a fundamental sense the sample chart may be a travesty of history.  A true picture will rather look, say, like the following:


                        I                       1

                        II                      1, 1a, 2, 2a

                        III                    1, 2, 2a, 3

                        IV                    2, 2a, 3, 3a, 3b

                        V                     2, 2a, 2b, 3, 3a, 3b, 3b1


            The loss of sense 1 after stage III may thus render senses 2, 3 disjoint.  Also, there may be reorderings.  Sense 2b may change its “loyalty” and get attached to 3; what is merely a minor ‘shade’ or ‘slant’ under Sense 1a at one stage may become a distinct Sense 1a1 at the next stage; Sense 2 may get “promoted” to the status of being Sense 1, while the erstwhile Sense 1 may become marginal; what are Senses 1 and 2 at one stage may have been etymologically quite different vocables or at least etymological doublets at an earlier stage; and so on.  In essence, the diachronic picture has to incorporate the successive synchronic pictures like the stills in a film.


            The distinction between chronicle and history is based, to some extent, on the distinction between the documentation of the past and its reconstruction.  Since documentation and reconstruction feed on each other and since a good chronicle is always struggling towards being a good history, the two perspectives, synchronic and diachronic, cannot be kept apart.  A good diachronic account calls for a prior sound synchronic analysis of each of the stages.  Also, the diachronic order and the synchronic order of the latest stage may often resemble each other.  Thus, the sample alphanumeric set given under (1) above may reflect the diachronic order.


                        I                       1

                        II                      1, 2

                        III                    1, 2, 3

                        IV                    1, 1a, 2, 2a, 2b, 3

                        V                     1, 1a, 1a1, 2, 2a, 2b, 2b1, 2b2, 3


            This suggests that an arrangement of this entry in the following manner will not be too difficult to read or too distorting.  Thus, the earliest sense is also the unmarked, basic sense.


                        1/ I-V

                        1a/ IV-V

                        1a1/ V

                        2/  II-V

                        2a/ IV-V

                        2b/ IV-V

                        2b1/ V

                        2b2/ V

                        3/  III-V
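            The step from a stage-by-stage listing to the arrangement above is a simple regrouping, sketched here in Python.  The stage lists used are an assumption chosen to agree with the arrangement just shown; the function names are hypothetical, and the sketch reports only the first and last stage of each sense, so it would silently hide any gap in attestation.

```python
import re

ROMAN = {1: "I", 2: "II", 3: "III", 4: "IV", 5: "V"}

def tag_key(tag):
    """Sort key giving the order 1 < 1a < 1a1 < 2 < 2a < 2b < 2b1 < ... < 3."""
    parts = re.findall(r"\d+|[a-z]+", tag)
    return [(0, int(p)) if p.isdigit() else (1, p) for p in parts]

def arrange(by_stage):
    """Regroup {stage: [sense tags]} into lines of the form 'tag/ first-last'."""
    spans = {}
    for stage in sorted(by_stage):
        for tag in by_stage[stage]:
            spans.setdefault(tag, []).append(stage)
    lines = []
    for tag in sorted(spans, key=tag_key):
        first, last = spans[tag][0], spans[tag][-1]
        span = ROMAN[first] if first == last else f"{ROMAN[first]}-{ROMAN[last]}"
        lines.append(f"{tag}/ {span}")
    return lines

# Assumed stage-by-stage lists, consistent with the arrangement shown above.
by_stage = {
    1: ["1"],
    2: ["1", "2"],
    3: ["1", "2", "3"],
    4: ["1", "1a", "2", "2a", "2b", "3"],
    5: ["1", "1a", "1a1", "2", "2a", "2b", "2b1", "2b2", "3"],
}

for line in arrange(by_stage):
    print(line)
```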


            It is true that this cannot accommodate reorderings of the kind described earlier.  But this need not seriously disturb us if we realize that such reorderings are not very common; that, where they do occur, supplementary visual aid can be provided in the entry; and that, for a dictionary that is a lexical chronicle only struggling to prefigure a future lexical history, all this is unrealistically ambitious in any case.


            We must not of course let our preoccupation with citations and their arrangement make us lose sight of the simple fact that a historical dictionary is also a dictionary and as such faces all the problems that a synchronic descriptive dictionary faces.  If it doesn’t face them, it means it is not doing its whole job.  (I have elsewhere (Kelkar 1970-1) discussed some of the general problems faced by any dictionary, whether historical or not and whether bilingual or not.)  The tendency to look upon the explanations of meaning in a historical dictionary merely as convenient expendable tags for each of the citation subsets in an entry is dangerous.  (Cf. Aitken 1971: in setting out the defining characteristics of the genre of which the OED is a paradigm example, Aitken frankly states: “The definitions and descriptive notes, which are also a normal feature of such dictionaries, may be regarded as fulfilling a somewhat secondary purpose, that of sign posts or labels to the particular subset of quotations which follows.”)  The same goes for the neglect of idioms and collocations.  After all, the reader has as much right to the earliest citation for the idiom as such as to that for the individual words in the idiom.  The reader of a historical dictionary of Bangla would want to know since when Bangla speakers started “eating” water and cigarettes.  Finally, the dictionary tends to take upon itself the duties of a word finder (or thesaurus) and a cultural encyclopaedia, and the historical dictionary need be no exception.  A historical dictionary of English should not only trace the trajectories of the two words hound and dog independently but also their intersection – dog displacing hound as the least specialized word for Canis familiaris.  This could be achieved by appending a word-finder-type synonymy section to the entry for dog.  Similarly the entries for teacher or nurse will not only record the first attestation and increasing use of she in relation to these nouns but also connect this with women’s emancipation.




            The danger, to my mind, is not the non-realization of the ambition, but the pedestrian shirking of an ambition.  In other words our modesty may land us with a bad chronicle.


            I am aware that this paper is short on live examples and long on simulated schematic examples illustrating an ambitious theory.  But a non-Sanskritist can justify his presence in a gathering of distinguished Sanskritists discussing a Sanskrit historical dictionary only by venturing to tread on a ground that the angels may want to keep away from.







I.  The Anatomy of a Historical Dictionary


(a)    Guide to the scanning of an entry

(b)   Guide consisting of alphabetized entries

(c)    Skeleton linguistic analysis: rules for proceeding from phonological spelling to phonetic spelling(s); from grammatical analysis into morphemes to phonological spelling; from grammatical label to privileges of occurrence; from orthographic spelling to phonologic and phonetic spelling; from transliteration to original orthography; from base to productive derivatives

(d)   Body of the dictionary

(e)    Appendix of alphabetized entries of borderline cases like proper names, bound affixes and bases, abbreviations.



II.  The Anatomy of a Historical Dictionary Entry


(a)      Lemma: transliteration, original script spelling, phonological spelling, phonetic spelling (or any subset of these in terms of which the rest can be predicted)

(b)     Alternant spelling (of various sorts)

(c)      Grammatical function-class label

(d)     Etymology: ancestral, co-successor, successor forms (taking you beyond the confines of the language concerned)

(e)      Grammatical structure: simple or complex?; if complex, the constituents and the structure-type label (operating entirely within the language concerned)

(f)       Accompaniments: inflectional paradigm, syntactic selections, idiomatic collocations, concord and government label

(g)      Explanations: descriptions, glosses, cultural notes

(h)      Citations (critical and other) and metacitations

(i)        Word-finder section: comparables (synonyms, antonyms, hyponyms, hypernyms, words confused) and derivatives (affixal, reduplicative, compositional)

(j)       To be inserted at appropriate points in the above arrangement are distinctive alphanumeric tags (1, 1a, etc.), style labels, social labels, region labels, period labels



Note: In Sanskrit the traditional grammatical analyses (in terms of Uṇādi sūtras, the main grammatical sūtras, etc.) will be cited under (e), whether valid or not (if not valid, to be introduced by “pace”).  The traditional etymologies will be cited under (d) if valid and as metacitations under (h) if invalid.  The annotations in commentaries will be cited as metacitations under (h).  The loose term derivation spans (d) and (e) above.
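The anatomy of an entry outlined in II can be pictured as a record structure.  The Python sketch below is only one possible rendering: the class and field names are hypothetical, the letters in the comments refer to the items (a) to (j) above, and the sample values are placeholders rather than real lexical data.

```python
from dataclasses import dataclass, field

@dataclass
class Citation:
    text: str
    source: str
    period: str = ""
    meta: bool = False   # True for metacitations (commentaries, old lexica, invalid etymologies)

@dataclass
class Sense:
    tag: str                                        # (j) alphanumeric tag such as "1" or "1a"
    explanation: str                                # (g) description, gloss, cultural note
    labels: list = field(default_factory=list)      # (j) style, social, region, period labels
    citations: list = field(default_factory=list)   # (h) citations and metacitations

@dataclass
class Entry:
    lemma: str                                          # (a)
    alternants: list = field(default_factory=list)      # (b)
    function_class: str = ""                            # (c)
    etymology: str = ""                                 # (d) beyond the language concerned
    structure: str = ""                                 # (e) within the language concerned
    accompaniments: list = field(default_factory=list)  # (f)
    senses: list = field(default_factory=list)          # (g), (h), (j)
    word_finder: dict = field(default_factory=dict)     # (i) comparables and derivatives

    def metacitations(self):
        """Return every metacitation in the entry, kept apart from citations proper."""
        return [c for s in self.senses for c in s.citations if c.meta]

# Placeholder entry; none of these strings is real lexical data.
entry = Entry(
    lemma="LEMMA",
    function_class="noun",
    senses=[Sense(tag="1", explanation="GLOSS", citations=[
        Citation(text="QUOTATION", source="TEXT A", period="II"),
        Citation(text="COMMENTATOR'S GLOSS", source="COMMENTARY B", period="IV", meta=True),
    ])],
)
print(len(entry.metacitations()))
```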




AITKEN, A.J. 1971.  Historical dictionaries and the computer. In: The Computer in literary and linguistic research … a Cambridge symposium. London: Cambridge University Press, 1971, p.3-17.


BHIDE, H.S. 1948.  A Study in the development of the English vocabulary.  U. of Bombay, Ph.D. diss. Unpublished.

--                     1970.  Lexicographical notes on English.  Indian Linguistics 31, 162-73. Based on Bhide 1948.


KELKAR, Ashok R. 1970-71.  The Anatomy of a dictionary entry with samples proposed for a Marathi-English dictionary. Indian Linguistics 29 (1968). 143-9; 30 (1969). 50-64.


LABOV, William. 1971.  Methodology. In: DINGWALL, William Orr (ed.). A Survey of linguistic science. College Park, Maryland: Linguistics Program, U. of Maryland. Pp. 412-91.





In revising and slightly enlarging this paper from the version read at the Seminar, I benefited from the discussions. This was published in: A.M. Ghatage et al. (ed.), Studies in historical Sanskrit lexicography, Deccan College, 1973, p.57-69.


            The paper was presented at a Seminar on Historical Sanskrit Dictionary at Deccan College, Pune, December 1972. Subsequently I came across Aitken 1971, which confirmed some of my hunches.