ABSTRACT:
Sound quality is defined; segmental and overall
articulatory qualities are distinguished from each other and from overall voice
qualities. A classificatory grid for overall sound features is proposed; where
possible, auditory distinctions are correlated with physiological and acoustic
distinctions. Uses for the conceptual framework for future work in speech and
music are suggested.
The term ‘sound quality’ denotes the
field of those auditory distinctions (a) that are not to be fitted in the pitch
and loudness scale; (b) that are needed along with the pitch and loudness scale
to enable us to say that a sound of a specified pitch, loudness, and quality lasts
for so long or shifts through time to another sound; (c) that broadly correlate
with the acoustic information on the energy distribution profile in the wave along
the frequency scale; and (d) that correlate with the sound production information
on the complexity of the vibrations of the sound sources and more importantly
on the damping and resonance channel properties of the speech gestures.
Quality distinctions in human speech are broadly of two sorts:
(i) Those subject to relatively greater control through modifying the controlled
and ballistic gestures of the mobile portions of the speech tract and through
these gestures modifying the ‘shape’ – complex or otherwise – of the vibrations
in the sound sources. The glottal tone (the tone of the larynx itself before
being modified) is believed to be a tone approximating a rectangular wave
(cf. Joos 1948: § 2.1); the whistle tone, on the other hand, approximates a sine
wave (and hence a pure tone). The shape of the oral cavity and the activation
or otherwise of the nasal cavity are largely responsible for the vowel-like
qualities superimposed on the glottal tone. Acoustically, these distinctions
correlate with the presence of formants (energy concentration zones along the
component frequency scale) and the presence or absence of an identifiable
fundamental frequency. Auditorily, these consist in the vowel and consonant
qualities (a) identified as segments and syllables or (b) identified as overall
qualities. We shall not be concerned with the segmental qualities in this note.
To the latter sort, (b), we shall give the name overall articulatory qualities.
(ii) Those much less subject to control which physiologically correlate with the
degree of vocal-fold tension, presence or absence of moisture, shape and size of
the laryngeal-pharyngal cavities (partially controlled by the local musculature
and by the placement of the root and the far back of the tongue), and the like.
Acoustically, these correlate with the ‘fillings’ in the formant zones (Joos 1948).
To these we shall give the name overall voice qualities.
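The contrast drawn under (i) between a tone approximating a rectangular wave and
a tone approximating a sine wave can be made concrete by comparing their overtone
content. What follows is only a minimal sketch in Python and not part of the
original argument: the function name harmonic_amplitudes, the sample count N, and
the two idealized waveforms are illustrative assumptions. It computes the
amplitudes of the first few harmonics of one period of each wave.

```python
import math

def harmonic_amplitudes(samples, max_harmonic):
    """Amplitudes of harmonics 1..max_harmonic for one period of a wave.

    A direct (slow) discrete Fourier sum; adequate for a short illustration.
    """
    n = len(samples)
    amps = []
    for k in range(1, max_harmonic + 1):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        amps.append(2 * math.hypot(re, im) / n)
    return amps

N = 512  # samples per period (an arbitrary, assumed resolution)

# Idealized stand-ins: a pure tone and a symmetric rectangular wave.
sine = [math.sin(2 * math.pi * i / N) for i in range(N)]
rectangular = [1.0 if i < N // 2 else -1.0 for i in range(N)]

sine_amps = harmonic_amplitudes(sine, 7)
rect_amps = harmonic_amplitudes(rectangular, 7)

for k, (a_sine, a_rect) in enumerate(zip(sine_amps, rect_amps), start=1):
    print(f"harmonic {k}: sine {a_sine:.3f}  rectangular {a_rect:.3f}")
```

Under these assumptions the sine wave shows energy at the fundamental only, while
the rectangular wave also carries odd-numbered overtones whose amplitudes fall
off roughly as 1/k. Real glottal and whistle tones are of course only
approximated by these shapes.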
So we have the following speech qualities:
(i) (a) Segmental articulatory qualities
    (b) Overall articulatory qualities
(ii) Overall voice qualities.
To the extent that any of these are beyond conscious control, they serve to identify
the speaker (or the singer), to give a clue to the state of the body (sunken,
drunken, choked voice, bad cold), or to characterize the language (or mode of
singing). Now we shall set up a grid for classifying (i) (b) and (ii) in terms
of independently variable features.
Overall Articulatory Qualities
(1) Overfronted/Overretracted – with articulations relatively further
fronted/retracted in the mouth. Thus, Hindi and Urdu are respectively overfronted
and overretracted.
(2) Palatalized/Labiovelarized – with an accompanying y- or i-quality/w- or
u-quality, i.e. respectively with the front of the tongue raised towards the
hard palate/with the back of the tongue raised towards the soft palate and the
lip corners brought closer to each other. Thus, aggressive veḍvņe/regressive
‘pouting’ in Marathi is respectively palatalized/labiovelarized. 1
(3) Nasalized/Denasalized – with slight nasalization of the non-nasal sounds/with
slight denasalization of the nasal sounds. Thus, whining is nasalized; a bad
cold denasalizes speech.
(4) Breathy – with glottal friction superimposed on normal voice or breath.
Thus, in Gujarati or Hindi an h consonant is often dissolved into a breathy
voice (shown here by underlining), as in Gujarati gāndhi as gāndhī
and Hindi tarah as tara. Drunken speech often has over-voicing, which is
probably the opposite of breathiness.
(5) Creaky – with glottal trill superimposed on normal voice or breath.
Thus, very low voice in singing or speech often goes with a creaky quality (as
of an unoiled door hinge).
(6) Whispery – with whisper-glottis replacing voicing in the voiced sounds.
Thus, extra soft speech is often whispered.
(7) Falsetto – with falsetto phonation replacing normal
voicing in the voiced sounds. Thus, falsetto may be used by a male mimicking a
female voice and in certain types of singing.
Overall Voice Quality
(8-9) I propose that overall voice qualities can be placed along two independent
scales shown below:

                               Lightly Damped    Highly Damped
Acute (energy concentration
in higher overtones)           Light voice       Strident voice
Grave (energy concentration
in lower tones)                Mellow voice      Heavy voice
The term ‘strident’ with its connotations of unpleasantness is clearly
unsatisfactory. A better substitute is invited from the reader.
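The two-scale proposal amounts to a lookup on two independent binary features.
As a minimal sketch only, and assuming that the grid is read row by row (Acute:
light, strident; Grave: mellow, heavy), it can be written out in Python as
follows. The names Spectrum, Damping, VOICE_GRID, and voice_quality are
illustrative and do not come from the note itself.

```python
from enum import Enum

class Spectrum(Enum):
    ACUTE = "acute"  # energy concentration in higher overtones
    GRAVE = "grave"  # energy concentration in lower tones

class Damping(Enum):
    LIGHT = "lightly damped"
    HIGH = "highly damped"

# Cell labels as read from the grid (an assumed row-by-row reading).
VOICE_GRID = {
    (Spectrum.ACUTE, Damping.LIGHT): "light voice",
    (Spectrum.ACUTE, Damping.HIGH): "strident voice",
    (Spectrum.GRAVE, Damping.LIGHT): "mellow voice",
    (Spectrum.GRAVE, Damping.HIGH): "heavy voice",
}

def voice_quality(spectrum: Spectrum, damping: Damping) -> str:
    """Return the overall voice quality label for one cell of the grid."""
    return VOICE_GRID[(spectrum, damping)]

print(voice_quality(Spectrum.GRAVE, Damping.LIGHT))
```

Because the two scales are independent, every combination of values names
exactly one cell, which is what a two-factor characterization requires.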
Overall Non-qualitative properties
Though this Note is not concerned with
Pitch, Loudness, and Concatenation, it will be useful to mention the overall properties
in respect of these since they are often confused or associated with Articulatory
Qualities or with Voice Qualities. The first three are pitch features, the next
two loudness features, and the last two concatenation features.
(10) Overhigh/Overlow – with pitch movements in the relatively higher/lower
portions of the pitch scale. Thus, females and children tend to speak overhigh.
(11) Overstretched/Oversqueezed – with pitch movements spread over an
extensive/restricted portion of the pitch scale. Thus, speech and certain modes
of singing may be monotonous (i.e. oversqueezed).
(12) Pitch Quaver (vibrato) – with extra rapid up and down pitch movement
superimposed over the normal pitch movements of speech or singing.
(13) Overloud (forte)/Oversoft (piano) –
with loudness movement in the relatively louder/softer portions of the loudness
scale. Thus, shouting is both overhigh and overloud.
(14) Loudness Quaver (tremoloso) – with rapid increase and decrease alternating
in loudness. Thus, laughter is often characterized by loudness quaver (effected
in this case by spasmodic breath quaver).
(15) Rapid (allegro)/Medium (moderato)/Slow (lento) Tempo – with relatively
rapid/medium/slow changes in segmental articulatory qualities and in normal pitch
movements, and more and longer/medium/fewer and shorter breaks and pauses. Also
tempo changes: accelerando/rallentando.
(16) Overabrupt (staccato)/Oversmooth (legato) – with relatively more
abrupt/smoother transitions in respect of segmental articulatory qualities and
normal pitch and loudness movements.
A loudness feature corresponding to (11) is possible, but appears to be
of little practical significance.
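Features (10) and (11) are stated relative to some normal placement and spread of
pitch movements, which the note does not quantify. The following Python sketch
shows one way such a judgment could be made operational. Everything numerical in
it is a hypothetical assumption: the reference pitch ref_hz, the reference range
ref_range_st, the tolerance level_tol_st, and the factor range_factor are
placeholders, not values proposed here.

```python
import math
from statistics import median

def semitones(f_hz, ref_hz):
    """Distance of f_hz from ref_hz on a semitone scale."""
    return 12 * math.log2(f_hz / ref_hz)

def pitch_features(f0_hz, ref_hz, ref_range_st, level_tol_st=3.0, range_factor=1.5):
    """Label a pitch contour for features (10) and (11).

    f0_hz        -- pitch values of the stretch of speech, in Hz
    ref_hz       -- assumed 'normal' pitch level for the speaker (hypothetical)
    ref_range_st -- assumed 'normal' pitch range in semitones (hypothetical)
    """
    st = [semitones(f, ref_hz) for f in f0_hz]
    level = median(st)          # where the movements sit on the scale
    span = max(st) - min(st)    # how much of the scale they cover

    if level > level_tol_st:
        register = "overhigh"
    elif level < -level_tol_st:
        register = "overlow"
    else:
        register = "neutral"

    if span > ref_range_st * range_factor:
        spread = "overstretched"
    elif span < ref_range_st / range_factor:
        spread = "oversqueezed"
    else:
        spread = "neutral"
    return register, spread

print(pitch_features([220, 233, 247, 262], ref_hz=120, ref_range_st=7))
print(pitch_features([100, 120, 160, 200], ref_hz=120, ref_range_st=7))
```

The two labels are computed separately, in keeping with the treatment of (10)
and (11) as independently variable features: a contour can be overhigh and
oversqueezed at once, or neutral in level and overstretched.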
Scope for future work
(a) Determining more reliably the physiological and acoustic correlates of
these various distinctions. Thus, overhigh and overlow when distinguished in the
same person’s voice are thought to be correlated with a certain positioning of
bones and cartilages and musculature (voice set or register). Consider the beginning
towards a physiological characterization seen in Catford 1964.
(b) Verifying the hypothesis about the two-factor characterization of voice
qualities proposed here, perhaps in terms of factor analysis of responses to
stimuli in an experimental set-up. If one sought to extend it to non-human
sound sources (such as musical instruments), an attempt will have to be made to
arrive at distinctions analogous to the one between (i)(a), (i)(b), and (ii). The
‘strokes’ of a tabla or sitar thus would seem to be analogous to speech syllables.
(c) Carefully defining non-technical descriptions. Thus we have earlier proposed
to define shouting as overhigh, overloud speech. This will be closely linked with
the immediately preceding line of investigation. Some typical English terms that
lend themselves to this sort of conceptual analysis are:
wheezing, crying, pouting, breaking, choking, groaning, moaning, whimpering,
yodeling; ventriloquist’s voice; vibrant, full, strong, rasping, dropped, thin,
covered, closed, faint, suppressed, muffled, clear, sharp, flat, dark, deep, rich,
shrill, hard, guttural, harsh, dry, jarring, husky, thick, throaty, hollow, booming,
sepulchral, ringing, soaring, spooky, sultry, gentle, syrupy, velvety (all adjectives
of voice); stammer, stutter, sotto voce, chanting, singsong, shouting, yelling,
squealing, slurring, groan, moan.
Note also the German terms Schonstimme and Kraftstimme.
(d) Last but not least, an interpretation (i.e. an identification of the functions)
of all these distinctions in modes of speaking and singing (such a study of course
presupposes some success in the other three lines of investigation).
The investigations will have to be undertaken jointly and severally
by phoneticians, linguists, elocutionists, dramaturgists, and musicologists with
the aid of physicists, physiologists, neurologists, and experimental psychologists.
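The kind of definition envisaged under (c) can be pictured as treating each
non-technical term as a bundle of overall features. The Python sketch below is
illustrative only. The entry for shouting follows the definition given in this
note; the entries for whining and booming are assumptions built from the
examples under (3) and in footnote 1 (reading the pitch feature there as
overlow), and the names LAY_TERMS and matching_terms are hypothetical.

```python
# Each lay term is defined by the set of overall features it requires.
LAY_TERMS = {
    "shouting": frozenset({"overhigh", "overloud"}),   # defined in the text
    "whining": frozenset({"nasalized"}),               # assumed from example (3)
    "booming": frozenset({                             # assumed from footnote 1
        "labialized", "overlow", "lowered-jaw articulation", "heavy voice",
    }),
}

def matching_terms(observed):
    """Return, in alphabetical order, the lay terms whose defining
    features are all present among the observed features."""
    observed = set(observed)
    return sorted(term for term, feats in LAY_TERMS.items() if feats <= observed)

print(matching_terms({"overhigh", "overloud", "nasalized"}))
```

A table of this sort makes it easy to see which terms are left undefined by the
present grid and which differ only in a single feature.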
REFERENCES
Catford, J.C. 1964. Phonation types: The classification of some laryngeal
components of speech production. In: Jones, Daniel (dedic.) 1964.
Jones, Daniel (dedic.) 1964. In memory of Daniel Jones. London: Longman.
Joos, Martin 1948. Acoustic phonetics. Language monographs. Baltimore, Md.:
Linguistic Society of America at the Waverly Press.
COLOPHON
Those interested in evolving a modern Indian Sanskrit-based terminology may
consider the following suggestions:
Speech sound features                     svana-lakṣaņa
Pitch scale                               sura-šreņī/sāraņī
Loudness scale                            bala-šreņī/sāraņī
Speech quality                            svana-guņa
Articulatory quality                      prayatna-guņa
Voice quality                             ghoṣa-guņa
Segmental features                        varņa-lakṣaņa
Overall features                          adhivyāpī-lakṣaņa
Concatenation features                    samhita-lakṣaņa
Overfronted/Overretracted                 adhi-purogata/adhi-parāgata
Palatalized/Labiovelarized (clear/dark)   adhi-tālu-rañjita/adhi-oṣṭha-mṛdutālu-rañjita
Nasalized/Denasalized                     adhi-sānunāsika/adhi-niranunāsika
Breathy                                   mahāprāņa-rañjita
Creaky                                    spandana-rañjita
Whispery                                  upāṃšu-rañjita
Falsetto (voice)                          bhraṣṭa-(ghoṣa)
Voice/Breath                              ghoṣa/švāsa
Acute                                     tīvra
Grave                                     komala
Lightly Damped                            ninādī
Highly Damped                             ruddhanādī
Overhigh/Overlow                          tāra/mandra
Overstretched/Oversqueezed                capalasura/mitasura
Pitch quaver = Vibrato                    dolitasura
Overloud = Forte/Oversoft = Piano         adhi-prabala/adhi-durbala
Loudness quaver = Tremoloso               dolitabala
Rapid = Allegro/Medium = Moderato/
Slow = Lento Tempo                        druta-/madhya-/vilambita-laya
Overabrupt = Staccato/
Oversmooth = Legato                       adhi-khaņḍita/adhi-dravita
Notes:
(1) The prefix over- does not mean here ‘excessive’ (ati-), but
‘overall, not localized’. (2) Sanskrit svara- means both ‘vowel’ and ‘tone,
note’; we propose here to make use of the modern differentiation between svara
‘vowel’ and sura ‘tone, note’. (3) Sanskrit dhvani-, svana-
both mean ‘sound’; šabda- means ‘sound, speech sound, word’.
We propose to use svana for ‘speech sound’; dhvani for ‘sound (in
general)’; šabda for ‘meaningful speech sound’. Thus dhvani-vijñāna
will be ‘acoustics’ and not ‘phonetics’; in talking about the sound quality of
a musical instrument, we use dhvani-guņa and not svana-guņa;
in talking about sound features in general and not about speech features, we
shall use dhvani-lakṣaņa and not svana-lakṣaņa.
I am indebted to Professor S. N. Salgarkar (Deccan College) for some useful suggestions.
This was published in Indian Linguistics 35:222-6, 1974.
1 The aggressive palatalization, which occasionally breaks
into [yæ yæ yæ], conveys impudent challenge, exasperation and annoyance, and often
accompanies jeering repetition of what the other person has just said. This is chiefly
used between children and between women who have set dignity aside. The regressive
‘pouting’, often accompanied by nasalization, conveys the desire to be petted
or treated indulgently after an error or a misdeed, to pet or console, to whimper.
This is chiefly used between a child and an adult in close relationship and between
lovers (often as a part of baby talk). The latter must be distinguished from another
kind of labialization which is accompanied by overlow pitch, lowered-jaw articulation,
and heavy voice. This combination (we can call it ‘booming’) is used iconically
in describing something as large or forceful. This is chiefly used by or to a
child in telling a story or narrating an adventure. Dr. P. Bhaskar Rao and
Dr. Amar Bahadur Singh inform me that all three are found in Telugu and Hindi.
It will be interesting to examine India as a paraphonological area with subareas.