SPEECH OF THE HEARING IMPAIRED
Chapter-I: Speech Reception in the Hearing Impaired

1.1 Introduction

This short monograph is the result of a project undertaken by the authors on the speech and language characteristics of the hearing impaired population with Kannada as the environmental language. The research project was undertaken for two years, from April 1981 to April 1983. Two main types of hearing impairment were focused on in our research. The subjects of the study, who were eleven in number, fell under the categories of moderately and profoundly hearing impaired. Data from these hearing impaired subjects was collected based on a language evaluation tool prepared specially for the purpose by the authors. The language evaluation tool consisted of the following sections:

Personal data, which gave the name, age, sex, religion, place of birth, present address, mother tongue and other languages known, educational level and occupation of the subjects. Section-1 focused on the history of the problem, Section-2 on the audiological evaluation, Section-3 on therapeutic history, Section-4 on home background, Section-5 on educational history, Section-6 on psychological history, Section-7 on language, and Sections-8 and 9 on occupational history and other relevant aspects. These sections were followed by a discrimination test in which separate sub-sections on production of individual sounds, discrimination of individual sounds, discrimination of words, imitation of individual sounds and imitation of words, sentence types and phrases were included. There was also a comprehension test in which lexical recognition was the focus. Commands were given to the subjects, and their ability to follow these commands was tested. There was a separate section of conversation in which the subjects’ responses to questions put to them were collected. There were certain grammatical functors that were focused upon, both for production and comprehension. These included postpositions and plural markers. The voice qualities of the speech in terms of the rate of speech, intelligibility, rhythm and intonation were also looked into. The evaluation tool the authors used in this project is presented as Appendix-1.

This monograph restricts itself to the presentation of the salient findings of the project as regards the speech characteristics of the hearing impaired. We have organized this monograph in such a way that, even as it presents the salient findings of our research project, students of linguistics and adjacent sciences will find it highly useful for learning something more about the speech characteristics of the hearing impaired. The monograph consists of three chapters. In Chapter-1, we present the general characteristics of the manner in which speech is received, characteristics of auditory speech perception in the hearing impaired, perceptual aspects of acoustic cues as found in the hearing impaired, characteristics of visual and tactual speech reception, and characteristics of speech reception through audition and vision and through vision and tactual modalities. Chapter-2 presents the characteristics of early linguistic behaviour in the hearing impaired. The focus is on vocalization, babbling characteristics and the nature of prelingual deafness. In Chapter-3, we present the speech characteristics of the hearing impaired. Methods of collection and analysis of speech samples of the hearing impaired, and errors of segmental and supra-segmental sounds, are dealt with. Speech characteristics of the hearing impaired as revealed through instrumental analysis are also dealt with. The authors are very well aware that much needs to be done in identifying speech production, reception and perception characteristics of the hearing impaired population with Indian languages as environmental languages. In this monograph we have made some attempt to interpret the speech and language characteristics of an Indian language as found in the hearing impaired population. This must be treated only as a preliminary report.

1.2 Speech Reception

Speech reception is generally a pre-requisite for speech comprehension, which, in turn, is a requirement for normal speech and language production. Generally speaking, speech comprehension precedes speech production in the process of first language acquisition, even though certain structures may be produced before they are fully comprehended.

Perception of speech is noticeable in children, in some manner, very early, say, at six months. From the absence of speech perception ability in the early months, the child progresses towards the adult level of speech perception ability as his age increases. From early reflexive perception of one’s own sounds, the child begins to acquire the perceptual skills of attending to and localizing the sounds emitted by other sources. While one cannot easily suggest a programmatic step by step progression of sound reception from early reflexive perception of one’s own sounds to early attending abilities to other sounds, the categories of reflexiveness, early attending, regular attending and localizing of sounds can easily be attested as the categories that dominate sound perception in babies. It is also noticed that hearing levels continue to improve through the ages of normal children. For example, hearing levels were found to be better at eleven years than at seven for normal children (Richardson, et al., 1976). Also there is a qualitative difference between auditory processing in humans and animals, since auditory processing in humans is conditioned and shaped by factors that are both part of the specific human physiology and anatomy of hearing mechanisms and part of human social processes. In a way the auditory processing in young children, rather babies, may be partly compared with the auditory processing in animals. But even here the comparison, in certain ways, may not be appropriate, as the relationship between the appearance of the cochlear microphonic potential and the onset of auditory function is not identical in the two. Also the first steps of auditory development begin prior to the postnatal period in the human child, whereas the other species are at variance with one another in this regard. The normal child takes nearly nine to twelve months to start his expressive language.

Communicative behaviour in later life is closely linked with auditory competence in early life. With the absence of hearing, or the presence of distorted hearing in the hearing impaired child, the speech perception of the child is also distorted. As a result, the expressive aspect of speech and language of the child is distorted. With hearing impairment, children have problems with short-term memory (Manning, et al., 1977). The child has difficulty in acquiring language efficiently and in a manner appropriate for successful communication. Auditory deprivation from birth leads to higher level processing problems, both linguistic and psychological. Word order difficulties, difficulties with discrimination of phonemes, and delayed and deviant phonological development are some of the problems that are readily attested. Deaf speech has a temporal form distinct from the speech of the normal hearing. Also note that deaf children do not develop the same hemispheric specialization for language processing that normal hearing children do.

In order to compensate for his inability to accurately perceive speech through the ears, the child supplements audition with vision, or uses vision alone as the primary modality for speech reception. Thus, both the motor movement patterns and the acoustic patterns made by the speaker during speech production are utilized by the hearing impaired child for speech perception. Hearing impaired children differ from normal hearing children in several ways with regard to the frequency of use of vision independent of audition and in aid of audition; a difference is also noticed between the two in the functions of audition and vision in communicative acts. The quality of audition and vision, and the degree of dependence on vision in aid of audition, are some other factors that distinguish the two.

Before we proceed further, it is necessary to characterize the process of perception in general and speech perception in particular. Perception refers to the process by which an individual organizes and interprets sensory data he has received on the basis of his past experience; it is an act of categorization, according to which stimuli are received, identified, sorted and given individual meaning (Eisenson, 1972). The term speech perception refers to the process by which the motor/acoustic patterns of speech become linguistic structures in the listener’s mind (Boothroyd, 1978). Auditory discrimination is an integral part of auditory processing. Auditory discrimination includes abilities to contrast sounds in the environment as well as the sounds and their patterns in language. By auditory discrimination, we mean an ability to discriminate between the sounds at the articulatory, acoustic and other cognitive levels.

Speech perception begins with the recognition that what is heard is speech and not noise. This recognition is based on the recognition that what is heard is patterned, that what is heard is uttered in a communicative environment, and that what is heard is a specific language (based generally on the comprehension possibilities), and that what is heard is addressed to someone (facial expressions, face to face communication, physical direction of speech, etc.).

Usually, the hearing impaired use audition and vision as primary modalities for speech perception. Attempts are also made to help them perceive speech through taction. Note that a combination of all three modalities is easily recognized in the speech perception processes of the normals. There is, however, a subtle difference between speech perception in the normals and in the hearing impaired. In the normals a dominant role is assigned to audition in speech perception, whereas such dominance of audition is affected in several ways in the hearing impaired. The complementary nature of the three participating modalities in the normals is different from that found in the hearing impaired.

1.3 Auditory Speech Perception

Broadly speaking, one could identify two types of approaches to the study of speech perception, namely, the motor theory of speech perception and the auditory theory of speech perception. The current research on the role of motor movements in perception, especially of speech perception, indicates that a child perceives sound sequences in terms of his own motor reactions. This motor theory of perception does not deny the importance of the acoustic events, but gives priority to the motor kinesthetic feedback loops in the comprehension and use of language. The perception of speech sounds is seen as more closely related to articulation than to the acoustic stimulus. Speech is perceived by reference to articulation: the articulatory movements and their sensory effects mediate between the acoustic stimulus and its perception. The articulatory movements that a listener makes in reproducing the acoustic patterns help in determining the cues for perception of words. Linguists like Twaddell (1952) also are of the opinion that listeners classify sound sequences on the basis of motor articulation rather than acoustic properties. Further, some psychologists like Cherry (1957) suggest that the order perceived in acoustic events is mediated by articulation. The problem of the hearing impaired, in this school, then, is seen as related not to the problem faced by the deaf in correctly perceiving the acoustic values of the sounds produced, but to the failure on the part of the deaf to relate the acoustic cues to appropriate motor reactions.

The study of auditory perception of speech includes both the physical aspects of sound and the physiological aspects of and apparatus for audition. Auditory perception is conditioned by the physical properties of a particular cochlea. If the frequency of the sound received is below or above that which can be perceived by a particular cochlea, there is no question of auditory perception. Hearing means receiving information through variations in acoustic intensity. This acoustic intensity is converted into a nervous message. When this message is integrated with a structure already available to the individual at his cognitive level, hearing has taken place.

The auditory perception may be imagined to be taking place in several phases. Berry presents three phases in auditory perception of speech, while agreeing at the same time that such identification and segmentation of auditory processes of perception are merely a kind of abstraction for the matching of specific neural activity with the temporal phase. In the first phase, activation of neurons takes place, and this requires a chemical mediator which is responsible for the sensitivity of the end organs. Further, the change in the end organs should be transformed into a form of energy capable of discharging the nerve terminals. In the second phase, the wave pattern makes connections with various parts of the hearing apparatus. While these connections are being made, modification and discrimination of the auditory patterns continue. At this stage, information from other modalities, particularly from the reticular system, is available for organizing and focusing on the perceptive field of audition. All the systems that are relevant for perception in general pool their resources in arriving at a comprehensive auditory perception of speech and other events. At this phase, short-term memory (use of retained auditory patterns from earlier inputs) is utilized. Analysis by synthesis of various kinds of information is the highlight of the auditory perception process in the third phase. In linguistic terms, ordering and sequencing of syllables, words, phrases and sentences are achieved in this phase.

Findings in the study of auditory perception more or less dominate approaches to the study of speech perception by the hearing impaired. With advances in the techniques to measure the acoustic qualities, and with the availability of sensitive electronic devices, greater emphasis is now laid upon the study of auditory perception of speech for practical ends. Generally speaking, in the study of auditory speech perception, audiograms, analysis of the acoustic correlates of speech and experiments on speech perception play a very important role.

The acoustic cues for the recognition of speech are sought generally in terms of phonetic discrimination based on underlying acoustic configurations for each of the phonemes and for their combinations. Acoustic cues emanate from the source (breath and voice) and signal (vocal tract) characteristics. The speech process may be viewed as a series of basic changes in the flow patterns (and resultant acoustic characteristics) of the breath stream, and modulations in the form of secondary shifts within and among the frequency, intensity and time parameters of sound. There are four major entities noticed in the formation of acoustic cues. These are the soundless hiatus, periodic sound, frictional noise made when structures along the vocal channel are brought close, and impulse noises made as a result of complete momentary stoppages of the air stream.

The acoustic cues for recognition of the voice of an individual revolve around the fundamental frequency of the voice, its intensity and the voice quality. Since the total range of fundamental frequencies encountered in speech of men, women and children put together extends from 60 Hz to about 500 Hz (Fery, 1973), the hearing impaired child should have hearing in the range of 60 to 500 Hz in order to detect the presence or absence of voice, and to process acoustic cues to perceive sounds and their combinations.

Next come the acoustic cues for the discrimination of consonants and vowels. There are certain differences noticed between vowels and consonants in terms of energy concentration. The distinction between vowels and consonants is generally agreed to be made. So, we may try to identify the acoustic cues for a distinction between the vowels of the language to which the hearing impaired are exposed. In Kannada, the following values may be identified, as an extrapolation of the data provided in Rajapurohit, 1982.

TABLE 1 : Relative duration in milliseconds of vowels in Kannada in the initial, medial and final positions, given in descending order (adapted from Rajapurohit, 1982).

Initial Medial Final
o: (196.66) u: (168.00) i: (138.16)
Ə: (194.00) a: (157.80) a: (138.06)
a: (169.50) e: (151.16) e (118.85)
u: (150.00) o: (146.22) u ( 84.98)
i: (132.00) i: (136.41) i ( 80.81)
e (114.00) o ( 84.00) a ( 68.54)
o ( 98.00) e ( 83.16)
Ə ( 75.14) a ( 71.84)
i ( 75.00) Ə ( 64.08)
a ( 67.13) i ( 60.08)
u ( 64.73) u ( 58.05)

TABLE 2 : Relative intensity of Kannada vowels in millimeters in the initial, medial and final positions, given in descending order (adapted from Rajapurohit, 1982). Note that intensity is normally measured in watts or computed in decibels. On the basis of the oscillograms used by Rajapurohit, 1982, neither of these measurements was possible; measurement, here, is offered in terms of the height of the highest period or peak from the central line, on the positive side, in millimeters.

Initial    Medial     Final
Ə: (9.5)   e: (9.5)   i: (9.0)
o  (9.0)   i  (9.0)   e  (8.5)
e  (8.5)   i: (8.5)   a: (8.5)
a  (8.5)   e  (8.5)   i  (7.0)
a: (8.5)   a: (8.5)   u  (3.0)
i: (7.0)   o  (8.0)
o: (7.0)   a  (8.0)
i  (6.0)   u: (8.0)
u: (6.0)   Ə  (7.5)
u  (5.5)   o: (7.0)
Ə  (5.0)   u  (7.0)

TABLE 3 : Formant frequencies of vowels of Kannada in Hz (adapted from Rajapurohit, 1982).

Vowel F1 F2 F3
i 280 2200 2800
i: 305 2300 3100
e 400 2000 3500
e: 500 2200 3500
a 750 1500 2500
a: 800 1400 2600
Ə 500 1200 2500
Ə: 500 1200 2500
o 500 1000 2200
o: 600 1000 2500
u 260 460 1000
u: 300 500 1000

The acoustic cues for individual and combined sounds are described in terms of formants – the resonant bands – as seen in a spectrogram. The range of frequencies is clustered around a central peak frequency or pole. On each side of the pole, the formant slopes off to zero intensity. It is well known that the formant frequency structure is important for the discrimination between vowels. In isolation, most vowels are identified if both the first and second formants are heard (Liberman, Delattre, Cooper and Gerstman, 1954). If F3 is also heard, it improves the identification of high front vowels. From the formant frequency values obtained for Kannada vowels (based on Rajapurohit, 1982), the following hypotheses can be made.

When we consider only the durational aspects of sounds, we find:

i) Distinction of length is well established in the initial position among vowels
ii) However, there is no contrast between short and long e in the initial position. Only the longer counterpart e: is found occurring.
iii) With a residual hearing of 64 msecs. to 197 msecs., a subject should hear all the Kannada vowels in the initial position.

Thus, as regards the vowels in the initial position in Kannada, the hearing impaired may be expected to have difficulty with regard to distinguishing length as a significant marker. Since there is no contrast between e and e: in the initial position (according to the analysis adopted here), the hearing impaired may have difficulty in the perception of the mid vowel e or e:. If the residual hearing is distorted (that is, if the residual hearing is not within 64 and 197 msecs.), we should expect the hearing impaired to have difficulty with the perception of Kannada vowels; which vowels are affected may be determined based on the available range of msecs. in the residual hearing of the hearing impaired. It may also be noted that the durational aspects clearly maintain a distinction, in msecs., between the group of short vowels and the group of long vowels.

Thus if a subject has a residual hearing upto 76 msecs. (75.14 msecs. of Ə being the maximum), he is likely to perceive all the short vowels. The vowels lend themselves to be grouped distinctly under separate groups of short and long vowels, but there is the possibility that only some short and long vowels to the exclusion of others are perceived.
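For illustration only, the durational reasoning above can be summarized in a short Python sketch; this is not part of the original study. The values are the initial-position durations copied from Table 1, with "@" standing for the vowel transcribed Ə, and the rule that a vowel is perceivable whenever its duration falls within the durational range assumed to be available to a subject is a simplification of the argument, not a clinical procedure.

    # Illustrative sketch only: initial-position vowel durations from Table 1.
    # "@" stands for the vowel transcribed as Ə in the text.
    INITIAL_DURATION_MS = {
        "o:": 196.66, "@:": 194.00, "a:": 169.50, "u:": 150.00, "i:": 132.00,
        "e": 114.00, "o": 98.00, "@": 75.14, "i": 75.00, "a": 67.13, "u": 64.73,
    }

    def perceivable_by_duration(durations, available_ms):
        """Return the vowels whose relative duration falls within a
        hypothetical durational range available to a subject."""
        return sorted(v for v, ms in durations.items() if ms <= available_ms)

    # With roughly 76 msecs. available, only the short vowels fall within the range
    # (cf. the text above); with about 197 msecs., all eleven initial vowels do.
    print(perceivable_by_duration(INITIAL_DURATION_MS, 76))
    print(perceivable_by_duration(INITIAL_DURATION_MS, 197))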

A similar pattern emerges as regards the acoustic cues of vowels in the medial position.

i) Distinction of length is well established in the medial position among the vowels.
ii) However, there is no contrast between Ə and Ə:. Note only the shorter counterpart Ə is found occurring.
iii) With a residual hearing of 58 msecs. to 168 msecs. (168 msecs. of u: being the maximum), a subject should hear all the vowel sounds in the medial position. Distortion in this range reveals some hearing impairment or other, with regard to some vowels.

The durational aspects clearly maintain a distinction between short and long vowels. However, a higher residual hearing is demanded to cover all the short vowels in the medial position (84 msecs. of o being the maximum). But a much lower residual hearing (58.05 msecs. of u) would help perceive a speech sound in the medial position in Kannada. In other words, there is a difference between the low levels of residual hearing demanded for the perception of Kannada vowel sounds when they occur in the initial and in the medial position. In general, all the Kannada vowels, while occurring in the medial position, have lower msecs. of duration in comparison to their occurrence in the initial position. Thus, for the perception of Kannada vowels occurring in the medial position, a subject requires a lower (durational) residual hearing than for the vowels that occur in the initial position.

In the word final position, only a few vowels occur. Two vowels, namely, e: and u: are not attested. The value for e, however, is much higher than the value for e in the initial and medial positions, and thus e has almost the value of longer vowels, whereas in the case of u and u: it is the shorter vowel that is attested. In the word final position, contrast between only two vowels (i/i: ; a/a:) is found. With a residual hearing of 68.54 msecs. to 138.16 msecs., a subject should be able to hear all the vowels that occur in the final position in Kannada. With a residual hearing upto 118.85 msecs. a subject will hear all the short vowels of Kannada that occur in word final position.

In order to hear the F1, F2 and F3 values, a frequency range of hearing between 260 Hz [F1 of u] and 3500 Hz [F3 of e and e:] should be present. With residual hearing upto 2300 Hz, the first two formants of all the vowels can be heard. Residual hearing upto 800 Hz is necessary for F1 of all vowels to be heard. Where vowels have similar formant (F1) values, visual cues should be made use of. For example, Ə, Ə:, o and e: all have similar F1 values (500 Hz). Lip rounding can be visualized for o but not for e:, Ə, and Ə:. e: differs from Ə and Ə: in terms of tongue position. Further, durational cues help to discriminate between Ə (75.14 msecs.) and Ə: (194.00 msecs., in the initial position).
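For illustration, the formant-based reasoning can also be put in the form of a small Python sketch, again not part of the original study. The formant values are those of Table 3; the assumption that a formant is available whenever it lies at or below the upper frequency limit of a subject's residual hearing is the simplification used in the text, not a measured result.

    # Illustrative sketch only: formant values (in Hz) copied from Table 3;
    # "@" stands for the vowel transcribed as Ə in the text.
    FORMANTS_HZ = {
        "i": (280, 2200, 2800), "i:": (305, 2300, 3100),
        "e": (400, 2000, 3500), "e:": (500, 2200, 3500),
        "a": (750, 1500, 2500), "a:": (800, 1400, 2600),
        "@": (500, 1200, 2500), "@:": (500, 1200, 2500),
        "o": (500, 1000, 2200), "o:": (600, 1000, 2500),
        "u": (260, 460, 1000),  "u:": (300, 500, 1000),
    }

    def audible_formants(upper_limit_hz):
        """For each vowel, list which of F1-F3 lie at or below a hypothetical
        upper frequency limit of residual hearing."""
        return {vowel: [f"F{i}" for i, hz in enumerate(formants, start=1)
                        if hz <= upper_limit_hz]
                for vowel, formants in FORMANTS_HZ.items()}

    # Hearing up to 800 Hz covers the F1 of every vowel (for u and u: even F2 falls
    # in this range); hearing up to 2300 Hz covers the first two formants of all vowels.
    print(audible_formants(800))
    print(audible_formants(2300))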

In addition, the vowels vary in intensity. As can be seen from Table-2, there is a range of intensity for vowels in Kannada. u is most often the soft vowel while Ə:, e: and i: are loud vowels. It can further be seen that the intensity of the vowels varies with respect to their positions in the word. The acoustic cues for the diphthongs depend on the component vowels. Position of occurrence of a vowel or a consonant is of great consequence for the hearing impaired in the correct perception of sounds uttered.

A combination of acoustic cues often serves to discriminate between the consonants. These include the transitions, invariant energy, variant energy, intensity of the consonants, and their durational variations. For English consonants, Fletcher (1970) has listed these contrasts. In Kannada and other Indian languages, such contrasts are yet to be determined. However, it is possible to hierarchically arrange the consonants in Kannada (in a specified context) with respect to intensity, as can be seen from Table-4 below.

Context (column): 1 CV; 2 VCV; 3 C1 of C1C2; 4 C2 of C1C2; 5 C1 of C1 # C2; 6 C2 of C1 # C2; 7 C1C1; 8 Ch; 9 Aspiration
Š(6.0)y(8.0)v(8.0)š (6.5)y(7.5)y(8.0)I(7.5)c̣̣ (5.0)ḍ̣(7.0)
c̣̣*(6.0)ḷ̣ (8.0)ḷ̣ (8.0)v (6.0)n (5.0)l (7.0)y (7.0) j (4.0)d (7.0)
v (6.0)ṇ̣ (8.0) y(6.5)ṣ̣(6.0)l (4.5)n (6.0)v (6.0)------
r (4.0)v (6.0) š(6.5)c̣̣(5.0)ṇ̣(4.0)m(5.0)l(6.0)---b (5.0)
n (4.0)l (6.0) m(6.0)I(5.0)m(3.5)ṇ̣(5.0)n(6.0)---c̣̣(5.0)
l(3.5)m(6.0) ṇ̣(6.0)ṇ̣(5.0)j̣̣(3.0)c̣̣(4.0)m(6.0)---j̣̣(4.0)
y(3.0)š(5.5) ṇ̣(5.0)y(4.5)---j̣̣(2.0)c̣̣(5.5) ------
J*(2.5)n(5.5) l(5.0)ḷ̣(4.5)r(1.0)r(1.0)n(5.0)---k(3.0)
m(2.5)ṣ̣(5.0) ṣ̣(5.0)n(4.5)------j̣̣(3.0)---g(3.0)
s(2.5)j̣̣*(5.0) n(3.5)r(4.0)------------ p(3.0)
h(0.5)c̣̣*(4.5) c̣̣(3.5)m(3.0)------------ t(2.0)
---s(2.5)j̣̣(3.0)s(3.0)------------t(1.5)
---h(2.0)s(3.0) j̣̣(3.0)---------------
---r(1.0)h(3.0) ------------------
------r(1.5)--------- ---------

* Affricate

In the CV context, š is the loudest consonant and h is the softest. In the VCV context, y is the loudest and r is the softest. In the C1C2 context, v and š are the loudest, and r and j̣ are the softest consonants. In the C1 # C2 context, y is the loudest and j̣ and r are the softest. In the C1C1 context, l is the loudest and j̣ is the softest. In the Ch context, c̣ is the loudest and j̣ is the softest, whereas among the fully aspirated sounds d is the loudest and t is the softest. Note that non-stop and non-nasal consonant sounds are both the loudest and the softest in Kannada.
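As a small illustration, and not as part of the original study, the CV column of Table 4 can be transcribed and ranked in Python; the ASCII spellings and this reading of the column are assumptions made for the example.

    # Illustrative sketch only: the CV-context column of Table 4, transcribed in
    # ASCII ("sh" for š, "c" and "j" for the affricates); the values are the
    # relative intensities in millimetres quoted in the source.
    CV_INTENSITY = {
        "sh": 6.0, "c": 6.0, "v": 6.0, "r": 4.0, "n": 4.0, "l": 3.5,
        "y": 3.0, "j": 2.5, "m": 2.5, "s": 2.5, "h": 0.5,
    }

    # Sorting by intensity reproduces the ranking discussed above:
    # š is the loudest consonant in the CV context and h is the softest.
    ranked = sorted(CV_INTENSITY.items(), key=lambda kv: kv[1], reverse=True)
    print("loudest:", ranked[0])
    print("softest:", ranked[-1])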

In general, the following comments may be offered. The peak of noise energy for the noise burst in the plosives is located between 600 and 800 Hz for the labials, in the region of 1800 to 2000 Hz for the velars, and at about 4000 Hz for the alveolars. All these values are for English consonants. As regards the fricatives, the noise energy generally extends from 1800-2000 Hz to about 6000 Hz and beyond for some of the fricatives. Many severely hearing impaired children find all the fricative noises inaudible. The spectrograms of the affricates are essentially the same as those of a stop followed by a fricative. The cues associated with the nasal consonants include a low frequency formant between 250 and 300 Hz. The nasals are seldom confused by the hearing impaired, because both variant and invariant cues associated with the nasal consonants are present in the low frequency range. The acoustic characteristics of the glides are also described in the context of the following vowels. The independent status of glides and semi-vowels is to be decided in conjunction with their status as independent graphemes in the script. At the spoken level, the appropriate use and recognition of glides is mainly a sociolinguistic device. For the sheer conveyance of information, the glides do not seem to play a role.

So far, we have reviewed the acoustic characteristics of the consonants. The listener, by paying attention to selected acoustic features of consonants, discriminates between the consonants. Some of these features are available to the hearing impaired and some are not. In any case, the availability of these sounds is ultimately dependent on the residual hearing.

By bearing in mind the frequency range of the sounds of the language of the hearing impaired, one can grossly arrive at the features accessible to a particular child. In this, audiometric tests help us decide the residual hearing levels in the hearing impaired subjects, even as they point to the various norms of adequacy of hearing for speech in the normal hearing subjects. Audiometry helps locate the hearing impairment conditions in relation to the middle, outer and inner ear, the cochlea, and the auditory neural pathway. The audiogram also plays a useful role, although its usefulness for correctly characterizing speech perception is limited. Note that, while an audiogram plots the frequency and intensity of the detection of tones, speech itself and the processes of speech perception constitute a complex phenomenon with a rapidly changing acoustic stimulus. Note, however, that in the absence of elaborate descriptions of acoustic cues for Indian languages, such as the one found in Fletcher (1970) for English, audiologists are forced to interpret the audiograms based on findings on European languages only.

In so far as Kannada and other major Indian languages are concerned, the following acoustic values have to be identified: vowels as opposed to consonants; within vowels, the opposition between long and short vowels, rounded and unrounded vowels, and high, mid and low vowels; within consonants, stops (voiceless, voiced, aspirated and unaspirated), affricates, nasals, fricatives, laterals and trills. The consonants are to be viewed in terms of bilabial, labiodental, dental, alveolar, retroflex, palatal, velar, glottal and uvular sounds. As there are some major tonal languages, the investigation of the tones is also necessary. By working out frequencies of each of the significant units in the normal hearing population and comparing the same with the perception values of the hearing impaired, an acoustic picture of the hearing impaired can be acquired.

This picture, however, is inadequate for at least two reasons. One reason is that only frequency ranges for individual sounds, and not their combinations as in actual speech, would be available. Secondly, even if certain frequencies are available, it may be that the configurations of these frequencies made by the hearing impaired are different from those of the normal language. And yet interpreting the audiogram, etc., on the basis of acoustic values identified for the sounds of the normal language of the hearing impaired would be a good beginning step.

Ling (1976) and others have specified the residual hearing essential to determine various auditory stimulus categories. The idea is to identify the essential residual hearing for specific sounds in acoustic measurements, and compare this with the residual hearing available to a hearing impaired individual. Once the residual hearing of a hearing impaired person is measured in acoustic terms, it is assumed that, generally speaking, the sounds that fall within this residual hearing are available to the hearing impaired individual. Describing the auditory speech reception in the hearing impaired, Ling (1976) suggests that plosives (of English) are audible to children with only low frequency residual hearing; that is, even if a child has only low frequency residual hearing, he is able to hear the stops. Also the nasals in English, he finds, require only low frequency residual hearing. However, if the child does not hear higher frequencies, then the nasals are not identified and discriminated among themselves and from other consonants. For example, discrimination between m and n becomes possible only with hearing upto 1000 Hz. As regards semi-vowels, low frequency residual hearing is sufficient. On the other hand, liquids are less likely to be audible to children with only low frequency residual hearing. Fricatives are more often inaudible to many hearing impaired children. The sibilants s and z are likely to be audible only if hearing extends upto and beyond 3000 Hz. With hearing only upto 3000 Hz, the presence of the fricative is determined in the context of back and central vowels. Amplification beyond 3500 Hz facilitates their reception. However, affricates require hearing only upto 1000 Hz.

The voiced-voiceless distinction varies from consonant category to consonant category. For example, the voiced-voiceless distinction in plosives is easy to achieve with residual hearing present below 500 Hz, whereas the distinction is achieved in the fricatives if hearing extends beyond 500 Hz. Both manner and place of articulation distinctions are all available with hearing upto 4000 Hz, whereas with hearing upto 1000 Hz the manner of articulation as regards fricatives, affricates and plosives is not available. If hearing extends only upto 2000 Hz, the place of articulation of a consonant sound is to be perceived in the context of central and back vowels. If hearing is below 1000 Hz the place of articulation of a consonant sound is to be generally perceived with back vowels. Note also that voicing reception varies from one group of sounds to another group of sounds. That is, reception of voice differs based on the manner by which a group of sounds is produced. Also reception of one and the same sound varies in terms of its position in relation to the vowels around. For example, if hearing extends only upto 2000 Hz, many a cue not available in the context of front vowels is possible in the context of central and back vowels.

As regards the place of articulation, those hearing impaired children with residual hearing upto 4000 Hz will have relatively complete information on place of articulation (Ling, 1976). The vowel environment may also influence the cues for place of articulation. Many cues on place of articulation which are unavailable in the context of front vowels if hearing extends to only 2000 Hz, are audible in the context of back and central vowels. In the same way, many more cues on place of production are available below 1000 Hz in the context of back vowels rather than in front vowels. Only one unambiguous cue is available under 500 Hz, namely, the burst associated with the bilabial plosives in a back vowel context.
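The Ling-style frequency requirements summarised in the preceding paragraphs can be tabulated, again only as an illustration; the exact Hz figures assigned below to "low frequency residual hearing" are assumptions of the example, not Ling's own numbers.

    # Illustrative sketch only: approximate upper-frequency requirements quoted or
    # implied in the text for English. Where the text says only "low frequency
    # residual hearing", 500 Hz is assumed here for the sake of the example.
    APPROX_THRESHOLDS_HZ = {
        "detection of plosives, nasals and semi-vowels": 500,   # assumed "low frequency"
        "voiced-voiceless contrast in plosives": 500,
        "voiced-voiceless contrast in fricatives": 1000,        # "beyond 500 Hz"
        "discrimination of m from n": 1000,
        "reception of affricates": 1000,
        "sibilants s and z": 3000,
        "relatively complete place-of-articulation cues": 4000,
    }

    def expected_available(upper_limit_hz):
        """List the distinctions whose quoted or assumed threshold lies within a
        subject's hypothetical upper limit of residual hearing."""
        return [d for d, hz in APPROX_THRESHOLDS_HZ.items() if hz <= upper_limit_hz]

    for limit in (500, 1000, 4000):
        print(limit, "Hz:", expected_available(limit))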

Ling (1978) has worked out a five sound test. The ability to detect all five of these speech sounds, u, a, i, s and š, demonstrates, for him, the ability to detect all aspects of speech sounds (in English). While this list is indeed highly useful for hearing aid selection and also for checking the hearing aid itself, one should bear in mind that speech perception is not bound solely to acoustic values. Secondly, it is possible that, at least in so far as Indian languages are concerned, some more crucial additions such as length of vowels, aspiration and retroflexion should also be included within this range.

There are several questions such as the following to be raised and answered before one arrives at a comprehensive list of sensitive acoustic indices for each language: To what extent the reception of sounds in the hearing impaired is affected also by cultural facts, particularly in the partially or moderately deaf of all ages; to what extent the perception of speech sounds is affected also by patterns of distribution of sounds in a specific language; what is the nature of the inability even among the normal hearing to discriminate between sounds in a language; free variation of sounds; ability to discriminate but inability to produce some sounds; ability to produce but inability to perceive some sounds; dialectal influences and the prevalent norm and fashion in the production and perception of sounds; and so on. Even as we look into perception based only on motor and/or auditory perception of sounds, we have to look for application of these findings at higher levels of language. For example, word recognition involves more than recognition of stretches of sensations with hiatus. We found in our study that the profoundly deaf subjects were able to produce a semblance of intonation patterns, particularly at the end of an utterance, but this in no way improved the number of words recognized and used by them. While we found it possible to make the profoundly deaf subjects recognize some selected sounds, it was not possible for us to help them increase the number of words they recognized and used. In other words, the study of speech perception of the hearing impaired has to be based not only on the acoustic values but also on other factors, both linguistic and those specific to hearing impairment as a process. Note also that the hypothesis that the relative perceptual salience of certain aspects of the acoustic signal can be predicted by their relative acoustic energy is doubted in several quarters (Goldstein, et al. 1976). All these, however, do not diminish, in any manner, the immense practical value of the approaches to the speech perception of the deaf based on acoustic studies. They only point out the need to supplement the results of these studies.

1.4. Perceptual Aspects of Acoustic Cues

The acoustic analysis of speech sounds helps us in understanding the probable difficulty associated with a specified type of hearing loss. However, in order to arrive at a comprehensive picture of the hearing loss, the perceptual experiments on acoustic cues, utilizing the hearing impaired subjects are also important. The results of such perceptual experiments give us an idea of how acoustic cues are perceived. The results help us also to develop better methods of auditory training for the hearing impaired and to develop suitable aids for them. Such studies help us also to understand better the relationship between auditory experience and speech production.

The perceptual experiments on acoustic cues have used both natural and synthetic speech as their domains of research. Hearing loss is superimposed on normal hearing subjects through masking with noise and through the filtering of speech. These experiments have focused on the various acoustic features of vowels, consonants and prosodic features and on various components of the acoustic features. The bandwidths, frequencies, intensities, formants and so on have been interfered with in these experiments to find out the indispensable values in each of the above for a correct perception of a particular sound. The relationship between the place of articulation and the manner of articulation in perception is also investigated. The perceptual hierarchies between various places of articulation, and the perceptual hierarchies between various types of manner of articulation, are also studied.

We present below some of the salient findings of the perceptual experiments on acoustic cues which are generally accepted in the field. These findings are based on experiments which used the linguistic variables of the English language. Perceptual studies on acoustic cues utilizing linguistic variables of Indian languages are not available to the authors.

1) The phonetic features used by deaf listeners to perceive and discriminate speech sounds are the same as those used by normal hearing listeners.
2) Voicing and nasality are much less affected by masking noise than are the other features.
3) Under all conditions, features of place of articulation of the consonants are more poorly discriminated than the voicing and manner features.
4) The features of place of articulation are found more difficult to perceive. Note that perceived manner is seen to affect perceived place. The processing of the place feature depends upon the value the listener assigns to the manner features, rather than directly on any acoustic cues to manner.
5) The voicing features are discriminated better than the manner features.
6) For the consonants, the place of articulation was poorly perceived by the hearing impaired sensorineural listeners.
7) Initial consonants are perceived better than final consonants by hearing impaired listeners.
8) Vowel discrimination is better than consonant discrimination for severely impaired sensorineural listeners.
9) The sensorineural listeners among the hearing impaired, in general, discriminate the back vowels better than the front vowels.
10) Reports of perceptual experiments on suprasegmentals in hearing impaired listeners are not available to the authors. However, the presence of hearing in the range of 100 Hz to 500 Hz is said to be sufficient to perceive the intonation patterns (Ling, 1976). This range is said to encompass the range of fundamental frequencies of male and female adults and children.
11) The reception and production of prosodic features is poor in the deaf. The highest error rates occur for intonation.
12) The perception of temporal duration in the hearing impaired is different from that in normals.
13) Preceding consonants, voiced and voiceless, have an influence on the fundamental frequency of vowels. Final consonants show no regular effect on the fundamental frequency of the preceding vowels.
14) The mildly impaired listeners (with flat audiograms) perceive both the mid-frequency patterns and the low frequency patterns equally well.
15) The sensorineural listeners have a better perception of low frequency speech patterns, such as voicing, nasal murmurs and the first formants of vowels.
16) Low frequency patterns are better perceived in a wide range of degrees and types of sensorineural impairment.
17) The fundamental frequency variations are more closely related to the perception of both stress and intonation.
18) The higher fundamentals, longer durations, and greater intensity of syllables influence the perception of stress.
19) The subjects with severe to profound hearing impairment are found to have difficulty in discriminating the second formant region of vowel sounds. The first formant position and duration of vowels are well discriminated by all (except by the more profound cases).
20) Auditory perception could be different at different levels of language structure. While independent sounds may be more easily perceived, their combinatory operations in the form of phonemes, syllables, words, sentences and discourse could be perceived differently.

A survey of studies on perceptual acoustic cues suggests that the residual hearing, both in terms of intensity and frequency range, influences reception of speech. The studies also indicate that the voicing and nasality features are least affected and the place features are most affected in typical sensorineural listeners. In typical sensorineural impairment, hearing loss increases with increase in frequency. Thus, place features are often not received through audition by the profoundly hearing impaired. One can compensate to some extent by using hearing aids that provide amplification in the high frequency range. But children may have low levels of tolerance for the high frequency amplification that would make place cues audible to the child. Speaking loudly also will not be fruitful. This is because, when one speaks loudly or quietly, the mid and high frequency place cues are produced at much the same intensity. By talking quietly and also close to the microphone of the hearing aid, one can emphasize place cues (Ling, 1976). Thus, according to Ling, most children with hearing impairment having residual hearing in the low frequencies can hear the acoustic cues of vocal duration, vocal intensity, vocal pitch, the F1 of all vowels, and consonant manner distinctions. Therefore, manner cues are more frequently accessible to the hearing impaired than the place cues. Vision is often used by these children to supplement audition and compensate for the missing place cues.

In our present investigation, we found that the difference in quality between a vowel and a consonant is generally always recognized. The vowels are confused only among themselves even as the consonants are confused only among themselves. We also found that the deaf subjects had greater difficulty in identifying consonants. Almost three-fourths of the total number of confusions were found within the group of consonants. The Kannada y, a semi-vowel, is always confused with other consonants, and not with i. All our subjects, irrespective of the degree of severity of hearing loss, showed confusion between a voiced consonant and its unvoiced counterpart. Our subjects did not perceive the aspirated sounds at all. Often our subjects identified individual sounds not directly as the particular sound, but through a combination of several traces which pointed to a group, rather than a single sound. A significant feature that emerged in our data was that often the deaf subjects identified the sounds as forming separate bundles – they tended to put together the sounds which resemble one another in some crucial variable. For example, all the stops were treated more or less as an independent group. Any substitution of a stop consonant was made using another stop consonant, and not by using a fricative, etc. This could not be clearly stated, in our data, for some sounds, namely, the sibilants and the glottal fricative h. Sibilants were found to be substituted by stops, and other fricatives, without reference to both place and manner of articulation. While the bilabial p was correctly used, more or less on all occasions, we noticed a confusion between the velar and dental voiceless stops k and t. There was also confusion noticed between c, a palatal affricate, and ṭ/ḍ, a retroflex stop; presumably, the overall place of articulation was identified, but the manner could not be controlled. Another important feature that we noticed was the lack of a pattern congruity between the voiceless and voiced counterparts of the stop series, a contrast phonemically maintained in the normal hearing population. There were two types of confusions noticed, one in which there was no consistency whatsoever in the recognition of voiced stop consonants as forming part of the phonemes of the speech, and the other in which the voiced stop consonants were used indiscriminately as substitutions of voiceless stops. In other words, the clearly marked distinction between voiced and unvoiced stops in the normal hearing Kannada subjects is not maintained in the speech perception of the hearing impaired.

In the case of the laterals and the nasals, a distinction between the two is maintained, but there are frequent confusions between the two. Also, there was confusion noticed between the sounds within the lateral group. That is, the alveolar versus retroflex distinction in laterals maintained by the normal hearing is not consistently maintained in the speech perception of the hearing impaired. Likewise, the hearing impaired showed some confusion between the five nasals kept apart in the speech perception of the normal hearing population. However, in most cases, in the initial position, the bilabial nasal was correctly perceived. The alveolar, retroflex and palatal nasals were more frequently confused with the bilabial and less frequently with the velar nasal, which is also correctly perceived in many cases. The trill is often wrongly perceived and is substituted mostly by some voiceless stop consonant, t being the more frequently used among them.

Among the vowels, a is hardly ever perceived wrongly; whereas there is always confusion between the two front vowels i and e. This consistency cannot be generalized to u and o, the two rounded back vowels in Kannada. Sometimes u is substituted by a and some other times by i. Similar is the case with o which is more frequently substituted by a.

One of the problems that we noticed in all our subjects was their inability to maintain normal gaps in the flow of speech, between words and between syllables of a single word. While this was the case when they uttered a stretch of utterance, we noticed a wider gap between stretches of utterances than is found between the stretches of utterances of the normal hearing. Thus the temporal progression, or rather the construction of the temporal progression, of the words and utterances clearly marks the deaf speech as deviant.

1.5. Visual Speech Reception

Perception of speech via vision is generally through the process of speech reading. Speech reading is the process by which speech is understood by carefully decoding the shapes and movements generally of lips. It is commonly referred to as lip reading. In academic discussions, the term speech reading is preferred because it is not only the lips but also the movements of the tongue, jaw, face and throat, in addition to the communicational context, that play a part in visual reception of speech.

Speech reading in combination with sign or manual language may be the only form available to the profoundly hearing impaired children with no residual hearing. In India, most of the severely and profoundly hearing impaired rely on the visual modality alone for speech reception, since speech and hearing training facilities are not readily accessible to them. Moreover, the cost of hearing aids is too high for most of the population.

Severe constraints are faced by the hearing impaired when they resort to speech reading. The speech sounds are distinct when received through the ear; when speech reading is adopted, this distinction has to be perceived from the place of articulation. This leads to various problems. Many sounds of a language are produced at the same place of articulation. In Kannada, there are five bilabial phonemes p, ph, b, bh and m; there are five velar phonemes k, kh, g, gh and ṅ; there are five dental phonemes t, d, th, dh and n; there are five palatal phonemes c, ch, j, jh and ñ; and there are five retroflex phonemes ṭ, ḍ, ṭh, ḍh and ṇ. Therefore, the hearing impaired children face the problem of distinguishing between the phonemes with a similar place of articulation; they also face the problem of distinguishing between the phonemes on the basis of manner of articulation. In addition, the visual patterns associated with various consonants range from clearly visible positions to ambiguous positions to invisible positions. For instance, while the positions for the bilabial and labiodental sounds are easily visible, the positions for the velar and post-velar sounds are barely visible.
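The homophenous groupings listed above lend themselves to a simple illustration in Python; the sketch is not part of the original study, and the ASCII transliterations are assumptions made for the example.

    # Illustrative sketch only: Kannada phonemes grouped by place of articulation,
    # as listed in the text. Capital letters stand for the retroflex consonants,
    # "nY" for the palatal nasal and "nG" for the velar nasal.
    PLACE_GROUPS = {
        "bilabial":  ["p", "ph", "b", "bh", "m"],
        "dental":    ["t", "th", "d", "dh", "n"],
        "retroflex": ["T", "Th", "D", "Dh", "N"],
        "palatal":   ["c", "ch", "j", "jh", "nY"],
        "velar":     ["k", "kh", "g", "gh", "nG"],
    }

    def visually_confusable_with(phoneme):
        """Return the place of articulation of a phoneme together with the other
        phonemes of the same place, i.e. those that speech reading alone cannot
        separate from it."""
        for place, members in PLACE_GROUPS.items():
            if phoneme in members:
                return place, [p for p in members if p != phoneme]
        return None, []

    # For example, p shares its visible articulation with ph, b, bh and m.
    print(visually_confusable_with("p"))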

Speech reading is no substitute for real language, nor is it a useful stepping stone for the acquisition of real language. Speech reading does not serve as a speech feedback mechanism. Speech reading raises several questions with regard to the interrelationship between modalities. How is the oral language encoded in the form of visual images, articulatory movements or vibratory patterns? How is the message received through lip reading transformed into language and speech? Speech reading also raises language specific questions. Are there culture-bound traits in speech reading? What are the influences of the dialect on speech reading? Are there universal traits in speech reading? What are the specific language constraints faced by the hearing impaired in speech reading in a particular language context? Can there be a hierarchy of elements – from the easily retrievable to the more difficult? How are the combinations of phonemes retrieved? Do the hearing impaired engaged in speech reading retrieve speech through individual sounds, phonemes, or as combinations of phonemes/combinations of sounds? Or do the hearing impaired perceive speech via speech reading in terms of larger units such as syllables, words or phrases? How are the visible traits of the utterance retrieved through speech reading – such as the initial phoneme/syllable – related to the full form in the normal language?

Following the Bloomfieldian and neo-Bloomfieldian models of linguistics, studies on visual perception of speech have attempted generally to identify the visual reception characteristics for individual segmental sounds in various positions of their occurrence and for supra-segmentals. Attempts have been made to group the phonemes of a language into contrastive units based on visibility. The visibility of consonants and the distinctness of consonants of English and other European languages have been studied extensively. Several studies have identified the visemes (compare this notion with the notion of phoneme in normal language description). These include Woodward and Barber (1960).

These studies aimed at ranking various speech sounds in normal language in terms of a hierarchy of visibility – from the highest to the lowest visibility. Note, however, that it was not possible for any of these studies to present an absolute ranking order, sound by sound, for each sound. The maximum that these studies could do was to arrive at a hierarchy for groups of sounds. For example, all labiodental sounds available in the language were found grouped as a single group; likewise all bilabial sounds in the language were also found to fall into another group, and so on. That is, the investigators found that the sounds fall into various groups in terms of their potential for visibility and that all the members within a group had, more or less, the same potential for visibility. In other words, a speech reader has to find means to decide which of the sounds falling within the bilabial group is indeed the one uttered by the speaker, at the moment of the speech reader’s reading an utterance. Another feature was that while there were some agreements between the investigators cited above as regards the visibility potential of certain groups of sounds and the consequent hierarchy – for example, the bilabials were seen by all as having the highest potential for visibility – there were no agreements among the investigators for other groups of ‘homophonous’ sounds. In essence, there is some disagreement as to the relative hierarchy of various groups of sounds with regard to their potential for visibility. The hearing impaired persons usually perceive speech by watching the speaker’s face. They do it even while they listen through a hearing aid. Normal hearing persons also rely on visual cues, especially when they communicate in noisy or reverberant environments. It should be noted, however, that only a limited amount of information on consonantal identification is provided by the lips.

As already pointed out, most of these studies used normal hearing subjects, and videotapes for the presentation of the test material. They used tests of forced error confusions. The tests aimed at a visibility chart for the phonemes of the language. In some experiments, words of minimal pair types were presented, without any linguistic and meaningful contexts. The results of most of these experiments show that the consonants of a language can be grouped under various categories of contrastive elements. The bilabial category is found in all the experiments as distinct from the non-bilabial. The labiodental category is also seen as distinct. In general, the distinction between categories is maintained at the level of point of articulation, rather than at the level of manner of articulation. The broad contrastive categories are of an overlapping nature in the sense that in some of the contrastive categories no clear-cut distinction based on point of articulation is maintained. However, the hypothesis that the correct perception of visemes (phonemes grouped into contrastive units based on visibility) within the same category is no more than chance was tested by Lathe (1978) in the Indian language context. Two groups of homophonous categories, the bilabials p, b, m and the alveodentals t, d, n, in monosyllabic structure, were presented to 10 normal hearing adults by means of a videotape. The results of the study rejected the hypothesis. It was suggested that some cues not known at present were available to the subjects. Further exploration of cues such as pressure in lip approximation for the bilabials, tongue contact for the alveolars, the differences in the time of articulation of different sounds, and the cues available from the laryngeal region was suggested. Note that similar findings have been reported elsewhere also. The deaf subjects tend to perceive a speaker’s laryngeal vibrations and use this information as a supplement to speech reading. Schienberg (1980) also found that the ‘homophemes’ can be discriminated at a better than chance level.

The consonant clusters reveal several performance strategies in individual languages. In many languages, clusters get simplified through various processes. Some clusters are more frequent than others. They may become homorganically identical or non-identical clusters, heterogeneous clusters, or single, tense consonants. There may be double, triple or even quadruple consonant clusters. The clusters in borrowed words may or may not be retained and may have their own behavioural norms. There are certain restrictions imposed on each consonant as to which other consonants, in what sequences, and in what positions of a word, it can form a cluster with. The frequency and variety of clusters which occur in various positions in a word differ from one position to another. In many languages, the consonant clusters are produced in quick articulatory succession, and, in several others, one could notice a brief hiatus between the elements that form the cluster. Each of these facts, and several others, has behavioural consequences in the speech reading of the hearing impaired.

There is a general agreement among the investigators that the consonant clusters are perceived incorrectly most of the time. Very often consonant clusters are perceived as single consonants. It is possible to establish, only to a lesser degree in the case of the clusters than in the case of single consonants, visually contrastive categories based on articulatory positions or movements for each language. The present authors find that, unlike in the visual categorization of single consonants, the visual categorization of consonant clusters exploits also the manner of articulation.

Investigators have found that vowels tend to be confused more often with other vowels produced in neighbouring articulatory positions. While the back, lip-rounded vowels o and u are the most visible, the front, lip-unrounded vowels e and i have been identified as the most audible. Ling (1976) has suggested that because hearing impaired children rely mainly on visual cues, they often produce vowels with appropriate lip shaping but with a neutral tongue position.

In many Indian languages, a distinction between short and long vowels is generally maintained at the phonemic level. Some neutralization between a long vowel and its corresponding short vowel takes place in the word final position and, at times, in word medial positions. The distinction between short and long vowels is phonemic and as such needs to be acquired by the hearing impaired. Word initial vowels are more prominent than others. In some Indian languages, there are proper vowel clusters in the sense that all the vowels in a cluster maintain their original vowel quality. In several other Indian languages, the combination of one vowel with another may result in diphthongs. A visually contrastive hierarchy for the vowels of a particular language is to be worked out taking the above information into consideration. Because length is phonemic, lip shape as well as duration (the length of time for which a particular lip shape is held) should be considered important visual cues. Including duration as an important visual cue may also help us to relate tones in some of the Indian languages, like Punjabi. Research in these areas, however, is yet to begin.

Studies on other languages reveal two conflicting trends. Some investigators, like Ling (1976), have taken the position that since there are no visible articulatory movements corresponding to pitch and intensity, vision does not help the perception of these features. But some other studies, such as that of Berger (1972), have argued that the duration correlate may be contributing to the correct perception of stress in the hearing impaired. The present authors are of the opinion that investigators have generally tended to seek single determining visual cues for the suprasegmentals. It is through a combination of visual cues that the hearing impaired begin to decode the suprasegmentals and the words and phrases. A dynamic view of the process, with one cue influencing and triggering another, is called for. In general, the visual perception of prosodic features, in the opinion of the present authors, holds the key for the proper decoding of words and phrases. Instead of working from the consonants and vowels, it may be more appropriate to work out the visual cues for prosodic features first and then place, within these patterns of visual cues, the combinatory processes of visual cues for the consonants and vowels. A normal child acquires the prosodies first, before she acquires lower level units such as words, consonants and vowels. If the acquisition of prosodic features through the auditory modality is denied, in various degrees, to the hearing impaired, it is but natural that the young child resorts to visual cues to identify the prosodic features.

The modality of vision is already in operation for communication purposes even before language acquisition processes commence. The present authors find that hearing impaired children assign more functional value to visual cues than normal children do. The prosodic features of the auditory modality also have a closer association with visual cues in the early stages. With the commencement of oral language acquisition, the prosodic features of the auditory modality take a definite shape and play a crucial role in language acquisition. Since this is denied to the hearing impaired, in various degrees, the prosodic features of the visual mode have to be elaborated further. The literature has not focused upon this elaboration and the subsequent embedding of visual cues for words, consonants and vowels into the elaborate mosaic of language communication.

We have already argued that investigations on the inter-relationship between visual cues and individual phonemes and their combinations are not adequate. We have suggested that units larger than the above also should be brought into the scope of investigation. We have further suggested that we start with the investigation of visual cues for the prosodic features and then place within these features the features for other linguistic units. There are several investigations which have focused upon the visual cues for larger units such as words, phrases and sentences. We give below the salient findings and trends in this area of research.

1) The lip reading performance is affected by the number of words in a sentence, the number of syllables in a sentence, and the number of vowels and consonants as well as the length of the stimulus words.

2) Lip reading is not done one phoneme at a time but by semantic units. The speech reader recognizes the movement seen as something known or unknown (Clouser, 1977).

3) Lip reading of sentences is influenced by the familiarity of sentences and by the familiarity of words that constitute the sentence.

4) Clouser (1977) reports that the lip reading of sentences is not influenced by vowel consonant ratio in both hearing impaired and normal subjects.

5) Complexity of the sentences is a variable affecting lip reading. The simple kernel sentences may be less difficult than the other transformed sentences.

6) The difficulty in lip reading longer sentences of the same complexity, as opposed to shorter sentences, has been attributed to visual (iconic) memory being shorter than auditory (echoic) memory.

7) Some authors find that interrogative sentences are more difficult to speech read because, in English, they contain an unstressed verb or verb auxiliary in an unusual position. But it is also found that the attitudinal contents of intonation are easily conveyed by kinesic signals such as hand or head movements or facial expression. In the case of Kannada, we observed that interrogative sentences beginning with interrogative morphemes (wh-morphemes) are more difficult to speech read than interrogative sentences which end with the interrogative suffix. The interrogative suffix -a: is added to a declarative sentence to make it a sentence conveying interrogation:

avanu ho:gutta:ne ‘he goes’
he goes

avanu ho:gutta:na: ‘does he go?’
he goes - does he

Since an open central vowel is more easily perceived by the hearing impaired, and since there are clearly visible facial movements and expressions, the hearing impaired are perhaps able to perceive the second category of interrogative sentences more readily.

8) Only a very small percentage (less than 10%) of discourse material can be speech read by normal subjects. This shows the difficulty of speech reading discourse. Information presented in rather discrete and distinct units, in more than one modality, and in a more redundant and repetitive manner will facilitate the progress of discourse communication with the hearing impaired.

9) Appropriate discrete gestures, and even some appropriate continuous gestures in conversational situations, are factors that enhance speech reading performance. Inappropriate gestures in these contexts decrease the visual speech reception scores. The gestures used in the study of Popelka and Berger (1971) ranged from movements that were close to the speaker's lips to movements that were fairly distant from the lips, and from subtle movement to considerable motion. It was concluded that the peripheral vision of normally seeing persons allows the simultaneous and accurate perception of both types. Gestures were thought to be operating in two ways: (i) delimiting the individual word choices within the message, and (ii) in an idiomatic manner, whereby the gestures lead to the expectation of a group of words in a specific word order.

10) Familiarity of the language is an independent variable that influences the interpretation of visual cues (Albright, et al., 1973). When phonemically similar utterances from two languages are presented to a hearing impaired person, he will tend to show a superior performance with regard to the language known to him. Indian languages offer an excellent ground for research in this area. There are four different language families. Languages within a family are classified under various groups based on similarities and differences in linguistic structure and shared innovations. Tamil and Malayalam, Tamil and Kannada, and Kannada and Telugu would be excellent targets for study in this area. These languages offer phonemic similarity as well as syntactic similarity. They also offer lexical similarity. It will be interesting to see how these similarities are inter-linked in the language of the hearing impaired.

In addition to the linguistic factors discussed so far, a number of other variables interact to affect the speech reading performance. These include the speaker variables, the channel variables, and the speech reader variables. All these interact with one another to influence the speech reading performance. Some of the salient points are as follows:

1) Sex of the speaker does not affect the speech reading performance.
2) The greater the portion of the face of the speaker exposed to the speech reader the better is the speech reading performance. There is superior performance in lip reading when the whole face is exposed, in comparison to the performance when only the lips are exposed.
3) The idiosyncrasies of the mode of speaking of the speaker also act as another set of variables. There is a tendency amongst normal speakers to exaggerate their speech movements when they converse with the hearing impaired. Stone, et al. (1951) have found that normal mouth movements are preferable to exaggerated ones.
4) Rate of speech of the speaker is another variable. Reducing the rate of speech from the normal rate does not alter the performance of either 'good' or 'poor' deaf lip readers. However, a rate of speech greater than the normal rate may affect speech reading performance.
5) The way the speaker speaks is not the same in all the situations. The rate of speech may change when the speaker is emotional; it may increase under certain emotional conditions and it may decrease and be of a faltering type under several other conditions. Chewing and eating situations, wherein the mouth is full, may also influence the speed of delivery. Certain contents require slower/speedier delivery. Some may cover their mouths in many ways and speak. Still others may speak with their hands or fingers placed beneath their chin. The lip reading performance is superior when the speaker’s face is clearly seen without any obstacles.
6) Speech reading of familiar and unfamiliar speakers must be distinguished. Speech reading a familiar speaker is like the deciphering of children’s speech in early stages of acquisition by the members of the children’s family. All constructions are interpreted within a communication context, shared by the child and the members of the family. Speech reading a familiar speaker is based on accumulated visual cues already in the store of the hearing impaired. There is already a sharing of the communicative context between the speaker and the hearing impaired, whereas for the interpretation of the unfamiliar speaker, new bridges of communication are to be built.
7) In the normal hearing population, visual cues play an important role in face to face communication. We have observed in our study that when a subject is seated with body, head, and eyes oriented in the same direction, speech coming from the front is better perceived than speech coming from other directions. This is not based purely on the likelihood of directly receiving and perceiving the acoustic energy emitted, but more on the possibility of contributions by visual cues to the linguistic communication. In the case of the hearing impaired, the frontal position enhances the reception potential for acoustic energy even as it enables the hearing impaired to make approximations towards the speech uttered via visual cues. It may be pointed out that, even in the normal hearing, recognition of speech sounds in word contexts is better when presented through a combination of auditory and visual modes than through the auditory mode alone.

1.6. Tactual Speech Reception

The tactual modality helps the profoundly hearing impaired and the deaf-blind child in their speech reception. Taction also supplements visual speech reception in profoundly hearing impaired children. The presence or absence of voice, its relative duration and intensity, and the pitch of the speaker's voice are identifiable by touching the speaker's chest and/or face properly. Vibration on the bridge of the nose gives information about the nasals; information on plosives is available in the sudden release of air; the voiceless and aspirated stops are identified by relatively greater airflow than the voiced stops. The table below gives the details.

Taction with fingers, finger-tips and/or palm at: Category of speech perceived

1. Speaker's chest or face: The presence or absence of voice, its relative duration, its intensity and pitch (Ling, 1976).
2. Vibration at the side of the throat: Presence of vowels, diphthongs, and voiced consonants.
3. Hyoid depressed: Production of the velar consonants k and g.
4. Hyoid bulging: Production of the vowels i and u.
5. Emission of a pulse of air from the oral cavity: Plosives, stops, and affricates.
6. Emission of a sharp or diffuse flow of air from the oral cavity: Fricatives (sharp s; diffuse š, ṣ).
7. Jaw and lip movement: Cues for the position of the mandible and lips for various vowels, diphthongs and consonants.
8. Vibration at the side of the nose: Nasalized vowels and nasal consonants.
9. Emission of air from the nasal cavity: Nasal consonants and nasalized vowels.

Theoretically speaking, it is possible to identify the likely tactile cues for the speech sounds of a language based on the articulatory description of the speech sounds of the normal hearing. This is what we present in the following table. However, it should be borne in mind that these are only likely tactile cues and that, in reality, it is open to every hearing impaired child to devise his own tactile cues for the speech sounds, which need not be based only on the articulatory characteristics of the speech sounds. Moreover, a hearing impaired child will not develop tactile cues for all the speech sounds; only some are chosen and tactile cues developed for them. Our subjects further showed that, even when some tactile cues are prominent, such as the heavy puff of release marking the production of aspirated stops in Kannada, they may either ignore these cues or associate them with some closely resembling sounds, such as the unvoiced stop consonants of Kannada, if aspiration does not play a crucial role in the language. It is not clear, however, how the hearing impaired arrive at this rather correct but surprising conclusion. Perhaps the residual hearing helps them in some manner in this case.

Tactile Cues for Kannada Speech Sounds

p Emission of explosive pulse of air from the oral cavity.
b Emission of explosive pulse of air from the oral cavity. Vibration at the side of the throat. Mandible from high to low position depending on the following vowel.
t Emission of explosive pulse of air from the oral cavity.
d Emission of explosive pulse of air from the oral cavity. Vibration at the side of the throat. Mandible from high to low position depending on the following vowel.
ṭ Emission of explosive pulse of air from the oral cavity.
ḍ Emission of explosive pulse of air from the oral cavity. Vibration at the side of the throat. Mandible from high to low position depending on the following vowel.
č Emission of diffuse pulse of air from the oral cavity.
ǰ Emission of diffuse pulse of air from the oral cavity. Vibration at the side of the throat. Vibration of the front of the tongue.
k Emission of pulse of air from the oral cavity. Mandible depressed. Rear position of the chin/floor of the oral cavity raised and lowered.
g Emission of pulse of air from the oral cavity. Mandible depressed. Vibration at the side of the throat. Rear portion of the chin raised and lowered.
ph Same as for p. Pulse of air is of greater intensity.
bh Same as for b. Pulse of air is of greater intensity.
th Same as for t. Pulse of air is of greater intensity.
dh Same as for d. Pulse of air is of greater intensity.
ṭh Same as for ṭ . Pulse of air is of greater intensity.
ḍh Same as for ḍ. Pulse of air is of greater intensity.
kh Same as for k. Pulse of air is of greater intensity.
gh Same as for g. Pulse of air is of greater intensity.
m Vibration felt at the nasal bridge. Emission of breath from the nasal cavity. Vibration at the side of the throat. Mandible from high position to low position depending on the following vowel.
n Same as above.
n Same as above.
Same as above.
ñ Same as above.
ṅ Same as above. Mandible depressed. Rear portion of the chin raised and lowered.
f Emission of air between the upper teeth and the lower lip.
s Emission of sharp flow of air centrally between the teeth.
š Emission of diffuse flow of air between the teeth.
ṣ Same as above.
h Emission of air from the oral cavity.
l Vibration of the side of the throat; mandible from high position to low position depending on the following vowel.
ḷ Same as above.
r Same as above.
v Vibration at the side of the throat. Mandible depressed.
y Vibration at the side of the throat. Mandible raised or lowered depending on the following vowel.
i Vibration at the side of the throat. Mandible depressed. Hyoid bulging.
i: Same as above; but, for longer duration.
e Vibration at the side of the throat. Mandible depressed. Hyoid bulging.
e: Same as above; but, for longer duration.
ǽ Vibration at the side of the throat. Mandible depressed. Hyoid is not bulging.
u Same as above, but, hyoid is bulging.
u: Same as above; but, for longer duration.
o Same as for u.
o: Same as above; but, for longer duration.
a Vibration at the side of the throat. Mandible depressed.
a: Same as above; but, for longer duration.

The table above presents tactile cues for the phonemes of an Indian language (Kannada); the cues are based on a blend of empirical observations and our own intuitive analysis of the situations encountered. It may be noted, however, that there are severe constraints in the identification and use of tactile cues, from the investigators' point of view and from the point of view of the hearing impaired. First of all, tactual perception of speech is more effective if multiple sites are exploited. Secondly, such cues need to be very intelligently linked with some mental token (concept). While the cues are invariably syncretic, there is a need to identify distinguishing features even among the syncretic cues. Moreover, the sequential organization of these cues for a progression and expression of the communicative intent is also necessary. Studies have demonstrated that taction is a viable modality to supplement a deficient auditory mode; but the usefulness of taction for speech reception in the hearing impaired in a communicative context is rather limited. We should also emphasize that not all hearing impaired individuals uniformly follow the same methods of using tactile cues; nor do they uniformly choose particular sites for taction. There is great variety. There are several success stories in which the hearing impaired have been reported to have used tactile cues as the only means of perception. But, more often than not, taction is seen to improve the quality of speech perception through other modes such as speech reading. Plant and Spens (1986) reported the case history of a 48-year-old Swedish male who developed a method to perceive a speaker's laryngeal vibrations by placing his hand on the speaker's shoulder with his thumb pressed lightly against the side of the neck. This method improved his speech reading of sounds, syllables, words, sentences and discourse. Used alone, the method helped the subject to perceive consonant voicing to a surprising extent of 99.3%. Likewise, the subject could also identify the manner of articulation of consonants to a very high degree. Perception of syllables in words and sentences, and of emphatic stress in sentences, was also high. Such stories of success, while emphasizing the difference in modality preference in the hearing impaired, also point to the quality of cognition retained in spite of the deficient hearing potential.
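For exploratory purposes, a tactile cue inventory of the kind tabulated above can also be treated as a simple lookup structure. The Python sketch below holds a handful of entries as shortened paraphrases of our table (it is not an exhaustive or authoritative coding) and asks which cues two sounds share, that is, where taction alone might fail to distinguish them.

# Minimal sketch; the cue labels are shortened paraphrases of the table above.
TACTILE_CUES = {
    "p":  {"oral pulse of air"},
    "b":  {"oral pulse of air", "throat vibration", "mandible high to low"},
    "m":  {"nasal-bridge vibration", "nasal airflow", "throat vibration",
           "mandible high to low"},
    "s":  {"sharp central airflow at the teeth"},
    "a":  {"throat vibration", "mandible depressed"},
    "a:": {"throat vibration", "mandible depressed", "longer duration"},
}

def shared_cues(x, y):
    """Cues two sounds have in common, i.e. where taction alone may confuse them."""
    return TACTILE_CUES[x] & TACTILE_CUES[y]

print(shared_cues("a", "a:"))   # identical except for duration
print(shared_cues("p", "b"))    # the oral pulse is shared; voicing cues distinguish them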

1.7. Speech Reception Through Audition and Vision

Vision constitutes an important factor in speech reception, just as audition does. Indeed, a well coordinated effort between vision and audition is necessary for speech reception. In normals, when audition is hampered by high background noise conditions, vision helps restore speech reception. Vision supplements speech reception in hearing impaired children. For the analysis of the inter-relationship between the two modalities and their effect on the hearing impaired, two types of studies have been carried out. In the first type, scholars have examined the performance of the hearing impaired under auditory, visual, and auditory-visual presentations of the test materials for speech reception. In the second type, investigators have tried to simulate hearing loss by presenting background noise to normals in the test situations. The general consensus is that audio-visual speech reception is superior to the audition-alone condition. The features of voicing and nasality are recognized under noise conditions, whereas place of articulation is difficult to perceive. Under audio-visual conditions, place of articulation shows a substantial increase in intelligibility. Walden, Prosek and Worthington (1974) showed that auditory-visual perception of speech is superior to auditory perception alone. They classified the consonants on a six feature system: voicing, nasality, liquid glide, frication, duration and place of articulation. In the auditory-alone condition, the transmission of the place of articulation feature is substantially less than that of the other features. The liquid glide feature is transmitted best; voicing and nasality are perceived slightly better than duration and frication. Perception of the duration, place of articulation, frication and nasality features shows an increase with the aid of the visual component. The reception of the liquid glide shows less improvement than the above four features. Moreover, the effect of visual cues on the reception of voicing information is minimal. Other studies indicate that vision has a greater effect on the reception of consonants than on the reception of vowels, words, and phrases, and that younger people with congenital disorders benefit from vision more than older people who become hard of hearing through aging.
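The kind of feature analysis described above can be illustrated with a small worked example. The confusion matrix and the feature coding in the Python sketch below are invented for the illustration and do not reproduce the data of Walden, Prosek and Worthington (1974); the point is only to show how the percent-correct transmission of a feature, here voicing and place, is read off a stimulus-response confusion matrix.

# Invented confusion matrix and feature coding, for illustration only.
import numpy as np

consonants = ["p", "b", "t", "d"]
features = {
    "voicing": {"p": 0, "b": 1, "t": 0, "d": 1},
    "place":   {"p": 0, "b": 0, "t": 1, "d": 1},
}

# Rows are stimuli, columns are responses; the counts are hypothetical.
confusions = np.array([
    [30,  5, 10,  5],
    [ 4, 32,  6,  8],
    [ 9,  6, 28,  7],
    [ 5,  9,  6, 30],
])

for name, coding in features.items():
    correct = sum(
        confusions[i, j]
        for i, stim in enumerate(consonants)
        for j, resp in enumerate(consonants)
        if coding[stim] == coding[resp]   # feature preserved even when the consonant is not
    )
    print(f"{name}: {100 * correct / confusions.sum():.1f}% transmitted correctly")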

Most of the laboratory studies are of the opinion that auditory-visual speech perception is superior to visual or auditory perception alone. But the maximum audio-visual scores obtained are dependent on the severity of the hearing loss of the subject, the room acoustics, the linguistic unit to be perceived, and the age of the patient. Most studies have emphasized recognition of single and initial sounds only. Studies have generally ignored the contribution of the communication context, the context of situation. Byman (1974) emphasizes that speech perception in everyday communication should be the subject of investigation. The restricted laboratory environment is not the right backdrop for a generalization. Byman suggests that at the very beginning of an interaction each of the participants attempts to categorize the communication situation. This categorization, if confirmed, will activate a certain set of expectations. The participant's knowledge of what is relevant to the topic of conversation also influences his expectations and anticipations of what the speaker is going to say next. Thus, a comprehension of the communication situation is a pre-requisite for the comprehension of speech.

1.8. Speech Reception Through Vision and Tactual Modalities

Studies on speech reception through a combination of visual and tactual modalities indicate that high frequency consonant sounds such as s and t can be differentiated when speech reading is supplemented by touch. In tests with combined lip reading and tactual reception, the tactual information on consonants and the number of syllables was found to improve transmission without detracting from lip reading the visible features of other sounds (Pickett, 1963). Since place of articulation features are speech read better than manner of articulation features, tactile aids that supplement the latter may enhance speech reception (Ling, 1976).

Miller (1974) examined the relative effectiveness of the visual modality and the combined tactile and visual modalities on speech reception. The vibrotactile speech reception aid transmitted cues from nasal vibrations, equalized speech sounds, and throat vibrations. The subject placed his 5th, 3rd, and 2nd fingers on the nose, microphone and throat vibrators respectively. The results indicated the superiority of the combined condition. The phonetic features voiced/voiceless, continuant/interrupted, and nasal/oral were discriminated well. Changes in vowel duration accompanying changes in final consonants were clearly felt, while the tense/lax distinction of vowels was not clear. The listed cues, together with those available through lip-reading, are said to provide complete information on the segmental features of English.

In conditions of simulated profound deafness in normals, subjects have reported that a few words and phrases do not match the received patterns through lip-reading. Ambient noise conditions tend to distort the reception of the temporal patterns of speech. But, tactual vibration is found to improve the rhythm of the speech of the hearing impaired.

1.9. Speech Reception Through Auditory and Tactual Modalities

Though most researchers prefer to provide the child with both tactual and auditory modalities, research in this area is scanty. Boothroyd, et al., have devised a hearing aid with tactile output. Providing the child with a tactile display in addition to amplification was said to bring about improvement in voice control and syllable discrimination.

1.10. Multisensory Speech Reception

Even in the normal population, communication is carried on through multiple sensory organs. In the speech of normals, there is an interplay of oral speech with vision and taction. Normal multisensory speech reception includes speech reception through hearing, vision and touch. In the hearing impaired, there is a greater reliance on multisensory speech reception. However, in order to identify the relevance and weightage of each of these modalities vis-a-vis one another and for the total communication process, it is necessary to undertake suitable experiments. It is also necessary to investigate whether the simultaneous use of all three modalities inhibits or facilitates communication in the hearing impaired. The efficiency of the combination in all situations, such as conversing at different distances and speeds, reading and writing, playing, etc., needs to be examined. A fruitful approach would be to investigate the variables of the interplay of multisensory communication. In normals, non-oral communication supplements oral communication and, in certain specified conditions, precedes, follows, or accompanies oral communication. Non-oral communication is seen to be superior to oral communication in certain contexts in all cultures. For the hearing impaired, non-oral communication is almost as important as oral communication. The non-oral mode not only supplements the oral mode in the hearing impaired but also acts as a means to interpret the oral mode; the non-oral mode has greater independence and acts as a more independent means of communication in the hearing impaired (Thirumalai, 1987).

Current research has attempted to allocate areas of operation for each of the modalities (vision, audition and taction) in the speech reception of the hearing impaired. The research has attempted to show how certain particular cues of vision and taction are exploited to identify certain cues of speech, or rather certain variables of speech such as sounds. Ling (1976) may be cited as an excellent example of this orientation. Ling (1976) suggests that the child's attention is directed towards vision if place of articulation is to be received, and towards audition and touch if a supra-segmental feature is to be perceived. Ling has also identified the types of speech features available, partly available, or unavailable in speech reception through each sense modality. This orientation is certainly valuable and has immense practical applications. However, the present investigators are of the opinion that an integrated orientation is not only possible but could act as a better alternative. By the integrated orientation, we mean that we could identify units of communication in the hearing impaired. These units of communication are not focused on speech alone, nor does the unit draw on speech alone for its building blocks. The unit of communication in the hearing impaired is an integrated whole of speech, vision and taction. Only by taking this view will we be able to explain the communication abilities of the hearing impaired and describe the communication process itself, a process that is carried on even with deficient speech and hearing.