Speaking
Language is the light of the mind. -- John Stuart Mill
In Paris they simply stared when I spoke to them in French; I never did succeed in making those idiots understand their own language. -- Mark Twain, The Innocents Abroad
Introduction
- speech is the production of words in order to communicate
- words consist of phonemes
- the sounds associated with the letters of the alphabet - vowels & consonants
- 50 phonemes, but not every one is used in every language
- e.g., 'l' and 'r' are phonemes used in English ('lip' & 'rip'), but in Mandarin Chinese no words are distinguished by 'l' & 'r'. 'l' occurs only at the beginning of a word & 'r' only at the end
- 'l' & 'r' are phones - sounds that may or may not convey meaning
- 'l' & 'n' do not distinguish meaning for some speakers
- e.g., a Chinese speaker speaking English may say at night, "Turn out the night"
- e.g., English 'pit' & 'spit': there is an air burst for the 'p' in 'pit' but not in 'spit'. In English, 'p' is aspirated in word-initial position but not in other positions, yet the difference in how the 'p' is produced does not affect meaning. In Hindi, aspirated & nonaspirated 'p' distinguish words with different meanings
- but consider " uh'-uh " & " uh-huh' " or " hmmm "
- the elements comprising phones are phonetic features
- allophones are distinct phones that are not distinct phonemes
- words vary in intonation and stress
- In English, intonation & stress convey meaning about phrases or sentences but not words
- In tone languages, e.g., Mandarin Chinese, intonation affects the meaning of words as well as of phrases
- timing of speech
- when we speak more quickly,
- durations of vowels are reduced more than durations of consonants
- omit all but the most critical elements needed to ensure communication
- 'I do not know' vs. 'I dunno'
- flatten intonation
- maximum rate is 200 words/minute
- at 7 phonemes per English word, that is 1400 phonemes per minute
- but it is estimated that it takes 0.1 sec to produce a single phoneme, so we should be able to produce no more than 10 phonemes/sec, i.e., 600 phonemes/minute
- So how do we talk so fast?
- phonemes are not produced in serial fashion; coarticulation improves speed of production
- e.g., anticipatory lip rounding in 'construe' or 'tulip'
- e.g., prenasalization, opening the velum in anticipation of a nasal consonant, e.g., /n/ in 'freon'
- prenasalization happens in languages in which nasalization of vowels carries no meaning, as in English. In contrast, prenasalization does not occur in French, where nasalized vowels distinguish words.
- requirements of a theory of speech motor control
- how all aspects of production are managed
- why certain sounds and combinations are used and others are not
- 'K!' (a click) used by the !Kung people in Africa
- but 'snoring' sounds not used in any language
- Probably based on biomechanical efficiency or deep grammatical rules
- But speech can be produced even when the vocal apparatus is in a different physical configuration
- talking with something held between the teeth, e.g., a pipe or pencil
- interaction of speech and hearing
- how others' speech is discriminated given that phonetic features are obscure
- how our own speech is regulated as we hear ourselves
- experiment - delay auditory feedback
- affects frequency & amplitude of vibrato in singers if delayed by as little as 100 ms
- disrupts speaking if delayed by as little as 100 ms; can result in stuttering
- learning to speak correctly requires hearing speech
- deafness after learning to speak results in disrupted speech after a period
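The rate puzzle in the timing notes above can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted in the notes (200 words/minute, 7 phonemes per word, 0.1 sec per phoneme); the function and constant names are illustrative and not from any source.

```python
# Figures quoted in the timing notes above.
WORDS_PER_MINUTE = 200
PHONEMES_PER_WORD = 7
SECONDS_PER_PHONEME = 0.1


def observed_phoneme_rate(words_per_minute, phonemes_per_word):
    """Phonemes per minute implied by the maximum speaking rate."""
    return words_per_minute * phonemes_per_word


def serial_phoneme_limit(seconds_per_phoneme):
    """Phonemes per minute if phonemes were produced strictly one after another."""
    return round(60 / seconds_per_phoneme)


observed = observed_phoneme_rate(WORDS_PER_MINUTE, PHONEMES_PER_WORD)
serial_limit = serial_phoneme_limit(SECONDS_PER_PHONEME)

print(observed)                 # phonemes per minute actually produced
print(serial_limit)             # ceiling under strictly serial production
print(observed / serial_limit)  # how far the observed rate exceeds that ceiling
```

The observed rate is more than twice the serial ceiling, which is the gap that coarticulation is invoked to explain.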
The vocal tract and articulatory dynamics
- Resonance of the air column in the vocal tract produces compressions & rarefactions that are sound. Varying the shape of the vocal tract produces different sounds
- 3 subsystems: respiratory, laryngeal & articulatory
- The respiratory system
- lungs
- diaphragm
- intercostal muscles
- external - inhalation
- internal - exhalation
- breathing controlled differently during speaking or singing
- skilled singers can control the amount & rate of inhalation & exhalation
- precise regulation of intercostal muscles in relation to lung expansion & contraction
- sustained activation components
- transient activation components
- not related to syllables as once thought
- air pushed through larynx...
- Laryngeal mechanisms
- larynx - 4 main functions
- regulates characteristic pitch of voice
- da Vinci discovered in cadavers that the mass of the vocal cords affects vocal pitch; larger cords vibrate at lower frequency. Enlargement of the vocal cords occurs during male puberty
- rapid pitch changes are effected by changing the stiffness of the vocal cords.
- stiffer cords vibrate @ higher frequency
- modulates aspiration (e.g., 'p')
- allows for whispering
- creates voicing, subtle buzzing sound
- the difference between 'f' & 'v'
- vocal cords are 2 folds lying across the roof of the larynx
- a narrow distance between the vocal cords produces voicing, e.g., 'v', 'b'. A wide distance eliminates voicing, e.g., 'f', 'p'.
- shape of the larynx accounts in part for better singing
- the ability to lower the larynx creates an enlarged cavity that produces an extra formant (an additional concentration of energy in the auditory frequency range) so that the voice is heard better
- Articulatory mechanisms
- rich variation in speech sounds is accomplished by structures above the larynx: pharynx, mouth, jaws, lips, nasal tract and velum
- The pharynx
- pharynx shape produces different vowels
- vowels are produced because mature humans have long necks, low larynxes and large, mobile throats
- human infants, pre-human hominids and apes have short necks, high larynxes and small, immobile throats
- cannot produce vowels
- but also cannot choke!
- Vowels
- all vowels voiced
- variations in tongue position produce different vowels
- lax vowels produced with root of tongue retracted
- tense vowels have the root of the tongue advanced, e.g., 'teen' or 'boot'
- Consonants
- tongue placement important
- also positions and movements of lips, jaw & velum
- categorize according to manner and place of articulation
- manner of articulation - way the air stream is constricted by the articulators
- stops - pug [partial listing]
- constrictives - forth
- nasals - mine
- liquids - reel
- glides - wet
- affricates - crutch
- place of articulation - location where the constriction occurs
- bilabials - m, b [partial listing]
- labiodentals - f, v
- dentals - then
- alveolars - d, s
- palatals - rich
- velars - kid
- glottals - button
- 'standard theory' of phonology (Chomsky & Halle, 1968)
- each phoneme is coded according to its distinct features
- implies that distinct speech sounds are produced by articulators achieving specific positions
- the problem, though, is that articulators exhibit wide variation in position and movement while continuing to produce speech
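The laryngeal notes above say that heavier vocal cords vibrate at a lower frequency and stiffer cords at a higher one. One way to see both trends at once is an ideal mass-spring oscillator, with natural frequency f = sqrt(k/m) / (2*pi). Treating the vocal cords this way is an assumption made here purely for illustration, not a model given in the notes, and the stiffness and mass values below are hypothetical numbers chosen only to show the direction of the effects.

```python
import math


def oscillator_frequency_hz(stiffness, mass):
    """Natural frequency of an ideal mass-spring oscillator (an idealization,
    not a physiological model): f = sqrt(k / m) / (2 * pi)."""
    if stiffness <= 0 or mass <= 0:
        raise ValueError("stiffness and mass must be positive")
    return math.sqrt(stiffness / mass) / (2 * math.pi)


# Hypothetical values, used only to compare trends.
baseline = oscillator_frequency_hz(stiffness=100.0, mass=0.0002)
heavier = oscillator_frequency_hz(stiffness=100.0, mass=0.0008)
stiffer = oscillator_frequency_hz(stiffness=400.0, mass=0.0002)

print(heavier < baseline)  # more mass gives a lower frequency
print(stiffer > baseline)  # more stiffness gives a higher frequency
```

In this idealization, quadrupling the mass halves the frequency and quadrupling the stiffness doubles it.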
Variability and the motor theory of speech perception
- one line of evidence for variability of speech is from analysis of spectrograms; a spectrogram is a plot of sound frequency versus time.
- but spectrograms of the same phoneme can vary, e.g., 'b' in 'big' and 'b' in 'big'
- so how can we perceive the same phoneme from different heard stimuli?
- motor hypothesis of speech perception
- speech sounds are heard as such because we invoke the commands needed to produce them. Acoustic invariance arises from articulatory invariance
- overt muscle activation is not needed
- studies investigated articulatory invariance by measuring EMG of the speech musculature
- so how does comprehensible speech arise out of such a variable system?
- target hypothesis
- MacNeilage, 1970
- feedback from the articulators is used by a control system to bring the articulators to specific target positions. Targets are defined as spatial relations of articulators
- because it is based on feedback control, the starting position is not relevant,
- and the system can operate despite physical perturbation of the articulators
- evidence
- a patient with normal hearing and motor control but no proprioception could not produce clear speech
- experimental disruption of oral feedback impairs speech quality [talking after visiting the dentist!]
- compensation for disturbances of articulators
- talking with clenched teeth requires altered tongue and lip movements
- if particular articulator positions and motions were required for speech, then this should not be possible
- stability of articulatory targets in spite of variability in articulatory starting positions
- with repetitions of an utterance, there was less variation in final jaw positions than in initial jaw positions
- the velocity of the lower lip approaching the upper lip in producing /p/, /b/ or /m/ increased with the initial lip opening following the preceding vowel
- counter-evidence
- absolute muscle lengths needed to produce speech with clenched teeth are actually different from the muscle lengths that are normally required. Because the system modulates muscle length, it cannot alone produce effective compensation
- experiment - with a block in the mouth, Ss produced consonants that required lip closure. The target hypothesis predicts increased muscle tension to counteract the enforced separation. Observed: decreased muscle tension!
- quite different articulatory configurations can be used to produce vowels with similar acoustic characteristics. The target hypothesis does not predict such wide variation.
- relative positions and acoustic targets
- the realization that the purpose of speech is to generate sounds led to the idea that speech is designed to achieve acoustic rather than spatial targets
- the central insight is that proper acoustic results are realized when the articulators achieve proper relative positions, and the absolute positions are less important
- Experiment - measure upper & lower lip position when speakers say [apa]
- control of movement to relative target positions happens rapidly and can compensate for perturbations
- this finding reflects the motor system's tendency to achieve motor equivalence
- capacity to achieve same result with different motor components
- similar findings for finger and arm movements
- A mechanism for relative positioning
- A parallel distributed processing system for coarticulation
- High-level control of speech
- what control mechanisms regulate serial ordering of words, phrases & sentences?
- Word games
- Pig Latin
- this ability reveals that speakers are sensitive to phoneme clusters, e.g., 'scram' -> 'amscray'
- Backward talking
- not like playing the record backwards; instead reverse the sounds within individual syllables
- 'I can talk backward'
- 'I nac kawt cabdraw'
- produce syllables in forward direction
- 'Cabdraw kawt nac I'
- produce syllables in backward direction
- reversed phonemes rarely cross syllable boundaries
- 'backward' -> 'draw-kcab' and not 'drawk-ab'
- indicates psychological reality of syllables
- syllabic stress remains in the same temporal order even when syllables are reversed
- 'con'-trast' -> 'tsart'-noc' and not 'tsart-noc''
- Thus, stress patterns are autonomous
- Laboratory studies of speaking speed
- Investigate hierarchical programming of syllable sequences (Gordon & Meyer, 1987)
- Task
- Ss learned to associate different 4-syllable utterances with different signals
- 3 different sequences
- primary sequence: bee bay bah boo
- hierarchically congruent: bah boo bee bay
- hierarchically incongruent: bah bay boo bee
- on each trial, 1 signal was presented as a warning and a second signal was presented as the trigger signal. On most trials the 2nd signal matched the 1st
- measure choice RT
- Results
- RT shortest when 2nd signal matched 1st
- Ss were preparing to utter specified sequence
- when the 2nd signal didn't match the 1st, RT was shorter when the utterance had the same hierarchical syllable organization
- Interpretation
- that hierarchically congruent sequences are produced faster than hierarchically incongruent sequences indicates that syllables and syllable pairs are functional units
- makes it easier to modify words based on linguistic context, e.g., adding or deleting a prefix or suffix
- makes programming easier
- Investigate programming of syllable sequences (Rosenbaum et al., 1987)
- Task
- Ss uttered 1 of 2 sequences depending on the identity of an auditory signal
- one condition: single syllable, e.g., 'gee' & 'goo'
- another condition: 2 syllables, e.g., 'geebee' & 'gooboo'
- another condition: 3 syllables, e.g., 'geebeedee' & 'gooboodoo' ['doo-bee-doo-bee-doo'...]
- the 2 auditory signals were compatible with the vowels in one of the sequences: a high-pitched tone with 'ee' & a low-pitched tone with 'oo'
- Results
- RT increased with number of syllables in sequence
- RT was shorter when the auditory signal was compatible with its distinguishing vowel
- Interpretation
- the vowel characterizing the entire sequence could be specified in a single processing stage. This is consistent with hierarchical programming of the sequence, with the vowel assignment made for all syllables once at the top of the hierarchy
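The word games described earlier in this section can be mimicked with a short sketch. Several simplifications here are assumptions: the code operates on spelled letters rather than phonemes, the syllable boundaries are supplied by hand, and the function names are illustrative. It is a toy illustration, not a model of how speakers actually perform these games.

```python
VOWELS = "aeiou"


def pig_latin(word):
    """Move the initial consonant cluster to the end of the word and add 'ay'."""
    for i, ch in enumerate(word):
        if ch in VOWELS:
            return word[i:] + word[:i] + "ay"
    return word + "ay"  # no vowel letter found: just append 'ay'


def reverse_within_syllables(syllables, reverse_order=False):
    """Reverse the letters inside each syllable, never across a syllable boundary.

    With reverse_order=True the syllables themselves are also produced in
    backward order.
    """
    reversed_syllables = [s[::-1] for s in syllables]
    if reverse_order:
        reversed_syllables.reverse()
    return "-".join(reversed_syllables)


print(pig_latin("scram"))                                             # amscray
print(reverse_within_syllables(["back", "ward"]))                      # kcab-draw
print(reverse_within_syllables(["back", "ward"], reverse_order=True))  # draw-kcab
```

Because the reversal is applied one syllable at a time, the syllable boundary is preserved by construction, which is the pattern the notes report for human backward talkers.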
Speech errors
- "unintended, nonhabitual deviation from a speech plan"
- errors reveal something about how speech is programmed; they are derived from mistakes in the otherwise normal workings of speech planning and production
- early analysis by Sigmund Freud
- Freudian slip
- a subject supposed to say "bine foddy", when tested by a provocatively attired experimenter, instead says "fine body"
- subconscious urges are not the only source of errors, though
- verbal slips contain fewer nonwords than words, so some sort of editing may take place
- Baars et al. (1975) experiment
- Ss read a word list silently, then spoke a word pair prior to the 'respond' signal
- when 'darn bore' -> 'barn door', the phonological bias created by earlier words influenced production
- however, Ss were less likely to make 'dart board' -> 'bart doard' because the phonological bias could not override the bias against nonwords
- types of speech errors
- different kinds of linguistic units are involved
- words
- morphemes - the meaning-bearing parts of words, such as 's' for plural
- phonemes
- consonant clusters
- vowel-consonant pairs
- different kinds of disruptions
- misorderings
- anticipations
- perseverations
- shifts
- deletions
- noncontextual errors
- exact source of error hard to identify, e.g., blends
- exchanges of same linguistic units
- word exchanges
- phoneme exchanges
- models of speech production are tested by accounting for regularities of speech errors
- the basic assumption is that there are levels of linguistic representation leading to sentences
- semantic level
- syntactic
- word order and grammatical relations between words
- morphological
- details of word formation, e.g., presence of a prefix or suffix
- phonological
- phonetic
- motor
- physical articulation to produce sounds
- in some models, interactions flow from higher to lower levels; in other models, interactions run in both directions.
- some models are rule-based; others operate according to activation of units in a network
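A phoneme exchange of the kind listed above, in which two words trade their initial consonants, can be sketched as a small string operation. Working on spelling rather than on phonemes is an assumption of this sketch (so 'darn bore' would come out as 'barn dore' rather than 'barn door'), and the helper names are illustrative.

```python
VOWELS = "aeiou"


def split_onset(word):
    """Split a word into its initial consonant cluster and the remainder."""
    for i, ch in enumerate(word):
        if ch in VOWELS:
            return word[:i], word[i:]
    return word, ""  # no vowel letter: treat the whole word as the onset


def exchange_onsets(first, second):
    """Swap the initial consonant clusters of two words."""
    onset_1, rest_1 = split_onset(first)
    onset_2, rest_2 = split_onset(second)
    return onset_2 + rest_1, onset_1 + rest_2


print(exchange_onsets("dart", "board"))  # ('bart', 'doard')
print(exchange_onsets("bine", "foddy"))  # ('fine', 'boddy')
```

The first exchange yields two nonwords and the second yields word-like outcomes, which is the contrast the Baars et al. (1975) notes above rely on when arguing that some editing against nonwords takes place.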
Brain mechanisms underlying speech
- Neuronal basis of language historically tied to the pendulum of the localization-of-function debate
- Pro-localization, early 1800s - Gall, Bouillaud
- Anti-localization, mid 1800s - Flourens
- Pro-localization, late 1800s - Broca, Wernicke
- Anti-localization, 1920-1960 - Hughlings-Jackson, Freud, Head
- Pro-localization, 1960 to present - Geschwind, Sperry, Damasio
- Brain regions specifically related to language
- Cerebral cortex
- Broca's area
- Wernicke's area
- Evidence for localization
- Effects of electrical stimulation of the left hemisphere
- Positive - Vowel burst, "Oooohh..."
- Negative - Interruption, prevention or disruption of speech
- Local cerebral blood flow (PET scans)
- Ablation
- Aphasia syndromes
- Fluent
- Wernicke's (sensory)
- Inability to comprehend words or to arrange sounds into coherent speech.
- Damage to Wernicke's area
- Transcortical sensory (isolation syndrome)
- Can repeat and understand words and name objects but cannot speak spontaneously; some patients also cannot comprehend words even though they can repeat them. Cannot read or write.
- Damage to parieto-temporo-occipital junction
- Conduction
- Speech, naming objects and comprehension intact, but cannot repeat words.
- Damage to arcuate fasciculus
- Anomic ("amnesic aphasia")
- Speech comprehension, production and repetition not impaired, but difficulty naming objects.
- Damage to temporo-occipital border
- Nonfluent
- Broca's
- Difficulty speaking. Comprehension preserved.
- Damage to Broca's area
- Transcortical motor
- Deficit in spontaneous speech. Repetition, reading and naming objects not impaired. Writing impaired.
- Damage anterior to Broca's area
- Global
- Loss of production & comprehension
- Damage to both Broca's and Wernicke's areas
- Aprosody
- Deficit in production or comprehension of intonation, the affective component of speech
- Damage to right hemisphere
- Pure
- Alexia without agraphia
- Normal speech production. Can write but cannot read
- Agraphia
- Normal speech production. Cannot write
- Word deafness
- Normal speech production. Poor comprehension or repetition of certain words.
- Wernicke-Geschwind model of language
- Input: visual cortex, auditory cortex
- Sensory Processing: angular gyrus, Wernicke's area
- Transmission: arcuate fasciculus
- Motor Processing: Broca's area
- Output: Face area of primary motor cortex
- But the model is too localized