Exam 4 Speech Science

Start Studying!

Terms


adjacent speech sounds that require the same articulator use a single articulatory gesture for both sounds: assimilation
"I miss you", the /s/ and /j/ phonemes both require particular articulations of the tongue tip and blade, so the /s/ is often produced with the palatal gesture of the /j/, resulting in a /ʃ/ sound (and similar sequenc: example of assimilation
adjacent speech sounds that use different articulators can be overlapped in production (two articulations simultaneously): coarticulation
For example, the /s/ does not require the use of the lips, so lip rounding in an adjacent sound (like /u/) can begin during the /s/ (compare "seat" and "suit", similarly for "tea" and "two"): example of coarticulation
Variation in the degree to which the articulators reach their "ideal articulatory goals" is referred to as degrees of: hyperarticulation and hypoarticulation
very careful pronunciation: hyperarticulation
pronunciation that undershoots the target: hypoarticulation
do not achieve the extreme articulations that they would if produced in isolation (as in rapid speech): corner vowels
Assimilation or coarticulation can create an acoustic result for an articulation that is different from what is produced in isolation. In many cases, the combined articulation reflects information about the two (or more) sounds that a: what happens to acoustics with context effects, and what are their effects on formants?
As articulatory positions change, the resonating frequencies of the vocal tract change. Changing resonant frequencies in the vocal tract result in transitions in the formants of vowels and resonant consonants. Formant transitions in neighb: explain formant transitions
F1: expected to be rising following release for all stops due to lowering of the jaw/tongue. F2 & F3: differences following release expected for different places of articulation. /b/ : F2 & F3 rise due to release of lip rounding /d: explain changes in f1, f2, etc in formant transitions for b, d, and g.
Coarticulation most noticeable for sounds that are: adjacent (next to one another)
effects of coarticulation for sounds 2 or 3 phonemes away depend on: the speech rate and the particular articulatory configuration
There can be effects of coarticulation for sounds 2 or 3 phonemes away, depending on the speech rate and the particular articulatory configuration (e.g., the start of low F3 during the initial vowel in "every"): "every" is an example of
aspects of production that carry over more than one segment: "Over" segmental (suprasegmental)
found when two different articulators overlap their production: coarticulation
when producing the common phrase "is she going?", the loudest fricative noise for the /z/ phoneme in "is" is lower than expected, b/c the tongue blade constriction is farther back on the palate than it would normally be. This is an exa: assimilation
when producing the common phrase "is she going?", the loudest fricative noise for the /z/ phoneme in "is" is lower than expected, b/c the tongue blade constriction is farther back on the palate than it would normally be. this effect wo: anticipatory
I am talking to a hearing-impaired adult who is having trouble understanding. I make an extra effort to be understood for the word "dog." In looking at this production, I find that the /a/ in "dog" has a higher F1 than it would normal: hyperarticulation
the ____ scale is a special scale used to model the way the ear processes frequency.: Bark
most of the amplification of sound that occurs in the middle ear is due to ______.: the difference in SA b/w the tympanic membrane and the oval window
differences b/w frequencies at ___________ are easier to hear: low frequencies (20-1000 Hz)
what difference in frequencies is the easiest for a listener to perceive?: differences b/w high vowels and low vowels
from studies of listeners' perception of acoustic properties of speech sounds, which formants are more important for vowel identification: f1 and f2 are more important than f3 for vowel identification
Stress Intonation Duration Juncture: examples of suprasegmentals
get exaggerated when talking to children/animals or in "clear" speech: stress, intonation, duration, juncture (as do articulatory positions --> formant differences)
Tells which syllable of a word or sentence is most important.: stress
sometimes tells whether word is a noun or verb: lexical stress
English does what to syllables?: alternates weak and strong syllables; stress placement can mark the difference between a verb and a noun (lexical stress)
primary, secondary, unstressed: what are the 3 levels of lexical stress?
what type of coarticulation is going on here?: "I miss you", the /s/ and /j/ phonemes both require particular articulations of the tongue tip and blade, so the /s/ is often produced with the palatal gesture of the /j/, resulting in a /ʃ/ sound (and similar sequences are found with other alveolar and palatal combinations, like in "did you")
what is going on here in the /s/ in both words?: Coarticulation- the /s/ does not require the use of
the lips, so lip rounding in an adjacent sound
(like /u/) can begin during the /s/ (compare
"seat" and "suit", similarly for "tea" and "two")
explain the formant differences b/w ba, da, and ga.:
• F1: expected to be rising following release for all stops due to lowering of the jaw/tongue.
• F2 & F3: differences following release expected for different places of articulation.
  - /b/: F2 & F3 rise due to release of lip rounding
  - /d/: F2 & F3 flat or fall (point to high freq.)
  - /g/: F2 & F3 move apart (point to mid freq.)
• For all: greater formant movement expected for greater articulatory movement (varies depending on the vowel context)
special case to distinguish from a similar word: Contrastive
Stressed syllables: typically longer in duration, higher in F0, and greater intensity than the same syllable in non-primary stress position.: explain the effect of stressed syllables acoustically
Vowel reduction: many vowels reduced to schwa when in unstressed position, but you see full vowel when put in more stress position.: explain the effect of vowel reduction acoustically
what creates stress?: increased vocal effort
Tells us about a talker's emotional state, overall meaning of a sentence, whether done talking or not.: intonation
rise-fall (declarative sentence, non yes/no question), fall (emphasis, short unemotional), rise (yes/no question, not finished): what are the three general contours?
what is important in intonation contours?: F0
Different speech sounds differ in duration, even when in the same context (e.g., tense and lax vowels). (look at some). This helps listeners identify the vowel (especially in noise).: explain intrinsic duration.
vowels tend to be longer before voiced than before voiceless stops (helps because final stops often unreleased): how does duration change vowels?
relates to pronunciation depending on location of syllable boundaries: juncture
when a __________ is between two _______ you can tell what syllable/word it "belongs" to: when a consonant is between two vowels you can tell what syllable/word it "belongs" to
when you cannot tell where a consonant belongs?: ambisyllabic
"patty" vs. "party": example of ambisyllabic
How is sound produced differently to show where word juncture is?: It sprays, worth less, how to wreck a nice
beach.
pinna to tympanic membrane: outer ear
Protection, resonator, and localization: function of outer ear
tympanic membrane to oval window (including 3 ossicles): Middle Ear
Conversion of sound from pressure variations to mechanical vibrations; amplification (lever action and decrease in surface area); acoustic reflex (stapedius m.); pressure equalizing.: function of middle ear
fluid filled space (coiled) with access to middle ear via oval and round windows: inner ear aka cochlea
Pressure variations in fluid cause vibration of basilar membrane (more depending on frequency: basal end --> high frequency; apex end --> low frequency).: explain the effect of pressure of fluid in the ear
contains hair cells and support cells: organ of Corti
contact b/w tectorial membrane and hair cells causes nerve fiber stimulation: what does contact with tectorial membrane cause?
Different hair cells (& nerve fibers) for different frequencies, depending on place (also in cortex): explain about hair cells and frequencies
when is hearing less sensitive to small changes in frequency or amplitude?: at higher frequencies or amplitudes
Hearing becomes habituated to a steady sound, and is more sensitive to dynamic (changing, varying) sounds.: how is hearing affected by steady and dynamic sounds?
Frequency and amplitude scales for hearing are (approximately) __________: logarithmic
The higher the frequency or amplitude, what is needed to make a change audible?: The higher the frequency or amplitude, the larger a change in frequency or amplitude needs to be in order to be audible
Special frequency scale: Bark
Special amplitude scale: dB
how do spectrograms display amplitude and frequency?: Spectrograms do display amplitude in dB, but usually do not display frequency in Bark.
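The two scales on the cards above can be sketched numerically. This is a minimal illustration, not something from the lecture: the Bark conversion uses one published approximation (Traunmüller's formula), and the function names are made up for this example.

```python
import math


def hz_to_bark(freq_hz: float) -> float:
    """Approximate Bark value for a frequency in Hz (Traunmueller's formula)."""
    return 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53


def amplitude_ratio_to_db(ratio: float) -> float:
    """Convert an amplitude (pressure) ratio to decibels."""
    return 20.0 * math.log10(ratio)


if __name__ == "__main__":
    # The same 100 Hz step, once at a low and once at a high frequency.
    low_step = hz_to_bark(300.0) - hz_to_bark(200.0)
    high_step = hz_to_bark(5100.0) - hz_to_bark(5000.0)
    print(f"200 -> 300 Hz spans {low_step:.2f} Bark")
    print(f"5000 -> 5100 Hz spans {high_step:.2f} Bark")
    print(f"Doubling amplitude adds {amplitude_ratio_to_db(2.0):.1f} dB")
```

The same 100 Hz step covers far more of the Bark scale at low frequencies than at high ones, which matches the card about larger changes being needed at higher frequencies.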
Non-linear frequency in hearing comes in part from the structure of the: basilar membrane
Range of hearing is 20-20,000 Hz. About 1/3 of the basilar membrane for the lowest 1000 Hz of hearing (or 5% of range), apex to 3rd cochlear turn. Remaining 2/3 of the basilar membrane for 1000-20,000 Hz (95% of range): explain about hearing and the basilar membrane
Also, hair cells tend to be less densely distributed at the basal end.: how are hair cells distributed on basal end of basilar membrane?
While the dB scale does approximate nonlinearity in perception of amplitude, it does not reflect the differential sensitivity of the ear at different frequencies: what does and doesn't the dB scale do?
the ear canal amplifies sounds in the: 3000-5000 Hz range
what do spectrograms do to reflect the differential sensitivity of the ear at different frequencies?: use "pre-emphasis", raising the amplitude by 6 dB/octave, to reflect this sensitivity somewhat (not specific enough)
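Pre-emphasis is commonly implemented as a first-order difference filter, which boosts higher frequencies by roughly 6 dB per octave. The sketch below assumes that implementation. The coefficient 0.97 is a typical choice, not a value from the deck, and the function name is hypothetical.

```python
def pre_emphasize(samples, coeff=0.97):
    """First-order pre-emphasis: y[n] = x[n] - coeff * x[n-1].

    coeff=0.97 is an assumed, commonly used value.
    """
    out = []
    prev = 0.0  # treat the sample before the signal start as silence
    for x in samples:
        out.append(x - coeff * prev)
        prev = x
    return out


if __name__ == "__main__":
    steady = [1.0, 1.0, 1.0, 1.0]        # lowest possible frequency (no change)
    alternating = [1.0, -1.0, 1.0, -1.0]  # highest representable frequency
    print(pre_emphasize(steady))
    print(pre_emphasize(alternating))
```

After the first sample, the steady signal is almost cancelled while the rapidly alternating one is nearly doubled, which is the low-cut, high-boost behavior the card describes.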
Hearing is based on the firing of auditory nerves, which can habituate: habituation
After a nerve is fired, its action potential is depleted and it is more difficult to make it fire again. As a result, the neural response to a steady, unchanging stimulus diminishes in strength over time. The stapedius mus: explain habituation
the most useful parts of the spectrogram: long, steady state portions like stressed
vowels and strident fricatives
know Spectrograms vs. Cochleagrams: know Spectrograms vs. Cochleagrams
Computer simulations of the function of the cochlea: cochleagram
reflects the actual output of the auditory nerve to the brain better than a spectrogram does: what do cochleagrams reflect?
which are easier to derive cochleagram or spectrogram?: spectrogram
shown researchers what acoustic features are characteristic of certain categories of sounds: acoustic cues
the regularities that have been shown to actually be used by listeners: acoustic cues
are regularities the same for everyone?: NO!
cues don't vary over a continuum in real speech (people don't produce "in between" sounds): problem with studying acoustic cues
One (or more) acoustic properties are varied in steps from what is typical for one phoneme to what is typical for another phoneme: continuum
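A continuum like the one on the card above can be sketched as equal steps between two endpoint values. The endpoint numbers and the function name here are hypothetical, chosen only to show the idea.

```python
def make_continuum(start_value, end_value, n_steps):
    """Return n_steps equally spaced values from start_value to end_value."""
    if n_steps < 2:
        raise ValueError("a continuum needs at least two steps")
    step = (end_value - start_value) / (n_steps - 1)
    return [start_value + i * step for i in range(n_steps)]


if __name__ == "__main__":
    # Hypothetical F2 onset values (Hz) for a 5-step synthetic continuum.
    for number, f2_onset in enumerate(make_continuum(1000.0, 2000.0, 5), start=1):
        print(f"stimulus {number}: F2 onset = {f2_onset:.0f} Hz")
```

Each stimulus differs from its neighbor by the same physical amount, which is what makes unequal discrimination across the continuum informative.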
syllables are presented one at a time; listeners must decide which sound they heard from among a small number of alternatives: identification
sounds are presented in pairs; listeners must decide whether sounds are same or different.: discrimination
slide 20 acoustic cues through 22...in hearing and speech perception lecture: slide 20 acoustic cues through 22...in hearing and speech perception lecture
Some cues were clearly found to be more important than others for some phonemes: primary cues
most potential cues have been found to be useful if others not available: secondary cues
computer processing of speech: Speech processing
applications of speech processing (programs and devices for many purposes): Speech technology
producing intelligible speech via commands to a machine: Speech synthesis
identifying phonemes or words via machine: Automatic speech recognition
take written text and convert to speech that is easily recognized by listener: purpose of speech synthesis
Text-to-speech: TTS
• morphology, syntax & prosody (affect how words are spoken: stress, phrasing, etc.) • print to phonetic symbols (spelling rules) • phonetic symbols to acoustic productions (acoustic cues & coarticulation effects): what are the major tasks of speech synthesis?
uses source-filter theory of speech production to create a source sound and filters that can be changed to create desired acoustic output. - Rules for individual phonemes - Rules for phoneme to phoneme transitions (coarti: explain formant synthesis by rule
Small storage needs (computer program) for any number of voices (pos) • Requires a lot of background knowledge (neg) - Must develop rules for each phoneme and transition to every other possible neighboring phoneme (neg) - M: what are the pros and cons of formant synthesis?
uses natural speech segmented at areas of "less variability" including diphone and demisyllable: Concatenative synthesis
phoneme center to phoneme center: diphone
syllable onset to nucleus or nucleus to end: demisyllable
- Must store every possible combination as a separate file for each voice used (neg). - Prosody may be too unvarying / breaks (neg). - Hard to speed up appropriately (neg). - Relatively easy to create (pos).: what are the pros and cons of concatenative synthesis?
speech synthesis application for the speech impaired (autistic, dysarthric, etc.): AAC (augmentative & alternative communication)
why do blind users like formant synthesis?: formant synthesis is better because they like rates of up to 600 wpm
type of speech synthesis for blind: Screen readers
Voice response systems (phone/car/etc.) • Other automated, repetitive tasks (weather reader) • Toys (Speak-n-Spell) • Create stimuli for research. are examples of: speech synthesis applications
Alternative approach to formant synthesis. Parameters are based on acoustic consequences of articulatory positions.: Articulatory synthesis
• Impossible combinations are not allowed, unlike for formant synthesis (pos) • Not enough knowledge for it to work well yet (need more imaging of tongue, etc. and mapping to acoustic outputs).: what are pros and cons of articulatory synthesis?
Use of computer program to take acoustic input and identify words/phonemes.: Automatic Speech Recognition
Different from speech understanding: Automatic Speech Recognition
- Digitize speech input - Identify acoustic features in input (may correspond to different phonemes). - Select word/phoneme with most matching features.: major steps of automatic speech recognition
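The last step on this card, picking the candidate with the most matching features, can be sketched as a toy lookup. The feature labels and templates below are hypothetical stand-ins. A real recognizer would work from measured acoustic features, and the digitizing and feature-extraction steps are not shown.

```python
# Hypothetical feature templates, for illustration only.
TEMPLATES = {
    "b": {"voiced", "stop", "labial"},
    "d": {"voiced", "stop", "alveolar"},
    "s": {"voiceless", "fricative", "alveolar"},
}


def recognize(observed_features):
    """Return the phoneme whose template shares the most features with the input.

    Ties go to whichever candidate comes first in TEMPLATES.
    """
    return max(TEMPLATES, key=lambda ph: len(TEMPLATES[ph] & set(observed_features)))


if __name__ == "__main__":
    print(recognize({"voiced", "stop", "alveolar"}))  # all three features match "d"
    print(recognize({"voiceless", "fricative"}))      # partial input still selects "s"
```

Because the choice is simply "most features in common," a noisy or incomplete input still gets a best guess, which is one reason variability across talkers and contexts is hard for such systems.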
Requires all our knowledge about how listeners identify speech sounds. Still not as good as human listeners.: automatic speech recognition
• Easier if words are separated slightly so that system knows where they are (human listener doesn't need this). • Variability is a big challenge: must recognize "same" sounds in different contexts/different talkers as: what are some Speech Recognizer Issues?
words must be separated by 500 ms or more: isolated
words must be separated by only short pauses: connected
no pauses needed; accepts normal conversational speech.: continuous
another word for connected: Dragon system
200 words or less: small vocab
200-1000 words: large vocab
1000 + words to 20,000 words: very large vocab
needs to be trained for each new talker: speaker dependent like cell phone
can recognize any talker (constrained usually by dialect, voice quality). Much harder, esp. for high accuracy.: speaker independent
phone system (needs speaker independent, but small vocab. often okay); menu systems: Voice response systems
typing by voice for mobility challenged or to avoid overuse injuries (Dragon NaturallySpeaking). Also for hearing impaired.: Speech to text
decides whether talker achieved goal or not.: Computer-Based Speech Training Aids
drive system selected: goals and population
designed to be used in conjunction with a speech pathologist. - Small vocabulary, speaker dependent - Speech of children with speech delay is too variable for speaker independent. - Needs multiple goals to avoid frustration: explain ISTRA
speaker independent; language specific.: hearsay
Want to understand what speech cues listeners use and whether different groups use them differently. - Understand how language is processed/learned by normal listeners. - Understand differences/disorders. - Create bett: purpose of speech perception experiments
- Type of experiment (identification or discrimination) - Type of stimuli (synthesized or natural).: How to study speech perception?
• Specify what formant frequency values should be (either unchanging or must specify each point in time). • Source is created, goes through filters, output is a file. Create a new file for each stimulus.: how to use Synthetic Speech for Research?
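The "source goes through filters" idea on this card can be sketched with an impulse-train source passed through two-pole resonators, one per formant. This is a bare-bones illustration using the standard digital resonator equations, not the software used in the course. The formant values, sample rate, and function names are assumptions, and writing the result to a file is left out.

```python
import math


def resonator_coeffs(freq_hz, bandwidth_hz, sample_rate):
    """Coefficients for y[n] = a*x[n] + b*y[n-1] + c*y[n-2] (unity gain at 0 Hz)."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c
    return a, b, c


def impulse_train(f0_hz, duration_s, sample_rate):
    """Very simple voiced source: one impulse per pitch period."""
    period = int(round(sample_rate / f0_hz))
    n_samples = int(round(duration_s * sample_rate))
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]


def apply_resonator(signal, freq_hz, bandwidth_hz, sample_rate):
    """Filter the signal through one formant resonator."""
    a, b, c = resonator_coeffs(freq_hz, bandwidth_hz, sample_rate)
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize_vowel(formants, f0_hz=100.0, duration_s=0.3, sample_rate=10000):
    """Cascade one resonator per (frequency, bandwidth) pair over the source."""
    signal = impulse_train(f0_hz, duration_s, sample_rate)
    for freq_hz, bandwidth_hz in formants:
        signal = apply_resonator(signal, freq_hz, bandwidth_hz, sample_rate)
    return signal


def magnitude_at(signal, freq_hz, sample_rate):
    """Magnitude of the signal's spectrum at a single frequency (direct DFT)."""
    w = 2.0 * math.pi * freq_hz / sample_rate
    re = sum(s * math.cos(w * n) for n, s in enumerate(signal))
    im = sum(s * math.sin(w * n) for n, s in enumerate(signal))
    return math.hypot(re, im)


if __name__ == "__main__":
    # Hypothetical unchanging formant values (Hz) with assumed bandwidths.
    wave = synthesize_vowel([(700.0, 60.0), (1200.0, 90.0)])
    print(len(wave), "samples synthesized")
```

Changing the formant list and rerunning gives a new stimulus each time, which mirrors the "create a new file for each stimulus" workflow on the card.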
• Observed: CV and VC formant transitions vary depending on place of articulation of stops. • Question: Do listeners use it? • Stimuli: vary onset of F2 for transition from consonant to vowel (CV) from steeply rising (: explain Consonant place of articulation experiments
Identification: Play each stimulus: three choices (bae/dae/gae). • Analyze data: typically people are very sure for most stimuli (high percent identification). 1 or 2 stimuli at 50% • This means steep identification functio: explain Place of articulation experiments
Category boundary: where the function of identification is at 50%.
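Finding the category boundary from identification data can be sketched as locating where the identification function crosses 50%, interpolating between neighboring stimuli when no stimulus sits exactly at 50%. The percentages below are made-up illustration data, and the function name is hypothetical.

```python
def category_boundary(percent_identified):
    """Return the stimulus number (1-based, possibly fractional) at the 50% crossing.

    percent_identified[i] is the percent of responses naming the first
    category for stimulus i + 1. Returns None if the function never crosses 50%.
    """
    for i in range(len(percent_identified) - 1):
        left = percent_identified[i]
        right = percent_identified[i + 1]
        crosses = (left - 50.0) * (right - 50.0) <= 0.0
        if crosses and left != right:
            # Linear interpolation between stimulus i + 1 and stimulus i + 2.
            return (i + 1) + (left - 50.0) / (left - right)
    return None


if __name__ == "__main__":
    # Made-up percent "b" responses along a 7-step continuum.
    steep = [100.0, 98.0, 95.0, 50.0, 5.0, 2.0, 0.0]
    print("boundary at stimulus", category_boundary(steep))
```

With a steep function like this one, only one or two stimuli fall near 50%, so the boundary is sharply located.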
play pairs of stimuli. Some two steps apart; some exactly the same. Task is to say same or different.: Discrimination
when you get typical results for place articulation experiments: good discrimination only when two stimuli identified as different phonemes; a baffling result b/c physical differences are equal across all steps
The combination of steep identification functions AND good discrimination only at category boundaries: categorical perception
• Create a continuum by varying both F1 and F2 to go from /I/ to /E/ to /ae/. - Same type of id. and discrimination tasks as for consonants, but very different results - identification function NOT steeply sloping - Fairl: explain vowel experiments
way to explain categorical perception: Motor Theory of Speech Perception
we identify phonemes through access to the underlying motor gestures that produced them, not directly through acoustic features (innate and special for humans).: Motor Theory of Speech Perception
invariance exists (just need to do more research) and speech system developed on existing auditory sensitivities.: Alternative Theory: Acoustic Invariance
Support: infants apparently born with categorical perception. • Support: sort of avoids problem of variability (gestures are consistent?). • Problem: some non-human animals seem to have categorical perception. • Altern: alternative to motor theory
steeper slope: stop
gentler slope: vowel
are things that can change a lot: diphthongs
perception drives production --> phonemes --> word: perception drives production -->
uses natural speech and needs more storage space: concatenative synthesis
steeper the slope the: faster articulators moving
computer speech programming: formant synthesis
very unnatural sounding: isolated
alters input: cochlea
doesn't drop dramatically: pitch
relates to pronunciation and how it changes at syllable boundaries: juncture
smallest bone in the human body: stapes
fewer HCs designated to higher frequencies than low: explain nonlinearity in hearing
f3 and f4 on coch merge together; see f1, f2-f4 together, then f5: what is a big difference b/w coch and spectro?
what is poor for formant synthesis?: fricatives and stops
___ dB loss from ossicles missing: 30 dB loss
____ dB loss of SA from oval window to TM: 25 dB loss
____ dB loss b/c of stapedius: 5 dB loss
TM = 0.85 cm², OW = 0.03 cm²: surface area of TM and OW
higher SA, lower pressure; lower SA, higher pressure: relationship b/w SA and P
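The surface-area cards above follow from pressure = force / area: the same force on a smaller area gives a higher pressure. A minimal sketch of that arithmetic uses the two areas from the deck. The function name is hypothetical, and the calculation assumes an idealized case in which all force on the tympanic membrane reaches the oval window.

```python
import math

TM_AREA_CM2 = 0.85  # tympanic membrane area, from the deck
OW_AREA_CM2 = 0.03  # oval window area, from the deck


def pressure_gain_db(large_area, small_area):
    """Pressure gain in dB when a fixed force moves from a large to a small area.

    Pressure scales inversely with area, and dB for pressure uses 20 * log10.
    """
    return 20.0 * math.log10(large_area / small_area)


if __name__ == "__main__":
    ratio = TM_AREA_CM2 / OW_AREA_CM2
    print(f"area ratio: {ratio:.1f}")
    print(f"idealized pressure gain: {pressure_gain_db(TM_AREA_CM2, OW_AREA_CM2):.1f} dB")
```

The idealized figure from these two areas lands in the same range as, but not exactly on, the 25 dB value on the card, so treat the card's number as the one to memorize for the exam.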

Deck Info

Number of cards 159