Psych 120B exam 2 material
Terms
- Realism
- The external world exists, or there is a real world to sense.
- Positivism
- All we really have to go on is our senses, so the world could be nothing more than an elaborate hallucination.
- Euclidean
- Real-world geometry
- non-Euclidean
- Non-real-world geometry, such as that within retinal projections
- Why have two eyes?
-
Fundamentally: You can lose one and still be able to see.
Also:
-See more of the world
-Binocular vision (depth)
- Binocular summation
- An advantage in detecting a stimulus that is afforded by having two eyes.
- Binocular disparity
- The differences between the two retinal images of the same scene.
- Stereopsis
- The ability to use binocular disparity as a cue for depth, and the impression of three-dimensionality--of objects popping out in depth.
- Occlusion
- A cue to relative depth order when, for example, one object obstructs the view of another object.
- Nonmetrical depth cue
- A depth cue that provides information about the depth order (relative depth) but not the depth magnitude (e.g., his nose is in front of his face)
- Metrical depth cue
- A depth cue that provides quantitative information about distance in the third dimension
- Relative size
- A comparison of size between items without knowing the absolute size of either one
- Texture gradient
- A depth cue based on the geometric fact that items of the same size form smaller images when they are farther away.
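The geometric fact behind this cue can be checked numerically. The following is a minimal sketch, assuming the standard visual-angle formula; the function name `visual_angle_deg` and the 1 m tile size are illustrative choices, not part of the notes.

```python
import math

def visual_angle_deg(size, distance):
    """Visual angle (in degrees) subtended by an object of the given
    physical size viewed at the given distance (same units for both)."""
    return math.degrees(2 * math.atan(size / (2 * distance)))

# Identical 1 m tiles placed farther and farther away on the ground:
# the image of each tile shrinks as distance grows.
for distance in (2, 4, 8, 16):
    print(distance, round(visual_angle_deg(1.0, distance), 2))
```

Doubling the distance roughly halves the visual angle, which is why a field of equal-sized elements forms a gradient of progressively smaller images.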
- Relative height
-
A depth cue based on the observation that objects at different distances from the viewer on the ground plane will form images at different heights in the retinal image.
Objects farther away will be seen as higher in the image.
- Horopter
-
The horopter refers to sets of points in the world having identical binocular disparities.
Objects that fall on the semicircular set of horopter points project images to corresponding retinal points.
- Crossed disparity
- Crossed disparity indicates that a point is nearer to the observer than the point being fixated.
- Uncrossed disparity
- Uncrossed disparity indicates that a point is farther from the observer than the point being fixated.
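The crossed/uncrossed distinction can be sketched for points straight ahead of the observer. This is an illustration only: it assumes both points lie on the midline, uses an assumed eye separation of 6.5 cm, and the names `vergence_angle` and `classify_disparity` are made up for the example.

```python
import math

INTEROCULAR_M = 0.065  # assumed eye separation in meters (illustrative value)

def vergence_angle(distance_m, interocular_m=INTEROCULAR_M):
    """Angle (radians) between the two lines of sight to a midline point."""
    return 2 * math.atan(interocular_m / (2 * distance_m))

def classify_disparity(object_m, fixation_m):
    """Compare a midline point with the fixated point.
    A larger vergence angle than fixation means the point is nearer (crossed)."""
    disparity = vergence_angle(object_m) - vergence_angle(fixation_m)
    if math.isclose(disparity, 0.0, abs_tol=1e-12):
        return "zero"
    return "crossed" if disparity > 0 else "uncrossed"

print(classify_disparity(0.5, 1.0))  # point nearer than fixation
print(classify_disparity(2.0, 1.0))  # point farther than fixation
```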
- Familiar size
- A depth cue based on knowledge of the typical size of objects (e.g., humans)
- Relative metrical depth cue
- A depth cue that could specify, for example, that object A was twice as far away as object B without providing information about the absolute distance to either A or B
- Absolute metrical depth cue
- A cue to depth that provides absolute information about the distance in the third dimension (e.g., his nose sticks out 4cm in front of his face)
- Aerial perspective
- Also called haze, it is a depth cue that is based on the implicit understanding that light is scattered by the atmosphere
- Gestalt
-
From the German word referring to the "whole."
In perception, the name of the school of thought that stressed that the perceptual whole is greater than the apparent sum of its parts.
- Gestalt grouping rules
- A set of rules describing which elements in an image will appear to group together.
- Good Continuation
- Rule stating that two elements will tend to group together if they seem to lie on the same contour.
- Illusory contour
- A contour that is perceived even though nothing changes from one side of the contour to the other in the image.
- Structuralism
- A school of thought that held that complex objects or perceptions could be understood by analysis of the components
- Texture segmentation
- Carving an image into areas of common texture properties
- Similarity (gestalt)
- The tendency of two features to group together will increase as the similarity between them increases.
- Proximity
- The tendency of two features to group together will increase as the distance between them decreases.
- Symmetry
- A rule for figure-ground assignment stating that symmetrical regions are more likely to be seen as figure.
- Parallelism
- A rule for figure-ground assignment stating that parallel contours are more likely to belong to the same figure.
- Common region
- Two features will tend to group together if they appear to be part of the same larger region.
- Connectedness
- Two items will tend to group together if they are connected.
- Common fate
- Two features tend to group together if they are doing the same thing (e.g., moving together)
- Synchrony
- items that change at the same time tend to group together, even if they change in different ways.
- Accidental viewpoint
- A viewing position that produces some regularity in the visual image that is not present in the world.
- Figure-ground assignment
- The process of determining that some regions of an image belong to a foreground object and that other regions are part of the background.
- Surroundness
- A rule for figure-ground assignment stating that if one region is completely surrounded by another, it is likely that the surrounded region is the figure.
- Relatability
- The degree to which two line segments appear to be part of the same contour.
- Heuristic
- A mental shortcut
- Nonaccidental feature
-
A feature of an object that is not dependent on the exact viewing position of the observer.
Examples include the Y, T, or arrow junctions present when boxes overlap.
- What is an object?
-
Objects are the basic units in our representations of the world.
Object perception tells where the physical world breaks apart.
Different from other representations of a scene
Example: Pixel representations in computer graphics and image processing do not "know about" objects.
Object perception tells us how the physical world breaks apart, how it is organized, and how it will function.
Vision tells us these things at a distance.
- Efficiency of Object Perception
-
1) It accurately tells how the physical world breaks apart.
2) It works at a distance.
3) It works fast.
In experiments with rapid serial visual presentation, we can see many objects per second (even at 50-100 msec each).
- Obstacles / mysteries in object perception
-
1) Segmentation: How is the visible world broken up into separate objects?
2) Grouping / Unit Formation: How do separate visible regions get connected into objects?
Problem of occlusion
3) How do we recover 3D shape from particular views?
4) How do we describe and represent shape?
- Distinguishing perception and recognition
-
Object perception involves obtaining a description of shape, size, material composition, etc. from light information.
Object recognition is the matching of some description obtained by perception with something previously stored in memory.
- Recognition
-
The previously stored information can be a category, such as "chair"
...or an instance, such as "my favorite lounge chair with the big cushions."
Most research on object recognition involves categorization (so-called "basic level" recognition).
All recognition research presupposes some description obtained from perception.
- Object Perception Tasks
-
A. General: perception vs. recognition.
B. Edge detection
C. Edge classification
D. Junction detection and classification
E. Boundary assignment
F. Unit formation
G. Shape perception
- Edge Detection
-
Models of edge detection use operators that are applied across large regions of the visual field.
Important edges are given by differences in:
• luminance
• color
• motion
• depth
• texture
- Edge Classification
-
Edges come in different flavors.
The visual system wants to know about objects.
Illumination:
Shadows
vs. reflectance edges:
Occluding edges
Surface markings
- Junction detection and classification
-
Occluding edges usually have "T" junctions
Transparency signalled by "X" junctions
Object corners have "L" junctions
"Y" junctions indicate 3D object corners.
Presence or absence of junctions determines important aspects of visual processing.
Segmentation and grouping: what gets separated or connected.
"Rounded" junction valleys indicate connectedness
- Boundary Assignment
- Figure ground tasks such as the vase-faces illusion
- "Transposition"
-
Changing the constituent elements but retaining the same form
- Unit Formation / Object Formation
- Also called: Segmentation and Grouping
- Max Wertheimer
-
Wertheimer criticized the current educational emphasis on traditional logic and association, arguing that such problem-solving processes as grouping and reorganization, which dealt with problems as structural wholes, were not recognized in logic but were important techniques in human thinking. Related to this argument was Wertheimer's concept of Prägnanz ("precision") in organization; when things are grasped as wholes, the minimal amount of energy is exerted in thinking. To Wertheimer, truth was determined by the entire structure of experience rather than by individual sensations or perceptions.
Early 20th century theorists, such as Kurt Koffka, Max Wertheimer, and Wolfgang Köhler (students of Carl Stumpf) saw objects as perceived within an environment according to all of their elements taken together as a global construct. This 'gestalt' or 'whole form' approach sought to define principles of perception -- seemingly innate mental laws which determined the way in which objects were perceived.
- Problems in Object Perception: Unit Formation
-
The visual system connects spatially separated visible areas using two processes:
contour interpolation
-Illusory Contours
-Occluded Contours
-Transparency
-Self-splitting objects
surface interpolation
- Contour Interpolation
-
The process of contour interpolation follows a definite geometry, involving particular image features and relations
1) The process begins with the locating of contour junctions.
2) Interpolated edges begin and end at these junctions.
3) Contour interpolation follows a smoothness constraint, known as contour relatability.
4) Relatability is related to the Gestalt idea of good continuation.
- Contour interpolation phenomena
-
A number of contour interpolation phenomena depend on a common process
-The Petter Effect
-Hybrid occluded/illusory contours
- Surface Interpolation
-
Contour interpolation operates despite some differences in surface color.
Surface interpolation depends on surface color relations.
1) The contour interpolation process depends on oriented edges leading into junctions.
2) The surface process complements contour interpolation.
3) Surface properties "spread" under occlusion within real and interpolated boundaries.
4) This process depends crucially on matches of color, lightness, and texture.
- Shape
-
Shape is a relational notion
-relations between visible or imagined parts
Shape cannot be gotten from the "sum of the parts"
Shape shows scale invariance and orientational invariance
Shape representations are mysterious
--cannot be simply the collection of local oriented units
Contour shapes switch boundary assignment all at once
- Shape representations: decomposition into smaller units
-
Volumetric primitives -- "geons"
--works best for artifacts (human-made objects)
- Modal and Amodal completion
-
Amodal completion:
A form of visual completion that occurs when portions of an object are hidden behind another object, but the former object is nevertheless perceived to be a single continuous entity. This is known as amodal completion because, despite the vivid percept of object unity, observers do not actually see a contour (i.e., a contrast border) in image regions where the completion occurs.
Modal completion:
A second form of completion that occurs when portions of an object are camouflaged by an underlying surface, because this underlying surface happens to project the same luminance and color as the nearer object. This form of completion is known as modal completion because observers perceive a contrast border (an illusory contour) in image regions that contain no contrast (thus, an observer's percept has the same "mode" as if a contour were actually present).
- Importance of Spatial Perception
-
Mobile organisms need to:
-- know where things are.
-- know whether locomotion is safe.
-- guide action appropriately in their spatial environments.
These tasks depend on comprehending and representing three-dimensional (3D) space.
Different dimensions present different challenges perceptually.
- SIZE CONSTANCY
-
As a person walks away, their image size on our retinas shrinks. Yet we do not see the person as getting smaller.
- Depth and Distance
-
Depth: Relative position from observer (nearer/farther)
Distance: Absolute position given using some kind of metric or scale.
-Not necessarily a scale like feet or inches
-Perception may be "body-scaled" (arm's length, step size, etc.)
- Sources of Information for Depth and Distance
-
Kinematic Information [motion]
Stereoscopic Information
Oculomotor Information
Pictorial Information [monocular]
- Kinematic Information
-
Definition: Kinematic means relating to motion.
Motion perspective: When the observer moves, displacement of an object's image on the eye depends on its distance.
When the whole visual field is considered, this information is often called "optic flow."
Optical expansion/contraction: When an object approaches, its image expands. If it is on a "hit" path, the expansion is symmetric.
Accretion/deletion of texture: When a surface moves relative to another, the nearer surface progressively occludes background texture on the farther surface.
- Simple algorithm for Kinematic edge detection
-
Compute a new value at each point by subtracting the value there from the average of its neighbors.
This computation produces a map of significant object and surface edges in the visual field. Edges are marked by non-zero values.
- Stereoscopic Information
-
Definition: Stereoscopic means using the two eyes together.
Binocular disparity refers to differences in the two eyes' views of an object.
The direction and amount of binocular disparity depend on the distance of an object from the observer.
The two retinal images of a three-dimensional world are not the same
The horopter refers to sets of points in the world having identical binocular disparities.
Crossed disparity indicates that a point is nearer to the observer than the point being fixated.
Uncrossed disparity indicates that a point is farther from the observer than the point being fixated.
- Foveating
- Focusing on an object
- Stereoscopes and stereograms
- Use binocular disparity to create a perception of depth (stereopsis)
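One well-known kind of stereogram is the random-dot stereogram. The sketch below is an assumed, simplified construction (not taken from the notes): both eyes get the same random dots, except that a central patch is shifted sideways in one eye's image, so the only cue to the patch is binocular disparity. The function name and all parameter values are illustrative.

```python
import random

def random_dot_stereogram(width=16, height=8, shift=2, seed=0):
    """Return (left, right) images as lists of rows of 0/1 dots.
    A central patch is shifted `shift` columns leftward in the right image."""
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
    right = [row[:] for row in left]
    top, bottom = height // 4, 3 * height // 4
    x0, x1 = width // 4, 3 * width // 4
    for y in range(top, bottom):
        # Copy the patch from the left image into shifted positions on the right.
        for x in range(x0, x1):
            right[y][x - shift] = left[y][x]
        # Fill the strip uncovered by the shift with fresh random dots.
        for x in range(x1 - shift, x1):
            right[y][x] = rng.randint(0, 1)
    return left, right

left_img, right_img = random_dot_stereogram()
for l_row, r_row in zip(left_img, right_img):
    print("".join(map(str, l_row)), "".join(map(str, r_row)))
```

Neither image alone contains a visible square; the patch is defined only by the offset between the two images.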
- Oculomotor Information
-
Definition: Oculomotor means having to do with eye muscles
Accommodation refers to changes in the shape of the lens to achieve focused images at varying distances.
Accommodation may provide distance information via unconscious sensing of the muscular movements (in the ciliary muscles) that produce the lens changes.
Convergence refers to the turning of the two eyes to get a particular point in the center of fixation (fovea) of each eye.
Convergence provides depth information via unconscious sensing of the muscular movements used to turn the eyes.
Accommodation and convergence are not very useful beyond about 2 meters of distance. The range within about 2 meters is sometimes called "near space."
These cues potentially provide absolute distance information.
- Pictorial Information
-
Definition: Pictorial refers to depth cues that can operate in flat pictures. They are all also monocular cues, in that they can operate (usually better) when you view with only one eye.
Some pictorial cues were discovered by artists.
Most pictorial cues relate to rules of optics and geometry that govern the projection of the world onto the retina.
Use of pictorial cues for depth perception involves using the rules of projection in reverse.
Laws of Optics: Scene -> Retina
Inverse Optics: Retina -> Scene
- Pictorial Information: ASSUMED PHYSICAL EQUALITY
-
Many pictorial cues have a common theoretical basis.
The visual system operates as if it assumes that things whose projections to the retina are different are actually similar in the world.
The differences in the retinal projections are taken to be caused by differences in depth position in the world.
- Parallel lines
- Parallel lines that lie in a plane parallel to the image plane (a) remain parallel in the image; parallel lines in other planes (b) converge.
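Both cases can be illustrated with a simple pinhole projection. This is a sketch under assumed conventions (image coordinate = focal length times X/Z); the helper names `project`, `image_separation`, and `frontal_gap` are hypothetical.

```python
def project(X, Y, Z, focal=1.0):
    """Pinhole projection of a 3D point (Z = distance along the line of sight)."""
    return (focal * X / Z, focal * Y / Z)

def image_separation(Z, half_width=1.0):
    """Image distance between two rails at X = -half_width and X = +half_width
    that recede in depth (case b: lines not parallel to the image plane)."""
    left_x, _ = project(-half_width, -1.0, Z)
    right_x, _ = project(half_width, -1.0, Z)
    return right_x - left_x

def frontal_gap(X, Z):
    """Image distance between two horizontal lines at Y = 0 and Y = 1 lying in
    a plane at constant depth Z (case a: plane parallel to the image plane)."""
    return project(X, 1.0, Z)[1] - project(X, 0.0, Z)[1]

# Case (b): the rails' images draw together as depth increases.
print([image_separation(Z) for Z in (2.0, 4.0, 8.0)])
# Case (a): the gap is the same everywhere along the lines.
print(frontal_gap(0.0, 5.0), frontal_gap(3.0, 5.0))
```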
- Monocular depth cues
-
Parallel Lines
Texture Gradients
-texture patterns that shrink into the distance
Familiar size
Relative size (in relation to objects of similar/known size)
-Relative size is more effective when size changes systematically
Relative height
Aerial perspective
Occlusion makes it easy to infer relative position in depth
- Multiple Sources of Information: Why?
-
"God must have loved depth cues, for she made so many of them." -- A. Yonas
Some provide information about absolute position, whereas others provide information about the relations of objects and surfaces. (Distance vs. Depth)
Different sources of information have different operating conditions.
Some evidence suggests that the system relies on the cues that provide the best evidence in general or under specific conditions.
- Ecological Validity
-
Ecological validity refers to how accurately a cue specifies some situation in the environment.
Roughly speaking, one can get at ecological validity of depth cues by considering how hard it would be to arrange a situation that depicts depth according to the cue, but does not really have depth in the world.
Example: A TV show depicts 3-D environments, but the screen is actually flat.
- Ecological Validity - Highest and Lowest
- Of the 4 categories of depth / distance information, stereoscopic and kinematic have highest ecological validity and pictorial has the weakest.
- Correspondence Problem
- For stereo vision, our brain must match points on one retina to points on the other retina. This is known as the correspondence problem.
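To make the matching problem concrete, here is a minimal sketch of one generic way a computer could match a point across two one-dimensional "retinal" rows: slide a small window and keep the shift with the smallest sum of absolute differences. This is an illustrative block-matching scheme, not a claim about how the brain does it; `best_disparity` and its parameters are hypothetical.

```python
def best_disparity(left_row, right_row, x, half_window=1, max_shift=4):
    """Find the leftward shift (in samples) at which the patch centered on
    left_row[x] best matches right_row, by sum of absolute differences."""
    def patch(row, center):
        return row[center - half_window: center + half_window + 1]

    target = patch(left_row, x)
    best_shift, best_cost = 0, float("inf")
    for shift in range(max_shift + 1):
        center = x - shift
        if center - half_window < 0:
            break  # candidate window would run off the edge of the row
        cost = sum(abs(a - b) for a, b in zip(target, patch(right_row, center)))
        if cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift

left_row = [0, 0, 0, 0, 0, 9, 5, 7, 0, 0, 0, 0]
right_row = [0, 0, 0, 9, 5, 7, 0, 0, 0, 0, 0, 0]  # same feature, shifted by 2
print(best_disparity(left_row, right_row, x=6))
```

With many similar features in a row, several shifts can give low cost, which is exactly why the problem is hard.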
- Correspondence Problem-Potential Solutions
-
Align low-frequency information first
-simplifies the problem
-now we match hundreds of big blurry dots instead of thousands of small sharp dots
Uniqueness constraint
-a feature in the world appears exactly once on each retina
Continuity constraint
-neighboring points lie at similar distances (except at object boundaries)
- Prior knowledge and assumptions
-
We assume that the pennies are the same size
We assume that both pennies are circular
We assume that occlusion is more likely to produce the image than an accidental alignment
These assumptions allow us to exclude unlikely interpretations of the world
- Bayesian Approach
-
Bayes' theorem provides a rigorous mathematical method to integrate prior knowledge with the input to make an inference about the world
P(scene|image) ∝ P(image|scene) × P(scene)
P(scene|image) = posterior: probability of the scene given the image as input
P(image|scene) = likelihood: probability that the scene would produce the image
P(scene) = prior: probability of the scene before the image is seen
- Size Perception: The Task and the Information
-
Size of the retinal projection is an unreliable indicator of real size.
There is a lawful relation between real size (S), viewing distance (D), and projective size (s):
S / D = s / d
where d is a constant representing the depth of the eyeball.
The visual system can essentially solve for S, the real size, by:
S = D s / d
In this equation, s is given on the retina, and d is a constant assumed to be known by the system.
D is obtained through distance perception, as we have discussed.
From these three inputs, perceived size S can be computed.
- Size Perception: Brain Location
-
No good current understanding of where the equation gets solved
Some evidence for dorsal stream
- Holway & Boring (1941)
-
Question: does depth information contribute to the perception of size?
Idea: remove depth information. Will the size perception change?
Task: adjust the size of the comparison circle to match the size of the test circle
The test circle always covered the same visual angle, but was presented at different sizes and distances.
- Holway & Boring (1941): Depth manipulations
-
Removed depth cues in some conditions:
full cue
monocular: eliminates stereoscopic information
viewed through a hole: eliminates binocular vision (and maybe kinematic)
cover surfaces with draperies: eliminates ?
- Holway & Boring (1941): Conclusions
-
As we remove depth information, the perception of size changes! Therefore, size perception depends on depth perception
Without depth information, the observers increasingly relied on visual angle
direct vs. indirect perception
- Motion perception: Intro
-
Motion perception is incredibly sensitive and accurate
Example: Returning a 120 mph tennis serve
Ball moves 176 ft. / sec
Almost 2 feet in 1/100 sec
Need to contact ball in right place with about 3 sq. in. area of racket
- Motion is closely related to perception...
-
Complex perceptual systems are found only in organisms that guide their own motion in space
Perception and self-motion probably co-evolved
Two equally important aspects of visual motion perception are:
Seeing moving objects
Seeing and guiding one's own motion
- Sensitivity to Motion
-
We can characterize motion sensitivity in various ways:
Slowest perceptible motion
Fastest perceptible motion
Sensitivity in central and peripheral viewing
Important in driving
- Characterizing slowest perceptible motion
-
Find a velocity threshold
Could define threshold as the velocity at which motion is detected 50% of the time
Example
T-Rex wants to eat those annoying kids from Jurassic Park (I would too)
If they remain still, he won't see them, but if they move faster than his velocity threshold... kiddie toast
How exactly do we quantify T-Rex's velocity threshold?
Retinal velocity
Change in visual angle on the retina per unit time
Confusing terminology: visual angle is typically measured in "minutes" (minutes of arc, each 1/60 of a degree), not minutes of time
- Sensitivity to Motion: Scenarios
-
Scenario 1: T-Rex searches for the kid brother in an "empty field" situation
Empty field: no background references
Subject-relative motion: motion of a single visible object with no background references
Kid brother would have to move at ~10-20 arcmin/sec
Scenario 2: T-Rex searches for the older sister in a crowded kitchen
Object-relative motion: motion of a visible object relative to some other object or visible background
Older sister would have to move at ~1-2 arcmin/sec
Velocity thresholds: subject- vs object-relative motion
Subject-relative motion thresholds are about 10 times as high as object-relative motion thresholds
Bottom line: motion is best seen against a background
- Information for Motion Perception
-
There are multiple sources of information or multiple situations that lead to perceived motion
RETINAL DISPLACEMENT is the changing position of an object's image on your eye
OPTICAL PURSUIT occurs when you track a moving object with your eye; the image stays on the fovea, yet you perceive it as moving
- Real Motion
-
Refers to situations in the world in which an object is actually moving
Real motion can produce either retinal displacement OR optical pursuit
- Apparent Motion
-
Occurs when images flash on and off in separate locations with certain timing relations
Although nothing really moves between flash locations, motion is seen
Also called stroboscopic motion
Wertheimer performed a famous experiment investigating the timing relations in apparent motion
Flashed a spot of light at location X, waited, then flashed another spot of light at location Y
Varied the time interval between the two flashes of light to see what effect this had on our perception of motion
- Apparent Motion: Timing
-
Interstimulus Interval (ISI): time between the end of one flash and the start of another
When ISI < 60 ms: SIMULTANEITY, not movement, is seen
When ISI is 60-200 ms: OPTIMAL MOVEMENT is seen
Movement appears smooth and continuous
When ISI > 200 ms: SUCCESSION, not movement, is seen
- Induced Motion
-
Involves an object and a surrounding reference frame
When the surround or frame moves, the object appears to move
May also make the observer feel like he or she is moving
- General Theories of Motion Perception
-
Indirect Perception Theory
Motion is not a basic perceptual quality; it is derived from other things
Direct Perception Theory
Motion is a basic perceptual quality; your system is wired to perceive it
Exner and Wertheimer each did experiments offering support for the direct view
-DIRECT Perception Theory is correct
- Exner's Experiment
-
Looked at threshold for perceived succession
Below 45 ms, can't judge two events in succession
BUT: Can see one flash moving at intervals as low as 14 ms
- Wertheimer's Experiment: "Phi" motion
-
At ISIs around 60 ms, one sees simultaneous lights but also sees something moving between them
Two simultaneous percepts
Off and on in fixed location (Simultaneity) + Motion
"Objectless" motion
Indicates that motion perception mechanism is triggered independently of the perception of a single object from the two flashes
- What neural mechanisms allow us to perceive motion?
-
Reichardt Detectors
Basic model for motion circuits
What are basic requirements for velocity detection?
What algorithm could work?
Register: CHANGE AT LOCATION 1 (this signal is delayed on its way to the motion neuron)
... Elapsed time ...
Register: CHANGE AT LOCATION 2
If the delayed signal from location 1 and the signal from location 2 arrive at the same time, the motion neuron fires
-Direction-selective and tuned to a specific speed
- Reichardt Detectors
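The delay-and-compare idea can be sketched in a few lines. This is a simplified, assumed version of a single detector unit working on discrete time steps; the function name `reichardt_response` and the 0/1 signal encoding are illustrative choices.

```python
def reichardt_response(loc1, loc2, delay):
    """Response of one delay-and-compare unit.
    loc1, loc2: lists of 0/1 "change detected" signals over time.
    The location-1 signal is delayed by `delay` steps and multiplied with the
    current location-2 signal, so the unit responds only when they coincide."""
    total = 0
    for t in range(len(loc2)):
        if t - delay >= 0:
            total += loc1[t - delay] * loc2[t]
    return total

# Motion from location 1 to location 2, taking 2 time steps: unit responds.
print(reichardt_response([1, 0, 0, 0, 0], [0, 0, 1, 0, 0], delay=2))
# Opposite direction (location 2 first): no response.
print(reichardt_response([0, 0, 1, 0, 0], [1, 0, 0, 0, 0], delay=2))
# Right direction but wrong speed (3 steps instead of 2): no response.
print(reichardt_response([1, 0, 0, 0, 0], [0, 0, 0, 1, 0], delay=2))
```

The fixed delay is what makes the unit respond to one direction and one speed only.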
-
Arranged to detect in all different directions
Evidence suggests opponent process arrangement to determine net motion
Opponent process... think color theory (i.e., blue vs yellow)
With respect to motion, you could have, for example:
Up vs down
Left vs right
Can explain movement after-effects
Resulting motion perception is the vector sum of opposite-direction detectors
1) Perceiving motion in one direction fatigues detectors for that direction
2) Afterwards, looking at a stationary scene, there is a reduced response from fatigued detectors
- The waterfall illusion
-
For a stationary scene, there is a base level of response from each of the opposing detectors
---> <---
After exposure to a certain direction of motion, this base level of response is reduced for a fatigued detector
->
Combining their responses shows a net motion signal in the direction opposite to the original adapting motion
-> + <--- = <--
- Perception of a Stable World
-
From the standpoint of basic motion detectors (ie, Reichardt detectors) object motion and observer motion can have the same effects
Two main theories:
The COROLLARY DISCHARGE theory
The RELATIONAL VISUAL INFORMATION theory
- Corollary Discharge Theory
-
Movement perception depends on three types of signals:
Motor Signal (MS)
Sent to the eye muscles when observer moves eyes
Similar to optical pursuit
Corollary Discharge Signal (CDS)
Copy of the motor signal sent to the comparator
Image Movement Signal (IMS)
Occurs when an image stimulates receptors as it moves across the retina
Similar to retinal displacement
Signal sent to the comparator.
Perception of movement is determined by whether the CDS, IMS, or both reach a structure called the comparator
Movement is perceived when the comparator receives either the CDS or IMS individually
No movement is perceived when the comparator receives both signals at once
When these signals reach the comparator simultaneously, they cancel each other out
- Relational Visual Information Theory
-
System uses information about relations between objects
-Not concerned with eye movement
Optic array: structure created by the surfaces, textures, and contours of the environment
Non-technical definition...
Whatever scene you happen to be looking at
Local disturbances in the optic array indicate object movement
Global disturbances in the optic array indicate observer movement, and thus a stable world
- Local disturbances in the optic array
-
Occur when one object moves relative to the environment, covering and uncovering the background
Remember accretion/deletion?
Happen whether we are tracking a moving object, or whether our eyes are stationary
- Global disturbances in the optic array
- Occur when all the elements of the optic array move
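The local/global rule can be written as a toy decision procedure. This is only a sketch of the rule as stated in these notes, assuming each element of the optic array is summarized by a single (dx, dy) image displacement; the function name and labels are hypothetical.

```python
def classify_optic_array_change(displacements, tolerance=1e-9):
    """displacements: list of (dx, dy) image shifts, one per element
    of the optic array. Returns a label following the local/global rule."""
    moving = [abs(dx) > tolerance or abs(dy) > tolerance
              for dx, dy in displacements]
    if not any(moving):
        return "no change"
    if all(moving):
        return "global disturbance: observer movement"
    return "local disturbance: object movement"

# Only one element shifts: something in the world moved.
print(classify_optic_array_change([(0, 0), (0.5, 0), (0, 0)]))
# Every element shifts: the observer (or the eye) moved.
print(classify_optic_array_change([(0.2, 0), (0.5, 0), (0.1, 0)]))
```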
- Event perception
-
Perception of what things are moving, how they are interacting, and where they are
Example of "higher-order" perception:
Perception is not just about basic sensory dimensions, such as color, loudness, etc.
Perception is about higher-order relationships, such as causality.
- Dual Specification
-
Refers to the fact that events provide information both about changes in the world AND about persisting properties of the world
--phrase coined by J.J. Gibson
Examples: object form, spatial layout
- Motion Information for Form
-
A. Motion alone can indicate the 3-D form of objects.
B. Originally, this was called the kinetic depth effect.
C. More generally, these abilities are called structure-from-motion (SFM), referring to the extraction of object structure from information in moving displays
- Motion Information for Depth
-
Recall some kinematic depth cues:
motion perspective
accretion and deletion of texture
optical expansion
- Structure From Motion
-
Moving dots are often used to study SFM.
-- When stationary, the dots do not reveal the contours or surface information of objects.
-- Thus, they can be used to study the motion effects independent of surface properties.
- Rigid motion
-
Rigid motion is the geometric term describing motion of an object in space during which there are no changes in the distances between any two points on the object.
In other words, the object does not bend or deform during motion.
- Mathematical Analyses of SFM
-
Focus on Rigid Motion
-In other words, the object does not bend or deform during motion.
The Problem of Determining SFM
The problem of SFM in rigid motion is for the observer to recover 3-D form from the changing 2-D projection of an object.
A number of theorems have been proven showing that from a small number of points (3-5) moving in the 2-D image, the 3-D structure of a rigid object can be recovered.
Some have argued that the visual system will fit a rigid solution to motion displays whenever this is possible.
- Non-Rigid Motion
-
Examples:
Your hand (jointed motion)
Jellyfish (elastic motion)
Much research has looked at point-light walker displays, in which perceivers see a person walking from motion of only a few points of light (Johansson).
Sometimes this is called biological motion.
- Non-rigid Motion (Mathematical analysis of)
-
What constraints are there?
What distinguishes unified, non-rigid motion from disconnected motion of dots?
These are unsolved problems!
- Sound as Information
-
Sound provides many different kinds of information
about our environment.
SPATIAL LOCATION INFORMATION
HEARING IS ALSO SPECIALIZED FOR PROCESSING OF SPEECH
MUSIC: Another gift from the sense of hearing
- Sound as Information: SPATIAL LOCATION
-
It tells us directions of events.
It can tell us distance to a source.
-Like vision, hearing is a "distance sense"
Monitoring the environment:
Vision is better for spatial detail, but...
Hearing is special in allowing us to monitor our surroundings without special orienting
- Sound as Information: INFORMATION ABOUT SUBSTANCE AND EVENTS
-
Sound carries rich information about what things are made of and about events occurring.
Much of the information about substance and events (for example, what distinguishes metallic sounds from wooden sounds) remains poorly understood.
- Puzzling questions: What are the biological functions of musical perception and the aesthetics of music?
-
The enjoyment of music, and our abilities to process it, are among a number of aspects of human psychology and motivation that are difficult to connect to standard biological accounts of the functions of behavior and cognition (e.g., survival, reproduction).
- The Physical Stimulus for Sound
-
Physically, sound is the compression and spreading apart of air molecules.
1) This compression and rarefaction (spreading) is caused by any movement or vibrational event in a medium, such as air or water.
2) The disturbance in the medium spreads as a wave outward from the disturbance. -
The Physical Stimulus for Sound
What moves? -
Individual molecules do not move very far.
It is the wave that propagates. -
The Physical Stimulus for Sound
The Speed of Sound… -
is specific to a medium.
In air, it is about 330 m/sec (or 1100 feet/sec) - The Mathematical Description of Sound
-
All sounds can be described as combinations of simple sinusoidal waves.
This is true because of a theorem proved by the mathematician Fourier. Put simply, any complex waveform can be decomposed into a combination of simple sine waves, each having a specific frequency and amplitude.
The decomposition is unique (there is only one way to do it).
For a periodic waveform, the decomposition has the interesting property that all components have frequencies that are integer multiples of the lowest one (the fundamental). -
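To make the decomposition idea concrete, here is a minimal sketch (not part of the lecture material) that builds a complex waveform from three sine waves and then recovers each component's amplitude by projecting the waveform onto sine and cosine at that frequency. The frequencies, amplitudes, sample rate, and function names are all illustrative assumptions.

```python
import math

def synthesize(components, sample_rate, duration):
    """Sum sine waves; components is a list of (frequency_hz, amplitude, phase_rad)."""
    n = int(round(sample_rate * duration))
    return [
        sum(a * math.sin(2 * math.pi * f * (i / sample_rate) + p)
            for f, a, p in components)
        for i in range(n)
    ]

def amplitude_at(signal, frequency_hz, sample_rate):
    """Estimate one component's amplitude by projecting onto sin and cos."""
    n = len(signal)
    s = sum(x * math.sin(2 * math.pi * frequency_hz * i / sample_rate)
            for i, x in enumerate(signal))
    c = sum(x * math.cos(2 * math.pi * frequency_hz * i / sample_rate)
            for i, x in enumerate(signal))
    return 2 * math.hypot(s, c) / n

# Hypothetical complex tone: 100 Hz fundamental plus components at 200 and 300 Hz
# (integer multiples of the lowest frequency).
components = [(100.0, 1.0, 0.0), (200.0, 0.5, 0.0), (300.0, 0.25, 0.0)]
signal = synthesize(components, sample_rate=8000, duration=0.1)

recovered = [amplitude_at(signal, f, 8000) for f, _, _ in components]
absent = amplitude_at(signal, 150.0, 8000)  # a frequency that was never added
print([round(a, 3) for a in recovered], round(absent, 3))
```

The duration is chosen so that every component completes a whole number of cycles, which keeps the projection clean; with other durations the estimates would only be approximate.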
The Mathematical Description of Sound
Basic Unit -
Basic Unit: the Sine Wave
-Frequency
-Amplitude
-Phase - Simple and complex sounds
-
Sine waves: Not common everyday sounds because not many vibrations in the world are so pure
Most sounds in the world: Complex sounds (e.g., human voices, birds, cars)
All sound waves can be described as some combination of sine waves
Complex sounds can be described by Fourier analysis
A mathematical theorem by which any sound can be divided into a set of sine waves. Combining these sine waves will reproduce the original sound - ossicles
-
Amplification provided by ossicles is essential to ability to hear faint sounds
Inner ear is made up of collection of fluid-filled chambers - Middle ear muscles
-
Middle ear: Two muscles, the tensor tympani and the stapedius
Purpose: To tense when sounds are very loud, muffling pressure changes
However, the acoustic reflex follows the onset of loud sounds by about one-fifth of a second, so it cannot protect against abrupt sounds (e.g., a gunshot) - Inner ear
-
Inner ear: Fine changes in sound pressure are translated into neural signals
Function is roughly analogous to that of retina - Cochlear canals and membranes
-
Cochlea: Spiral structure of the inner ear containing the organ of Corti
Cochlea is filled with watery fluids in three parallel canals - Organ of Corti
-
Movements of cochlear partition are translated into neural signals by structures in the organ of Corti; extends along top of basilar membrane
Made up of specialized neurons called hair cells, dendrites of auditory nerve fibers that terminate at base of hair cells, and scaffold of supporting cells - translating sound waves
- Firing of auditory nerve fibers finally completes the process of translating sound waves into patterns of neural activity
- Coding of amplitude and frequency in the cochlea
- Place code: Tuning of different parts of cochlea to different frequencies, in which information about the particular frequency of incoming sound wave is coded by place along cochlear partition with greatest mechanical displacement
- Inner and outer hair cells
-
Inner hair cells: Convey almost all information about sound waves to brain
Outer hair cells: Receive information from the brain (via efferent fibers). They are involved in an elaborate feedback system - Psychological Dimensions of Hearing
-
The psychological experience of loudness is related to the physical variable of sound pressure
Pitch (psychological) is related to sound frequency (physical).
Human sensitivity to frequency ranges from about
20 - 20,000 Hz (where one Hz = one cycle per second of vibration)
-- Children have greatest range;
there is some loss at the very high end with age.
-- There is substantial loss in the high frequencies
for people who are exposed to loud noises. - Auditory Space Perception: General
-
The auditory system helps us to perceive spatial locations of events, especially their direction in space from us.
We can also perceive the distances of sounds to
some degree.
There are multiple ways in which direction is computed by auditory processes.
-- These relate to certain differences among sounds and situations, allowing us to perceive direction well through the combination of processes. - Interaural time difference (ITD)
-
The difference in time between a sound arriving at one ear versus the other
-Differences in the arrival at the two ears of an ONSET of a sound can be used to localize it.
The head's "sound shadow" blocks energy at frequencies above about 1000 Hz; this produces differences in level between the two ears (ILD) rather than differences in arrival time. - Auditory Space Perception: dependency
-
Most auditory space information depends on the fact that we have two ears in different positions.
As in stereoscopic depth in vision, the two ears work
as a system. - Azimuth
-
Used to describe locations on imaginary circle that extends around us, in a horizontal plane
Can analyze ITD:
Where would a sound source need to be located to produce maximum possible ITD? Directly to the side
What location would lead to minimum possible ITD? front and back
What would happen at intermediate locations? - Auditory Space Perception: Information from differences
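The ITD questions above (maximum to the side, minimum in front and behind, and what happens in between) can be explored with a rough sketch. It assumes a simple straight-line path-difference approximation for a distant source, ITD = d · sin(azimuth) / c, with an illustrative ear separation d of 0.18 m and the speed of sound c of 330 m/sec given in these notes; a real head also bends sound around itself, so actual values differ somewhat.

```python
import math

SPEED_OF_SOUND = 330.0   # m/sec in air, the value given in these notes
EAR_SEPARATION = 0.18    # meters; assumed, illustrative value

def itd_seconds(azimuth_deg):
    """ITD under a straight-line path-difference approximation.

    0 deg = straight ahead, 90 deg = directly to the right,
    180 deg = directly behind.
    """
    return EAR_SEPARATION * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND

for az in (0, 45, 90, 135, 180):
    print(az, "deg:", round(itd_seconds(az) * 1e6), "microseconds")
```

In this approximation the ITD grows smoothly from zero straight ahead to its maximum directly to the side, and a front location such as 45 degrees gives the same ITD as its mirror-image back location at 135 degrees.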
-
Differences in the arrival at the two ears of an ONSET of a sound can be used to localize it.
Differences in the PHASE of soundwaves arriving at the two ears can be used for localization.
Differences in the INTENSITY of soundwaves arriving at the two ears can be used for localization. - Interaural level difference (ILD)
-
The difference in level (intensity) between a sound
arriving at one ear versus the other
Sounds are more intense at the ear closer to sound source
ILD is largest at 90 degrees and –90 degrees, nonexistent for 0 degrees and 180 degrees
ILD generally correlates with angle of sound source, but correlation is not quite as great as it is with ITDs - PHASE and INTENSITY differences
-
PHASE and INTENSITY differences play a complementary role.
1) Phase differences work for low frequency sounds.
2) Intensity differences work for high frequency sounds.
3) The combination allows us to localize most sounds. - CONE OF CONFUSION
-
The time and intensity differences we have described so far are subject to the
CONE OF CONFUSION
The cone of confusion refers to a set of points in space that produce identical onset, phase, or intensity differences, due to symmetries of being in front of / behind the head, or
above / below the head (e.g., 45 degrees right/front and right/back).
HEAD MOVEMENTS can be used to resolve the cone of confusion. - CONE OF CONFUSION: Pinna
-
The shape of the PINNAE also helps resolve ambiguities in sound location, primarily for high-frequency sounds.
The pinna is the outer ear. - Auditory Distance Perception
-
How do listeners know how far away a sound is?
Simplest cue: Relative intensity of sound
Inverse-square law: As distance from a source increases, intensity decreases in proportion to the square of the distance (doubling the distance cuts intensity to one-quarter)
Spectral composition of sounds: Higher frequencies decrease in energy more than lower frequencies as sound waves travel from source to one ear
Relative amounts of direct vs. reverberant (reflected sound) energy
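As a small worked illustration of the inverse-square law listed above, the sketch below computes intensity relative to a reference distance. It assumes an idealized point source in open space with no reflections; the function name and the distances are hypothetical.

```python
def relative_intensity(distance, reference_distance=1.0):
    """Intensity at `distance` relative to the intensity at `reference_distance`.

    Assumes an idealized point source with no reflections (inverse-square law).
    """
    if distance <= 0 or reference_distance <= 0:
        raise ValueError("distances must be positive")
    return (reference_distance / distance) ** 2

for d in (1, 2, 4, 8):
    print(d, relative_intensity(d))
```

Each doubling of distance divides the intensity by four. In real rooms, reverberant energy falls off less steeply, which is one reason the direct-to-reverberant ratio is a separate cue.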