
Learning theory for animal training (Work In Progress!)

Terms

CC
Classical Conditioning
OC
Operant Conditioning
Learning/Performance distinction
The distinction between knowing and doing.

Without motivation, an animal may not perform behavior it knows.

A mouse that is not hungry won't run a maze for cheese.
Behavior vs Knowledge
You observe BEHAVIOR and infer knowledge.
Motivation
Forces that act on or within an animal to activate and direct behavior.
Four stages of learning
Acquisition

Fluency (automatic!)

Generalization (application)

Maintenance (forever)
Acquisition
First step of learning.

Animal learns a basic skill.

Animal must learn what is expected.

Trainer focuses on accuracy.
Fluency
Second step of learning.

With repetition, behavior becomes fluid, automatic.

Trainer can focus on enhancing response
Generalization
Third step of learning.

Animal learns that what is being learned is relevant in various contexts.

Generalization is rarely automatic.
Maintenance
Fourth stage of learning.

Repetition.

Forever.

Periodic reinforcement required to preserve proficiency and prevent
extinction.
Ethology
The study of animal behavior
Clever Hans
A horse who appeared to be able to do math by tapping his foot.

The horse could only solve problems the trainer could!

The trainer was raising his eyebrows.

Pfungst discovered that people can unconsciously communicate information to others by subtle movements and that some animals can perceive these unconscious movements.
Parsimony
Unless there is evidence to the contrary, you must account for a phenomenon with the simplest explanation available.
Occam's Razor
The principle states that one should not make more assumptions than the minimum needed.

See Parsimony
Conditioning
Learning
Behavior/Response
Any action that can be observed and measured.

Barking, Sitting, Lying Down, etc.
Stimulus
Any event that can be perceived by the animal.
Consequence
An action or event that occurs AFTER a behavior.

A consequence can affect how often a behavior will occur in future.
Contingency
Depends on

If XXX then YYY

If my dog lies down, I'll throw the frisbee.

Throwing the frisbee is contingent upon lying down.

If LIE DOWN then FRISBEE

FRISBEE depends on LYING DOWN
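
The "If XXX then YYY" rule above can be written as a lookup. This is only a minimal sketch: the function name `consequence` and the table `CONTINGENCIES` are hypothetical names chosen for illustration, not part of the deck.

```python
from typing import Optional

# Hypothetical contingency table: the consequence depends on the behavior.
CONTINGENCIES = {"lie down": "frisbee"}

def consequence(behavior: str) -> Optional[str]:
    """Return the consequence that is contingent on the behavior, if any."""
    # No matching behavior means no consequence: FRISBEE depends on LYING DOWN.
    return CONTINGENCIES.get(behavior)
```

Calling `consequence("lie down")` yields the frisbee, while any other behavior yields nothing.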
Performance
Actual Behavior.

What you see is what you get.

Performance does NOT imply learning
Does performance imply learning?
NO
Appetitive/Positive
Good things.

Food
Sound
Play
Feelings
Dead Meat
Prey Drive
Aversive/Negative
Bad things

Pain
Annoyance
Uncomfortable
Loud
Leash jerk
Shock
Quick Movement
Theory
An explanation
Principles
Principles are the rules outlined by a theory.
Learning Principles
The rules or laws governing learning
Classical Conditioning
(Defined)
Associative learning

Pavlov

Animal learns to associate stimulus with a response.

See a FOO get a cookie.

Receiving a cookie is NOT contingent upon seeing a FOO
Classical Conditioning
(Notation)
CS(TONE) -> UCS(FOOD) -> UCR(SALIVA)

becomes

CS(TONE) -> CR(DROOL)
UCS(FOOD) -> UCR(SALIVA)
CS
Conditioned Stimulus

This is the stimulus that brings on a particular response after being paired with an unconditioned stimulus. The flashing light played this role in the experiment. It had an important effect on the dog's behaviour, but only under a specific condition: it had been paired temporally with the tasting of food.
UCS
Unconditioned Stimulus

This is a stimulus that automatically elicits an unconditioned response. Pavlov's experiment had food as an unconditioned stimulus.
UCR
Unconditioned Response

It is the automatic response to an unconditioned stimulus. An example of this is the automatic salivation of the dog in response to the food.
CR
Conditioned Response

This refers to a response that the conditioned stimulus elicits, but only because it has previously been paired with the unconditioned stimulus. An example is the salivation of the dog in response to the light.
CS presentation
BEFORE the UCS
In classical conditioning, is CR required when the UCS is presented?
No.
Train a dog to sneeze, growl, snarl or dig
Use Classical Conditioning:

Find something that makes behavior occur and precede it with a neutral stimulus
Neutral Stimulus
Sometimes called orienting stimulus.

A stimulus that does not elicit the response of interest. It is neutral because it does not elicit the Unconditioned (or reflexive) Response.
Orienting Stimulus
Dog pays attention when presented, but does not yet mean anything.

Sometimes called Neutral Stimulus
Operant Conditioning
Instrumental Learning

An animal learns that behavior has consequences.

Things happen because we do things.


If you cook you eat.

Rolling over feels good.

Sitting when trainer says 'sit' gets a cookie.

Turn on the cookie machine.
Operant Conditioning
(notation)
Sd -> R -> S
Sd
Discriminative Stimulus

The context in which a response will grant a consequence.

Aside from an S(d), the events that occur are under the animal's control.
Thorndike Law of Effect
If a consequence is pleasant the preceding behavior becomes more likely. If a consequence is unpleasant, the preceding behavior becomes less likely.
A-B-C
Antecedent
Behavior
Consequence
Positive Reinforcement
R+

Present Something Good
Behavior More Likely
Positive Punishment
P+

Present Something Bad
Behavior Less Likely
Negative Punishment
P-

Take Away Something Good
Behavior is Less Likely
Negative Reinforcement
R-

Take Away Something Bad
Behavior is More Likely
Four consequences
(the final S in Sd -> R -> S)
R+ Positive Reinforcement
P+ Positive Punishment
R- Negative Reinforcement
P- Negative Punishment
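
The four consequences can be read off two yes/no questions: is the thing good or bad, and is it presented or taken away? The sketch below encodes that table; the function and parameter names are assumptions made for illustration.

```python
def classify_consequence(thing_is_good: bool, thing_is_presented: bool) -> str:
    """Map a consequence onto one of the four quadrants."""
    if thing_is_presented:
        # Presenting something is "positive" in the arithmetic sense.
        return "R+" if thing_is_good else "P+"
    # Taking something away is "negative" in the arithmetic sense.
    return "P-" if thing_is_good else "R-"

def behavior_becomes(quadrant: str) -> str:
    """Reinforcement makes behavior more likely, punishment less likely."""
    return "more likely" if quadrant.startswith("R") else "less likely"
```

For example, taking away something bad gives `R-`, and the behavior becomes more likely.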
Positive Reinforcement
(Example)
I get a flashcard right, I get a raisin.
Negative Reinforcement
(Example)
Do homework to avoid nagging.

Work late to avoid housework.

Dog heels to avoid yanking
Positive Punishment
(Example)
Pee on floor get hit

Drink and drive go to jail

Stop paying attention and dog bites
Negative Punishment
(Example)
Time out. (TO)

Dog plays rough. Play stops.

Drink/Drive. Lose License.
TO
Time Out
Reinforcement makes behavior (more or less) likely.
More
Punishment makes behavior (more or less) likely.
Less
Distinction between classical and operant behavior
Classical: UCS presented regardless of what animal does.

Operant: Some behavior or response is required for consequence.
Pizza example, CC vs OC
No matter how much money you get paid to NOT eat a pizza, you will not be able to stop drooling when you see and smell the pizza.

CC responses are involuntary or reflexive.
When CC are at odds with OC, who wins?
CC: Misbehavior of Organisms.

Reflexive behavior will get in the way of learning and what is intended as OC may elicit reflexive responses.

A squirrel or barking may prevent your dog from performing well conditioned behaviors on cue.
Habituation
Learning not to react to stimuli
Sensitization
Becoming more sensitive to stimuli, especially with emotional reactions
Habituation: Weak vs Intense stimulus
Weak stimulus best for habituation.

Usually.
Sensitization: Weak vs Intense stimulus
Intense Stimuli leads to sensitization.

Usually.
Adaptation
Similar to habituation.

BUT - adaptation is a physical process of tiring.

Scent, Visual.
Learned Irrelevance
When a stimulus is repeatedly presented without consequence, the animal stops responding to it.

A dog will learn to ignore things that are of no importance, and attend to things that are.

Sit. Sit. Sit. Sit. Sit. becomes white noise.

May persist forever!
Spontaneous Recovery
When a previously habituated stimulus again causes a reaction (doorbell)
Does habituation have spontaneous recovery?
Yes
Does Learned Irrelevance have spontaneous recovery?
No
Factors impacting learning
Deprivation Level

Reward

Contrast Effect

Jackpots

Reinforcer Sampling
Deprivation Level
A reinforcer is likely to be more effective if the dog 'needs' it.

Attention

Food

Water

Play
Contrast Effect
A better reward may increase learning. (Kibble to Liver)

A lesser reward may decrease learning. (Liver to Kibble)
Quantity vs Reward Size
More, smaller treats are more effective.

A dog can count but he can't weigh!
Using high value rewards all the time can impact training how?
A mouse that always gets cheese will run mazes slower than one who gets kibble and random cheese awards.

A mouse accustomed to kibble who gets cheese will run maze faster.
Positive Behavior Contrast
Getting a great reward will improve behavior.
Negative Behavior Contrast
Getting a lesser reward will reduce behavior.
Jackpot
Reward for excellence.

NOT a noncontingent reward to get motivation
Reinforcement sampling
Let animal know what's coming.

A reason to perform well.

This is what you'll get when you eat your veggies.
Grandma's rule
PREMACK

Eat your veggies before dessert and finish your homework before moving on to the fun stuff
Jumpstart
Reward to motivate.

Contrast with Jackpot, reward for excellence.
Novelty
If a very familiar stimulus is used as the CS, the animal will learn much more slowly than if a novel stimulus is used.

Kibble sucks.
CS-Preexposure effect
Learned Irrelevance

If an animal has already been exposed to a stimulus and it has not been paired with anything meaningful, it becomes meaningless.
Timing - Classical Conditioning

Inter-stimulus Interval
CS -> UCS

CS MUST appear before UCS for learning to occur.
Timing - Operant Conditioning
Time between R -> S

Must be less than a second!

Primary/Secondary reinforcer

(Food/Click) can help make this less critical by presenting the Sr
Primary Reinforcer
(Examples)
Food
Water
Touch
Play
Drive
Secondary Reinforcer
(Examples)
Click
Yes!
Good!
Etc
Primary Reinforcer
Something an animal intrinsically likes.

Food, water, hugs.
Secondary Reinforcer
Something that is meaningless to the dog that has become associated with a primary reinforcer and thus important.
Establishing a Secondary Reinforcer
Repeat the following until dog turns head every time:
1). Click
2). Reward

Do NOT require any behavior beyond head turn/orientation
Prey drive sequence
9 Steps
Orient
Stare
Stalk
Chase
Grab
Bite
Kill
Dissect
Consume
CRF
Continuous Reinforcement Schedule

Reinforce every trial

Best for new behavior
PRF
Partial (Intermittent) Reinforcement Schedule

Behavior is reinforced after certain responses
Intermittent Reinforcement Schedule (PRF)

Examples
Fixed Ratio : FR
Variable Ratio: VR
Random Ratio: RR
Fixed Interval: FI
Variable Interval: VI
Differential Reinforcement Schedule
Certain rates of responding or certain types of responding are reinforced.

Differential Rate: Depends on how long after the preceding response.

DRH - Differential Reinforcement of high rates of behavior.

DRL - Differential Reinforcement of Low Rates of Behavior
DRH
Differential Reinforcement of High Rates of Behavior.

Animal is only reinforced if it responds BEFORE a certain interval.

NOT USEFUL
DRL
Differential reinforcement of low rates of behavior

Animal is only reinforced if it responds AFTER a certain interval.

NOT USEFUL
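
The DRH and DRL rules differ only in which side of the interval the response must fall on. A minimal sketch follows, assuming time is measured in seconds since the preceding response and that a response landing exactly on the interval counts as "after"; both are assumptions, as is the function name.

```python
def is_reinforced(schedule: str, seconds_since_last_response: float,
                  interval: float) -> bool:
    """Decide whether a response earns reinforcement under DRH or DRL."""
    if schedule == "DRH":
        # High rates: the response must come BEFORE the interval elapses.
        return seconds_since_last_response < interval
    if schedule == "DRL":
        # Low rates: the response must come AFTER the interval elapses.
        return seconds_since_last_response >= interval
    raise ValueError(f"unknown schedule: {schedule}")
```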
RR
Random Ratio

Increased drive perhaps due to frustration at expected reward.

Must be truly random.
Free Operant Behavior
Behavior that is not prompted but is rewarded.

Eye Contact during heeling.
FR
Fixed Ratio

Reinforce every N times

FR-5, for example.

Very high rate of performance except RIGHT AFTER receiving reinforcement

post reinforcement pause/scallop
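
"Reinforce every N times" can be sketched as a simple counter. The function name `fixed_ratio` is hypothetical, and the sketch only marks which responses are reinforced; it does not model the post reinforcement pause.

```python
from typing import List

def fixed_ratio(n: int, responses: int) -> List[bool]:
    """Return, for each response in order, whether it is reinforced under FR-n."""
    # Counting from 1, every n-th response earns the reinforcer.
    return [count % n == 0 for count in range(1, responses + 1)]
```

With FR-5 and ten responses, only the fifth and tenth responses are reinforced.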
post reinforcement pause
After receiving reinforcement an animal may decrease performance a bit.
VR
Variable Ratio

VR-5 means an average of one in five times gets reinforced.

Very effective.

Low post-reinforcement pause.

Slot Machine.

Sales Commission
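
One way to picture "an average of one in five" is to draw a fresh, unpredictable response requirement after each reinforcer. The sketch below assumes the requirement is drawn uniformly from 1 to 2N-1 (which averages N); that distribution, the seed, and the function name are illustrative assumptions, not something the deck specifies.

```python
import random
from typing import List

def variable_ratio(mean_n: int, responses: int, seed: int = 0) -> List[bool]:
    """Return, for each response, whether it is reinforced under VR-mean_n."""
    rng = random.Random(seed)
    outcomes = []
    # Number of responses still needed before the next reinforcer.
    remaining = rng.randint(1, 2 * mean_n - 1)
    for _ in range(responses):
        remaining -= 1
        if remaining == 0:
            outcomes.append(True)
            # Draw a new requirement so the animal cannot predict the next one.
            remaining = rng.randint(1, 2 * mean_n - 1)
        else:
            outcomes.append(False)
    return outcomes
```

Over a long run of VR-5, roughly one response in five is reinforced, but the animal cannot tell which one it will be.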
Slot Machine
Variable Reinforcement Schedule
Ratio Strain
On a variable reinforcement schedule, when an animal starts to shut down if not reinforced often enough.
FI
Fixed Interval Schedule

FI-5 reinforced for first response AFTER five seconds.
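
The FI rule depends on elapsed time rather than on a count of responses. This sketch assumes the clock starts at time 0, restarts at each reinforcement, and that a response exactly at the interval qualifies; the function name is hypothetical.

```python
from typing import List

def fixed_interval(interval: float, response_times: List[float]) -> List[bool]:
    """Return, for each response time in seconds, whether it is reinforced."""
    outcomes = []
    last_reinforced = 0.0  # assumption: the first interval starts at time 0
    for t in response_times:
        if t - last_reinforced >= interval:
            # First response after the interval has elapsed is reinforced.
            outcomes.append(True)
            last_reinforced = t
        else:
            # Responding early earns nothing and does not reset the clock.
            outcomes.append(False)
    return outcomes
```

With FI-5 and responses at 1, 4, 6, 7, 10 and 11.5 seconds, only the responses at 6 and 11.5 seconds are reinforced.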
VI
Variable Interval

VI-5 - On average, the first response after five seconds is reinforced; the interval varies from trial to trial.
Limited Hold
The time interval that a reinforcement is available.

Example, you have to eat lunch at the cafeteria from 12-1.
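
A limited hold is just a window check. The sketch below uses the cafeteria example with hours on a 24-hour clock; the function name and the choice to treat the closing time as already unavailable are assumptions.

```python
def reinforcement_available(hour: float, opens: float = 12.0,
                            closes: float = 13.0) -> bool:
    """Return True if the reinforcer can still be collected at this hour."""
    # Outside the window the opportunity is gone, whatever the behavior.
    return opens <= hour < closes
```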
DRI
Differential Reinforcement of Incompatible Behavior

Response type schedule

Reward only incompatible behaviors.

EG: Reward sitting not jumping
DRO
Differential Reinforcement of Other behavior

Reward ANY other behavior.

EG: Reward anything other than barking or lunging.
DRE
Differential Reinforcement of EXCELLENT behaviors.

Reinforce only the best.

Use during maintenance
Duration Schedule
Watch Me
Down Stay

Dog is reinforced after a specified interval.

Fixed Interval

Random Interval
Teaching Stay or Wait, what is the best schedule to use?
Slowly raise criteria on a duration schedule.
Best reinforcement schedule to use for basic behaviors
Start with CRF
Move to VR or RR
Best reinforcement for complex behaviors
DRE

DOGS VARY!
Best reinforcement schedule for problem behaviors
DRL, DRO, DRI

DOGS VARY!
Premack
Grandma's rule

The opportunity to engage in some activities may be reinforcing for others.

Juno likes to sit in a chair during rally obedience.
Best reinforcement schedule for classical conditioning is?
CRF
Stimulus Control
Generalization

Discrimination
Overshadowed
A more salient stimulus (squirrel) may overshadow

Less salient (hotdog)

or even less salient(pat)
Prevent blocking
By not presenting cues at the same time.

Present new cues FIRST, then old cues.
Say the command once (because...)
Every time the dog hears sit and doesn't get rewarded, it degrades the significance of the Sd to the Sr+.
Preparedness
The tendency to associate certain types of stimuli more readily than others.

E.G. - Sound to Pain, Food to Illness

High pitch sounds to fast motion,

Low pitch sounds to slow motion
Learning Sets
When a dog learns the rules of the game.

For example, learning to match things that they just saw together vs things they did not???
Experimental Neurosis
Asking a dog to do incompatible things. Can induce real problems.

The guard dog asked to stop attacking on a hand raise sees the thief raise a chair and shuts down.
Extinction
Learning that a CS does not result in an UCS. Responding declines.
Extinction Burst
An increase in responding, with frustration, when a behavior no longer produces its expected consequence.
Spontaneous Recovery
Recovery of a behavior after it has become 'extinct' .
Partial Reinforcement Extinction Effect
PREE

In CRF - Extinction happens quickly.

In PRF - Extinction happens slowly.

Does not happen in CC
PREE
Partial Reinforcement Extinction Effect
Does training transfer knowledge?
No, it changes probabilities.
What makes a trained dog sit when you say sit?
A history of reinforcement for sitting in response to the stimulus 'sit'.
Good thing starts
Positive reinforcement
Good thing ends
Negative punishment
Bad thing starts
Positive Punishment
Bad thing ends
Negative reinforcement
Training changes probabilities not ___________
knowledge
Rules for Good Desensitization
1. Stay under threshold
Blocking
An already learned cue is attended to
Overshadowing
Only the more salient element in a compound is learned
Good CER
1: Order of events,
Spooky dogs
Dogs that have been working dogs until recently: Working, Guard, Flock Guard, toy
P Value
The probability that a difference between groups at least as large as the one observed would occur by chance alone. Usually obtained by comparing a control group with a studied group while controlling all non-tested variables. A p value of .05 means that, if there were no real difference between the groups, a difference this large would be seen about 5% of the time.
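
One hedged way to estimate such a probability is a permutation test: shuffle the group labels many times and count how often the shuffled difference in means is at least as large as the observed one. This is a sketch of one common technique, not necessarily the method any particular study used; the function name, trial count, and seed are assumptions.

```python
import random
from statistics import mean
from typing import Sequence

def permutation_p_value(control: Sequence[float], studied: Sequence[float],
                        trials: int = 10000, seed: int = 0) -> float:
    """Estimate a two-sided p value for the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(studied) - mean(control))
    pooled = list(control) + list(studied)
    n_control = len(control)
    hits = 0
    for _ in range(trials):
        # Shuffling breaks any real link between group label and score.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n_control:]) - mean(pooled[:n_control]))
        if diff >= observed:
            hits += 1
    return hits / trials
```

Two identical groups give a p value of 1.0, while two clearly separated groups give a value well below .05.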
inclusive fitness
Genes are instructing her to save the copies of themselves stored in her kittens
why are we here
Each of us is descended from an unbroken line of successful reproducers
meta communications
Using behavior to indicate what the following behavior means: a play bow indicates that the next ripping run is a play move.
Four F's of behavior
Food, Fear, Fight and Sex : Adaptive significance, can fuel other
Group hunting's genetic legacy
Impacts socially facilitated predation
Aggression reason for being
To displace individuals
Fear .. Bred?
Fear can be bred for, yes.
Fear in puppies
Genetics, Prenatal, Neonatal
Fear in adults
Genetics, Prenatal, Neonatal, socialization, sensitization
Dog human aggression
Strangers,
Dog Human Aggression treatment
1. Habituation;
When to use flooding?
Puppy Mill Rescues, may be only choice
Desensitization
Exposure at sub threshold level so no fear is evoked, gradually increased
DRI Strangers/Dogs
Sit/Watch
DRI Guarding
Retrieve
DRI Handling
Offering Body Part
DRI Sofa Guarding
Voluntarily vacating locations
Differential OC/CC Fear/Aggression strategies
OC: DRI - Operant - dog reinforced if he gives correct response ;
Difficult prognosis indicators aggression cases
hard mouth, strangers, poor client compliance, explosive without a threat, large dog (>30 lbs)
Good prognosis indicators aggression cases
soft mouth, resource guarding, protracted warnings, committed owners, plastic dog, small dog (<30 lbs)
Serious bite level #
IV - VII
Pressure or puncture more serious
Pressure
Less serious bite levels
I-III :
Stranger aggression is hard to fix because
1). Recruiting;
Assessing bites: Bite History Incident
Victim Characteristics;
Fear case prognosis
Slow moving - months and years. The younger the better
Dog-Dog reasons for problems
Undersocialized
Dog-Dog-Fix: Easy
Tarzan, guarding - mild, bullying
Dog-Dog-Fix: Good
Play Skill Deficit;
Dog-Dog-Fix: Harder
Proximity Sensitive: Severe;
Dog-Dog-Fix: Very hard
Compulsive;
Predatory Drift
Size mismatch, double team, panic
ABI
Acquired Bite Inhibition
Family aggression usually manifests because
Resource guarding,
