[In operant learning], extinction means withholding the consequences that reinforce a behavior. –Paul Chance, Learning and Behavior, Fifth Edition, 2003
This post is Part 2 (a year later!) of But Isn’t it Punishment to Withhold the Treat?
In that post I discussed the common error of arguing that withholding a treat from a dog in a training session (or at any other time) constitutes punishment. On the contrary, when nothing is contingently added or taken away but behavior decreases, the process at work is extinction, not punishment.
But that is not to say that extinction is automatically better. In Dr. Susan Friedman’s Humane Hierarchy for behavioral intervention (see graphic below), extinction by itself sits at the same level as negative punishment and negative reinforcement. The three are roughly equal in (un)desirability, and how unpleasant any particular technique is depends on the circumstances and the individual animal. Dr. Friedman makes a point of saying that these three are not ranked in any particular order of overall undesirability.
Extinction is often overlooked when considering or analyzing methods. People often mix it up with negative punishment. It’s a bit of an oddball learning process since it applies to both operant learning and respondent conditioning. In operant learning it is sometimes jokingly called the “fifth quadrant.” The important thing to me is that its unpleasant effects can vary wildly, from practically nil to complete misery.
Extinction can be very, very frustrating. Here you are with these behaviors that you have been performing for such a long time that they are habitual, and all of a sudden they don’t work anymore! But there are ways to use extinction in combination with other processes to make it much less hard on the learner. And in fact, if we look at the Humane Hierarchy a little more closely we will see that extinction is actually lurking in another of the levels, partnering with something much nicer.
Let’s explore this by way of a thought experiment.
Extinction Scenario #1. You get your car out of the shop after a tune-up. You buy a pint of your very favorite ice cream, or other perishable treat. You realize you need one more thing from the store, so you lock the car and go back in. When you come back out, you try to unlock your car with the remote. It doesn’t work! You press the remote again and again. You press it harder. You aim it differently. No go. Then you try unlocking the door with the key. That doesn’t work either! You jiggle and jiggle the key, and try the different doors. Nothing works. You bang on the car doors. You can’t get into the car using the methods that you have always used. You are starting to cuss now. Your ice cream is melting. You finally yell at the car and it opens!
You drive back to the repair shop and ask the guy what the heck he did to your car. He says your car doors now work by voice control. Apparently he thought that sending you off to find that out on your own would be the best way to teach you.
Two questions. 1) Was that learning process fun? 2) What are your feelings towards the mechanic?
That is a description of the process of extinction. A behavior that has previously been reinforced is no longer reinforced. In this case it was actually two behaviors: opening the car with the remote and opening it with the key. Both used to be reinforced by your gaining entry to the car. Both stopped working with no warning. Stinky!
Three characteristics of extinction are the extinction burst, an increased variability of behavior, and aggression. We got all three. When your normal methods for opening the car door didn’t work, there was a big burst of behavior from you as you tried stuff. You unconsciously started adding variety in how you performed the behaviors. And you started doing everything a little harder and banging on stuff. None of that was fun for you.
Now let’s try a different version of the scenario.
Extinction Scenario #2. When you first go to the mechanic, he tells you about a new option to have your car respond to voice commands, and warns you that if you opt for the upgrade, in some cases the old methods will not work. You decide that it sounds good.* Your mechanic takes 10 minutes to go over the voice commands that you will use with your car, and has you practice unlocking it with your voice.
When you stop off to go to the store and return to your car, if you are like 99% of the human race, that huge reinforcement history for using your remote or keys during your whole driving career kicks in and you initially try to use one of these to open your car. But the practice of the new behavior is fresh in your mind, so as soon as the remote doesn’t work, you remember to give the voice command. Your car unlocks!
But old habits die hard. You will probably be hitting that remote or trying your keys for quite some time, each time you approach your car. The old behaviors will diminish slowly as their reinforcement histories fade into the past and the practice of the new successful behavior overshadows them. However, there will be comparatively little frustration. You are never in the dark about what behavior will actually work. You’ll probably perform the old behavior once, go “oops!” and immediately use your voice without wasting much time.
Not so bad!
What About Dog Training?
Here are the dog training counterparts to Scenarios #1 and #2 above.
Let’s say you want to address the following behavior problem: When you get out your dog’s leash, your dog gets excited and runs around getting all aroused, barking and jumping on things.
Scenario #1. You have never trained your dog to do anything, but you’ve had enough of the overexcitement. So you decide you aren’t going out that door until your dog sits calmly for you to put the leash on. So you take your dog into the front room and pick up the leash. Dog runs around. You just stand there. Dog jumps on you and on the furniture. Runs around and barks. This goes on for about 5, maybe 10 minutes. Finally your dog wears out and sits down and looks at you. You take one step towards him, holding the leash out to attach it. He gets all worked up again and you have to wait out another few minutes of excited activity. This happens over and over.
From your dog’s point of view, the rules have changed. All that previous barking and running around have been reinforced by getting to go outside. Many people frankly don’t have the stamina to outwait a dog in this situation, and will finally break down and take the dog out anyway. That worsens the problem: by reinforcing the behavior after such a long wait, they have made it more persistent. If you do succeed and the dog calms down in 20 minutes on that first day, it may take a bit less the next day. But since this is completely new to your dog and you are asking so much of him when he is already wildly excited, it will take a while, and be a frustrating process for him.
Scenario #2. You have trained your dog to sit in all sorts of situations and for all sorts of reinforcers. He sits for his supper. He sits to go outside. He sits to greet people. He sits at the agility start line. He can hold a sit stay while you run around and play tug with another dog. So when you decide to teach him to sit calmly to put the leash on, you first practice some sits for treats in a random room of your house. Then you do the same in the room where you keep the leash. Then you pick up the dog’s leash and look at him expectantly. If he starts running around you wait. When he makes contact again you give him the expectant look. He will likely sit pretty soon. Treat!! He may jump up again when you approach him, but he is already learning.
This fits a pattern he is familiar with: sit and something good happens. You can use treats to reinforce those sits in this new situation so he doesn’t have to wait so long for the ultimate reinforcement, going out. You practice in small steps until you can put the dog’s leash on while he sits calmly. Depending on the dog and what you have trained, you may be able to take him straight out the door calmly that first day, or you may practice a few more days just putting the leash on and off before you go out the door.
Defining the Difference
Take a look at Dr. Friedman’s diagram again. See the area just below “Extinction, Negative Reinforcement, and Negative Punishment”? It is called “Differential reinforcement of alternative behaviors.” Guess what? That one corresponds exactly with both versions of Scenario #2 above. Dr. Friedman’s definition is, “Differential reinforcement is any procedure that combines extinction and reinforcement to change the frequency of a target behavior.”
Instead of being gobsmacked by the normal behavior not working anymore, the learners, dog and human, are given a big fat clue about what is going to work to get what they want. That clue is the positive reinforcement of an alternative behavior.
Extinction is part of all differential reinforcement training methods. Those methods are on a more humane rung of the hierarchy because the animal is given immediate opportunities for positive reinforcement. This can be done either by reinforcing successive approximations (shaping), or by separate practice of the desired behavior before it is evoked in the situation where the undesired behavior is likely.
So when someone says to you, “Neener neener neener, you use punishment when you withhold a treat,” say, “No, that’s extinction.” Then if they say, “Neener neener neener, you use extinction and that’s mean,” say, “I use it as part of differential reinforcement, paired with positive reinforcement of an alternative behavior.” And make sure you do!
Be the mechanic who shows his client ahead of time what is going to work, instead of the one who sends him off with no clue.
* The car thing is a deliberately ridiculous scenario. Obviously, to cause a car’s keys and remote not to work would be horribly dangerous, and hardly anyone would consent to that even if it allowed one access to a new feature like voice commands.
I never got to the issue of “ignoring” in these extinction posts. So I guess there is going to be a Part 3.
© Eileen Anderson 2014 eileenanddogs.com