eileenanddogs

Tag: extinction burst

But Isn’t it Punishment to Withhold the Treat?


It would probably be good to decrease this behavior (photo credit: Wikimedia Commons)

Lots and lots of people think that if you withhold the treat you are punishing the dog. Some will ask the above question in a gleeful, challenging way, feeling certain that they have caught the positive reinforcement-based trainers in an inconsistency. But let’s see what is really happening.

Here is a scenario. In the past, you have given your puppy attention and played with him when he jumped on you. But he’s getting big and you really don’t want him jumping on you anymore. You decide to teach him to sit to greet you. He already has a good reinforcement history for sitting, so the likelihood that he will do it in any given situation is fairly high.

So here you are with your excited pup and you are clicking and giving a treat whenever he sits.

  • Pup sits. Click/treat.
  • Pup sits. Click/treat.
  • Pup jumps on you. Nothing.
  • Pup sits. Click/treat.
  • Pup sits. Click/treat.

OK, what happened when the pup chose to jump instead of sitting? You didn’t click. The treats stayed in your hand, your pocket, or the bowl. (You meanie!) You stood still and didn’t react. You are paying for sits, not jumping up.

But lo and behold, the jumping up starts to decrease! Decreasing behavior means punishment, right? You must have punished your puppy for jumping!

No. Let’s look at the definitions of positive and negative punishment.

Punishment

  • Positive punishment: Something is added after a behavior, which results in the behavior happening less often. Example:
    • Antecedent: You approach your dog.
    • Behavior: Dog jumps on you.
    • Consequence: You step on the dog’s back foot, hard. (I’m not recommending this, of course. Just want a clear example of positive punishment.)
    • Prediction: Jumping up on you will decrease.
  • Negative punishment: Something is removed after a behavior, which results in the behavior happening less often. Example:
    • Antecedent: You approach your dog.
    • Behavior: Dog jumps on you.
    • Consequence: You turn around and leave.
    • Prediction: Jumping up on you will decrease.

In the positive punishment example you added painful pressure to your dog’s foot. (Please don’t ever do this.) If the dog finds having his feet stepped on sufficiently painful, jumping will decrease. In the negative punishment example you removed your presence and attention from the dog. If he likes your presence and attention well enough, if you are consistent, and if there is no competing reinforcer (that’s a big if!), this will also cause jumping on you to decrease.

So that’s what positive and negative punishment look like. Now back to our original example. Let’s map it out as well.

  • Antecedent: You approach your dog.
  • Behavior: Dog jumps on you.
  • Consequence: You just stand there.

You don’t respond with physical actions or increase or decrease your attention. Admittedly, this is hard to do, and remember, the lack of response has to be from the dog’s point of view. Even looking down at them is a response. Future blog on this point!

The cookies staying put

Nothing was added: therefore no positive punishment. Nothing was removed: therefore no negative punishment.

(By the way, some people who are very new to learning theory think that the above example is negative reinforcement. Sit, give treat = positive reinforcement. Then jump, withhold treat = negative reinforcement. No, no, no! It has an attractive symmetry, but that is not what the term means at all. Here’s a review.)

So What Is Happening?

OK, back to the first scenario, where you are working on sits with your puppy. Let’s say that after that one time when the puppy jumped and you didn’t treat, the puppy didn’t jump up again. Jumping on you decreased during training. Let’s also say that that decrease continues over time. Why isn’t that punishment again?

Because punishment is not the only process that involves a decrease in behavior. There is another: extinction.

Extinction is the nonreinforcement of a previously reinforced response, the result of which is a decrease in the strength of that response.

In other words, extinction is what happens when the behavior you used to do to achieve some thing doesn’t work anymore. So you stop doing it.
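For readers who like to see the logic laid out, here is a rough sketch of the distinctions above as a small Python function. It is only an illustration: the names (`Stimulus`, `Trend`, `classify`) are made up for this example, and real behavior analysis involves a lot more than three inputs. It assumes we know whether something was added or removed after the behavior, whether the behavior then increases or decreases, and whether the behavior used to be reinforced.

```python
from enum import Enum


class Stimulus(Enum):
    """What happened right after the behavior (hypothetical labels)."""
    ADDED = "added"
    REMOVED = "removed"
    NONE = "none"


class Trend(Enum):
    """What the behavior does over time afterward."""
    INCREASES = "increases"
    DECREASES = "decreases"


def classify(stimulus, trend, previously_reinforced=False):
    """Name the process, using the simplified definitions in this post."""
    if stimulus is Stimulus.ADDED:
        if trend is Trend.INCREASES:
            return "positive reinforcement"
        return "positive punishment"
    if stimulus is Stimulus.REMOVED:
        if trend is Trend.INCREASES:
            return "negative reinforcement"
        return "negative punishment"
    # Nothing was added and nothing was removed.
    if trend is Trend.DECREASES and previously_reinforced:
        return "extinction"
    return "unclassified"


# The jumping puppy: you just stand there, and jumping (which used to
# earn attention and play) decreases.
print(classify(Stimulus.NONE, Trend.DECREASES, previously_reinforced=True))
```

The point of the sketch is the last branch: a decrease in behavior with nothing added and nothing removed does not land in either punishment box.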

So here comes the big question, especially for those folks who think they’ve somehow caught us out on the withholding the treat business.

How Humane is Extinction?

As with so many things, the answer is, “It depends.” But in this case there is a pretty clear demarcation. In the Humane Hierarchy, extinction by itself is at the same level as negative reinforcement (which involves an aversive) and negative punishment (which involves a penalty for behavior). Not great as first choices. We know that from life. If a machine we use all the time stops working, or a method we use of interacting with another person we care about suddenly gets no response with no explanation, we are left high and dry. It is not fun.

However, extinction also happens in tandem with a process called Differential Reinforcement of Alternative Behaviors (DRA). This is how trainers who aim to train primarily with positive reinforcement use it. (There are other differential reinforcement methods, but this is a good general one to discuss right now.) It consists of reinforcement of an alternative behavior while reinforcement for the target behavior is withheld. Done with some care and skill, it can involve very little frustration for the animal, and it is one step closer to the “most humane” end of the Humane Hierarchy. And this is what is happening in the example above. As long as the trainer is being quite clear that sits are being paid for, the fact that jumping up on her no longer gets attention is not so hard on the pup. He has another thing he can do to get something good. He gets attention and food.

The trainer has communicated to the pup a new behavior to “fill the hole” where jumping used to be.

Japanese Drink Vending Machine
 

I’m borrowing this great example of how DRA works from my friend Kim Pike. Let’s say the soda machine at a workplace is not working. People will push the button repeatedly. Some will perhaps pound on the machine or kick it. This is typical when extinction is in play by itself. The people have no alternative, and get frustrated. (I’ll be covering extinction bursts and extinction aggression in a later post.) Gradually people will stop going to the machine and give up pushing the buttons. Individuals will probably forget, and now and then go try the machine again, then perhaps give it another kick or shove. But after a while no one goes to the machine anymore.

But when the soda machine is fixed, there will likely be a crowd of people ready to buy their sodas. It’s easier than going to the corner store, and involves less planning than bringing drinks from home. The behaviors attendant to getting a soda are all still fluent and easy for people to perform. And they once again get reinforced.

However! What if, when the machine broke, someone immediately set up a system where folks could buy a soda they liked as well or better for less money? Perhaps there was a cooler, or an honor system with soda in the fridge. If that alternative were in place immediately, would the thirsty people typically have experienced the same level of frustration at the broken machine? Nope! (Except perhaps for the engineers and mechanics, grin.)

And the most important question: What will the folks who just want a soda do when the machine gets fixed? As long as the cheaper, better alternative is still available, they will keep heading for it. The machine will have become irrelevant. Maybe once in a while someone will forget, and go to the machine. But they’d then remember that they can get a better drink, cheaper, out of the fridge.

This is what we are doing when we allow an extinction process in tandem with positive reinforcement of an alternative behavior. We clearly offer the animal an attractive alternative and remind them of it to keep it front and center. It’s important that the reinforcer for the new behavior be as good as or better than that of the old behavior. This makes for a process with much less frustration.

Extinction in a Specific Circumstance

In my post, How Do I Tell My Dog She’s Wrong? I address “failing to click” during a training session. I feature a short video example from the great trainer Sue Ailsby teaching her young Portuguese Water Dog, Sync, to stand and stay. In the video you can see Sync’s immediate bounce back after the couple of times she tries something other than a stand and doesn’t earn a click.

In that case, sits and downs are not going to decrease into oblivion in every situation, as we might want the jumping up to do in our other example. But they will go into extinction during training sessions of “Stand” and later when Sync learns a cue for it. Since dogs can discriminate this easily, it also tells us that when we want a behavior to go away completely, we need to practice reinforcing our alternative behavior in many locations and situations.

Conclusion

So in answer to the critics, no, withholding the cookie in itself is not punishment. And if used in tandem with reinforcing another behavior, it is quite humane. If we put even a moderate amount of thought and planning into the situation, we can set the dog up to succeed. There will be minimal frustration when he does miss the mark on occasion and fails to earn the treat.

Stay tuned for Part 2 on extinction. I’ll be talking in more detail about what happens when extinction is used by itself, and comparing that with differential reinforcement in some human and dog case studies.

Related Page and Post

Eileenanddogs on YouTube

Shaping and Stress


Zani rolling over in a shaping session that we both enjoyed

This is an expansion of a post about a possible cause of stress in shaping that I sent to the Training Levels Yahoo group.

Shaping involves extinction. That is, ceasing to reward something that has been repeatedly rewarded. In the real world, for humans and observably for other animals, that is stressful. The classic examples are when an elevator stops coming when the button is pushed, or when a candy machine just sits there after you put in the correct change and push the button. What usually follows? In the elevator case, repeated pushing of the button. Harder, faster. With the candy machine, all that, and possibly pounding, shaking, yelling. If you think about an animal’s behavior being tied to survival, something suddenly not working anymore is a danger signal. Oh oh, this place or this method that I was relying on no longer provides food. I’m going to have to start all over again and find somewhere or something else.

We are taught to be ready for an extinction burst when we suddenly stop rewarding something that a dog has been rewarded for. That is, the behavior rises in frequency and intensity before it fades away. Extinction is not fun for the dog in this circumstance! It is frustrating.

OK, back to shaping. When we shape, we are introducing tiny little extinctions over and over again. That’s how we get successive approximations to the final behavior.  “Fido, THAT behavior is not getting paid for anymore, it is up to you to figure out something that is.”

When I see the really great trainers shape, there is another characteristic besides their ability to detect the tiniest behaviors and differences in behaviors to reinforce. Another skill is that they are constantly watching the animal’s demeanor, as much as its actual movement, and are responding to that. They can keep that extinction process as gentle as possible and keep the animal trusting that the world hasn’t come to an end when they stop clicking for something. And of course these two skills go together. Seeing and responding to the tiniest movements does tend to keep the rate of reinforcement high.

Also they think empathetically. There is a well-documented human tendency (the “curse of knowledge”) to assume that when we have something visualized or auralized in our heads, the others around us automatically will see it, hear it, and understand it quickly. Great teachers learn that this isn’t the case. And great shapers keep in mind all the time that the animal may not have a CLUE as to what they themselves have so clearly in their heads.

Finally, with our pet, service, and performance dogs (i.e. dogs who live with us) it comes down to the trust account. It needs to be very high for some animals to enjoy shaping as much as we ourselves might. They have to trust us that the lack of a click, and a little extinction, is not the end of the world.  I will admit to making mistakes about this. Shaping is so cool; it’s like being handed a shiny new toolbox with all sorts of fun things inside. I’m a pretty empathetic person but I will tell you that I have gotten overexcited about this tool and plowed on through signs of big frustration from my animals. I have recordings that I will probably never show anyone else of shaping sessions I did very early on with both Zani and Clara. They went on for several minutes. We got to our goal (MY goal). But neither dog was having fun after the first minute or so. They were showing stress and frustration. Zani was whining. Clara was spinning, which is her superstitious and stress related behavior. I was pressing on towards the goal insensitively.

Another thing to keep in mind is that the shaping process is usually reinforcing–to the human! Shaping is incredibly cool! We dangle it in front of trainers who are considering “crossing over.” Look what you’ll get to do with your dog! Many of us need to be careful about going overboard.

Just like any other activity, some dogs are going to intrinsically enjoy shaping more than others. But we are trainers, right? If using shaping is important to us, we need to find ways to make sure it is fun for the dog. A little stress may be a good thing in life, but if an animal is chronically averse to a training activity we like, it’s time to do something about it. We probably need to gentle down the extinction process. And mind our trust accounts.

A few thoughts on how to do this:

  • Watch the dog and get to know her signals.
  • Pay attention to how long you are waiting if you are withholding the click. That’s when the extinction stress can build up.
  • Start with very short sessions: just a few clicks.
  • Be willing to stop before achieving a pre-ordained goal (This is a hard one! We tend to be so goal oriented.)
  • Have an environmental cue that lets the animal know when you are shaping and when you aren’t.
  • My friend Lynn says, Teach it! Think of it from the learner’s point of view.
  • Lynn also says do little sessions of “shaping nonsense.” Make sure both you and the dog approach it as a game.
  • Don’t do like I did with Zani and start shaping with a brand new rescue dog just because you can. I wish I had built up our trust a little better before doing that.

Here are my three submissions to ShapeFest 2012 a few months ago. I’m pleased with my dogs’ demeanor in all of these. Clara is still the most serious, but showed only a few little stress signs. Her main stress behavior is a counterclockwise spin. She does a couple of spins starting at 2:20 but it’s hard to tell how much is stress and how much is just a behavior she is trying. Since her pace is not frenetic, my guess is that they were mostly offered behaviors.

Shaping Zani to roll over

Shaping Summer to mount a platform, using playing in the hose as the reinforcement

Shaping Clara to do a distant paw touch

I bet some of you out there have some other suggestions about making sure shaping is fun. Care to share?

Discussions coming up:

  • Is It Really Just a Tap? (shock collar content)
  • “Errorless learning”
  • Canine Cognitive Dysfunction

Thanks for reading!
