eileenanddogs

Category: Premack Principle

Leaving the Scene: Clarifying the Science of Negative Reinforcement


Hares escape a lot (if they are lucky) — Photo credit, Wikimedia Commons

Negative reinforcement is really, really easy to get mixed up about. Recently I read something that quite bothered me until I did a little research and figured it out. I’d like to share what I learned with you. What I’m talking about is this:

When I take my dog away from the thing he is concerned about, I am adding distance. Therefore this is positive reinforcement.

This is contrary to what you would read in most learning theory books, but it has its own seductive logic. We are primed to associate the word “add” with positive reinforcement (or positive punishment).  I love puzzles and problems, so I decided to do my best to tease out what, exactly, the problem with this statement is.

I actually found two problems:

  • One is the confusion between positive and negative reinforcement.
  • The other is the focus on the word “distance” rather than the aversive thing itself.

Positive and Negative Reinforcement in Training

First, a review of definitions.

In positive reinforcement, the consequence of a behavior is the appearance of, or an increase in the intensity of, a stimulus. This stimulus, called a positive reinforcer, is ordinarily something the individual seeks out.–Paul Chance, Learning and Behavior, 7th edition

The stimulus can be lots of things. An object (food or toy), an event (door opening). The dog learns that performing a certain behavior (e.g. sitting), makes this thing available. If it is a desirable thing in that context, the dog sits more often.

In negative reinforcement, a behavior is strengthened by the removal, or a decrease in the intensity of, a stimulus. This stimulus, called a negative reinforcer, is ordinarily something the individual tries to escape or avoid.–Paul Chance, Learning and Behavior, 7th edition

Again, it can be an object (snake) or event (fire alarm ringing). The dog learns that performing a certain behavior makes the thing stop or retreat, or lets him get away from it. If the thing is aversive for the dog in that context, the behavior that makes it go away will increase.

The positive and negative reinforcement processes are pretty different, but there are a handful of situations where it may be hard to tell one from the other. Usually you can tell them apart by identifying the antecedent. If the antecedent is the presence of an aversive stimulus, and a behavior is increasing, you’ve got negative reinforcement.
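To make the definitions above concrete, here is a minimal sketch of how the four standard labels fall out of two observations: what happened to the stimulus after the behavior, and what happens to the behavior in the future. The names `Change`, `Trend`, and `classify` are my own for illustration; they do not come from any textbook or library.

```python
from enum import Enum


class Change(Enum):
    ADDED = "added"      # stimulus appears or increases in intensity
    REMOVED = "removed"  # stimulus is removed or decreases in intensity


class Trend(Enum):
    INCREASES = "increases"  # behavior happens more often in the future
    DECREASES = "decreases"  # behavior happens less often in the future


def classify(stimulus: Change, behavior: Trend) -> str:
    """Name the operant process from the stimulus change and the behavior trend."""
    sign = "positive" if stimulus is Change.ADDED else "negative"
    process = "reinforcement" if behavior is Trend.INCREASES else "punishment"
    return f"{sign} {process}"


# The stimulus is the actual thing the animal seeks or escapes
# (food, a snake, a fire alarm), and it is what gets added or removed.
print(classify(Change.ADDED, Trend.INCREASES))    # positive reinforcement
print(classify(Change.REMOVED, Trend.INCREASES))  # negative reinforcement
```

The sketch only names the process. Deciding what the stimulus actually is, and whether it was added or removed, is the hard part in the handful of tricky cases.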

There are some people who have claimed that there is not a large difference between the two types of reinforcement,** and there are others who use that point of view to excuse the use of aversive stimuli in training, or do mental gymnastics to convert such use to positive reinforcement.  But luckily a well-known behaviorist has tackled those claims.

Examples

Here is one of the “strained” examples and how one well-known behaviorist approaches it. It has been argued that turning off an electric shock in response to an animal’s behavior is actually “adding a shock free environment” (and that adding makes it positive reinforcement). That argument has been neatly dismantled by Dr. Murray Sidman.

He reminds us that a positive reinforcer must be something the animal is willing to work (perform behavior) to get. And if you take the bad thing (in this case, shock) out of the picture, a “shock free” environment is meaningless and can’t even be defined, much less worked for.

Let’s apply that method as a litmus test to some other examples. Let’s take out the “icky” thing and see what we have left.

These examples involve either negative or positive reinforcement or both.

• First, food. If you remove the drive of hunger (relieving the state of hunger can be negative reinforcement), is food a positive reinforcer? Yes. Anyone who owns a dog or eats dessert knows that. And I’ve written a whole post about it. You don’t have to be hungry to enjoy and be willing to work for food. Eating food can be both negative and positive reinforcement. (But as I pointed out in my previous post, experiments indicate that the positive reinforcement process is more powerful.)

• Now, how about using an umbrella to protect oneself from the rain? There is an unpleasant condition (getting rained on), you perform the behavior of obtaining an umbrella and opening it over your head, and you escape the rain. This is in most textbooks as a classic example of negative reinforcement. What happens if we say that using the umbrella is really positive reinforcement because you are adding the state of “freedom from rain” or even “dryness”? Let’s follow Dr. Sidman’s lead and take away the rain or other unpleasant weather. Would the behavior of opening an umbrella over your head get reinforced by the “addition” of a dry condition?  No. Carrying around and opening an umbrella is a tiresome, expensive behavior. We wouldn’t do it to add something so ill-defined and meaningless.

A black scorpion — photo credit, Wikimedia Commons

• So now to tackle the scenario in the subject. Escaping scary things. Let’s say you are phobic of scorpions. If you accidentally get close to one, it is a great relief to leave the area and go somewhere that you believe is scorpion-free. You escape the scorpion. Now, remove scorpions from the picture. Completely. Not just that scorpion, or even scorpions in general, but the threat or mildest hint of scorpions. They don’t exist. What does a scorpion-free environment look like? Well, anything, right?  As long as there are no scorpions. And is it a positive reinforcer? Well, for starters, we can’t even describe it.  Anytime you start thinking of the environment you ran to as reinforcing, it’s because you are comparing it to one with scorpions or some other scary thing.

The Huge Variety of “Scorpion-Free” Environments

A scorpion free environment — photo credit, Wikimedia Commons

Your scorpion-free environment could range from freezing to firestorms to 200 mph winds to vacuum to a 70 degree Sunday afternoon, and all could be scorpion-free. That makes the “scorpion-free” environment impossible to pin down or define.

Not only that, but Sidman points out that the environment could be changing.  As long as it doesn’t have a scorpion in it, it qualifies as scorpion free. But reinforcers generally need to sit still. A piece of meat doesn’t usually morph into a paperclip, nor does a book turn into a clock. If they did, they would lose their reinforcing qualities for hungry people or bookworms respectively.

Reinforcers are definable and describable. I don’t believe a “scorpion-free environment” is.  Compare that to the simple description of a scorpion (ick). That’s very concrete. And to the clear action of becoming aware of the scorpion and getting away from it. The scorpion is extremely well defined. And many people (including me) can’t get away from them fast enough.

Can Distance Itself be a Reinforcer or Punisher?

Here is the second problem. Even if you are in the camp that believes that any negative reinforcement situation can be equally argued to be positive, there is still a big problem. Most of us have learned that using the word “add” is an indicator that positive reinforcement (or punishment) is at play. Something is added to the environment after a behavior that in the future leads to the increase (or decrease) of the behavior.

So at first reading, the idea of adding distance or space from something scary sounds at least similar to positive reinforcement. You are adding something (maybe). But let’s go back to Sidman’s exercise. If you take the aversive, scary thing out, what you are “adding” is nothing at all. If you are standing at the 50-yard line in a football field and move to the 20-yard line, what have you added? The distance or space is meaningless except as an escape from the aversive.

So here’s the important part. It is not distance that is being added or removed.  Distance is an abstraction, not a stimulus. It is the aversive or reinforcer that is being added or removed. Distance is merely a description of the escape (or approach) process. The aversive is the scary monster. Only it can be removed or added. Controlling one’s distance from it is just a way of describing the mechanism of the appearance/disappearance of the aversive.

Also, it is a trick of wording. If you wanted, instead of “adding” distance you could say you were “removing” proximity. Focusing on distance is a red herring. And it neatly removes the real aversive from the picture.

I wrote the following in jest, but the more I think about it, the more I realize that what I have written for the other quadrants is exactly parallel to the claims about adding distance being positive reinforcement. It’s just that the other quadrants are easier to understand, so it’s easier to be aware of flawed logic.

Distance

These are the results if we treat distance as the focus. They are obviously untrue. That’s because the aversives and reinforcers are not “distance.” They are the scary monster, the cookie, and the stick. If you talk about “adding distance” what happens to the actual thing we are trying to get away from? It falls out of the equation.

So did this help at all? Does it make sense? Hope so!


Eileenanddogs on YouTube

**Note:

I should mention that there has been a movement for some time to do away with the distinctions between positive and negative reinforcement and positive and negative punishment. This is mostly because of the few challenging cases, and because the processes of reinforcement and punishment are difficult enough to discuss and explain without the plusses and minuses. But it is not usually argued that there is no difference between adding and subtracting at all. I’m working on a summary of that research, so stay tuned. But in the meantime–you will still see the plusses and minuses in any learning theory book.

 

The Perils of Premature Premack


Zani waiting at the back door. Her reinforcement for this polite behavior is the opportunity to go outside.

Sacrilege!

Is it possible that in some cases, using the Premack principle in choosing reinforcement for our dogs is not the best choice? Can attempting Premack cause problems? [1]

In my experience, yes. It can go wrong with some behaviors, with some dogs, and especially with some inexperienced trainers (yours truly takes a bow).

Premack’s principle states that more probable behaviors will reinforce less probable behaviors. In other words, you can use an activity the dog really enjoys to reinforce something that is ho-hum. You can reinforce a sit/stay with a tug session. You can reinforce sitting politely while the leash is attached with going for a walk. Premack is all about life rewards.

Premack is often suggested in situations when a dog really, really wants to do something, so much so that they are having a hard time with self control. My dog Zani loves little kids. If I wanted to apply Premack to this situation, I could use the opportunity to visit with them (if they were interested and it was OK with mom) to reinforce her walking calmly up to them without pulling. Visiting with children could be a more potent reinforcer than a really good food treat for Zani. So the Premack principle can turn a distraction into a reinforcer. For my dog Summer, being brought close to children would be punishing. She’s nervous about them.

By the way, Premack applies to punishment, too. Many of David Premack’s experiments involved punishing a behavior by inducing the animal to subsequently perform an undesired behavior. We can think of examples of this easily in our life with dogs. If the only time we take a dog into a certain bathroom is to take a bath, and he hates baths (and we haven’t done anything to mitigate that), the behavior of walking nonchalantly into that bathroom with us will decrease.
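Here is a rough sketch of the principle as a prediction rule, covering both the reinforcement and the punishment direction. The function name `premack_effect` and all of the probabilities are made up for illustration; I have not measured anything about my dogs. And the rule only predicts. Whether reinforcement or punishment actually happened is still decided by what the behavior does in the future.

```python
def premack_effect(p_contingent: float, p_target: float) -> str:
    """Predict the effect of making one activity contingent on a target behavior.

    p_contingent: baseline probability of the activity offered as the consequence
    p_target:     baseline probability of the behavior we want to change
    """
    if p_contingent > p_target:
        return "reinforces"   # more probable activity follows less probable behavior
    if p_contingent < p_target:
        return "punishes"     # less probable activity follows more probable behavior
    return "no prediction"    # equal probabilities: the principle is silent


# Hypothetical baseline probabilities, invented for this example only.
walking_calmly = 0.3
zani_visiting_kids = 0.9     # Zani loves little kids
summer_visiting_kids = 0.05  # Summer is nervous about them

print(premack_effect(zani_visiting_kids, walking_calmly))    # reinforces
print(premack_effect(summer_visiting_kids, walking_calmly))  # punishes
```

The same consequence comes out as a reinforcer for one dog and a punisher for another, which is why the choice has to be made per dog and per situation.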

There is one obvious answer to the question posed by my title. Premack is not a good choice when the behavior is never acceptable. For instance, my young dog Clara loves to pounce on and body slam my other dogs. She would love it if I allowed that, but of course I don’t. I teach incompatible behaviors and I interrupt it. And I try to give her opportunities for very physical play with me, with some firm ground rules.

But there is another situation in which Premack is not the best choice, and it can be hard to recognize, especially for pet owners and anyone who is trying to teach their dog without an in-person expert teacher.

In my experience, Premack may not be a good choice when the desired behavior triggers stress, arousal or a strong emotional response from the dog, or if the behavior results from these conditions.

Summer waiting at the back door. What is wrong with this picture?

I think this can be an insidious problem, since behaviors and situations the dog gets really excited about are precisely what prompt people to recommend Premack. If you spend any time at all on dog training Internet discussion groups, you know that whenever someone describes something the dog is passionate about (squirrels) someone else is going to suggest using Premack. This advice comes as regular as clockwork. Give the dog contingent access to the squirrels.

I’ve gotten so I flinch every time I see those recommendations come rolling in. It may work out just fine. But the newbie trainer who is describing the problem may not have a correct assessment of the situation, and/or the skill to use the Premack reinforcer.

I can relate three personal experiences where Premack didn’t work out for me. And I mean, spectacularly didn’t work out. My own inexperience came into play in varying degrees, but that’s my point.

1. Reinforcing loose leash walking with a chance to run towards a squirrel, with my dog Summer. This was a disaster. I was brand new to training, but it seemed like such a good idea, made to order. What I didn’t know then was that Summer has a very high prey drive, is hyper vigilant, and very environmentally sensitive. I also didn’t know that I really needed to have taught her more about LLW itself (using food). But instead I jumped right into Premack. When we would see a squirrel I would require a few steps of LLW, followed by a quiet sit. Then I would release her and we would run together to the squirrel and she would lose her mind. When I got tired of circling the squirrel tree with her, I had to figure out a way to get her away. Her capability of going for a normal walk was completely gone by that point.

If you are going to allow the dog some kind of engagement with the environment as a reinforcer, I think there is a prerequisite to being able to make it work. You need a way to get them back, and it seems to me that you need to train this first. You need your dog to be able to recover from a potent emotional response fluently. These are challenging things to do, and usually not in place if you are having a big issue with distractions in the first place.

By the way, I used sniffing as a reinforcer for loose leash walking with moderate success with my dog Zani. I allowed stopping to calmly sniff as a reinforcer for walking nicely on leash. But in her case I had a little more experience than I had had when I tried it with Summer and the squirrels. I taught Zani a cue to go sniff, “Beagle!” And a cue to come back to my side, “With me!” I practiced the pair of behaviors in boring environments before taking it on the road, and I taught Zani the correct position for LLW with food to begin with.

2. Reinforcing Clara for not jumping up to lick my face by letting her lick my face on cue, with four paws on the floor. Ouch. Another newbie error on my part. It seemed like such a no brainer. I mean, if she is dying to jump up and get my face, that seems like a great candidate for Premack, right? Well in our case, wrong. I recently wrote a whole post about the face mugging problem and all the things I tried. I was well on my way to trying Premack when I thought to ask my teacher about it. She took a look at Clara, and said that her jumping up at my face did not look like a happy behavior. It was stress related. So even if I had succeeded in teaching her how to lick my face without the danger of breaking my jaw, I might have ended up with a situation like Summer at the door (see below).

3. Reinforcing sitting politely at the back door with going outside with Summer. This is a lovely method for two of my dogs, Zani and Clara. See Zani’s photo above. It is one of the most commonly recommended uses of the Premack principle in dog training. But again, it didn’t work for Summer. You would think that something she wanted so badly–to charge out into the yard checking for cats, squirrels, and other varmints–would cause a very prompt, snappy sit at the door. Not so. As you can see in the video, sometimes she can’t sit at all. And if she does sit,  she will not accept a treat. She is what is often called “over threshold.” She is anticipating what might be in the yard, and is having a big emotional response to that. She is also showing the fallout of years of conflict with me at the door. I didn’t cope with her behavior well, especially at first. I nagged her because I was completely oblivious to what was going on. I made the situation worse.

By the way, Zani is also at the door, and can be seen at 1:24 in the video in an exemplary calm sit, even though she is excited to go out, too. She is not drowned in excitement and stress hormones.

I fully acknowledge that a better trainer could have managed this situation better. She could have taught Summer first to be calm in the face of the potential excitement. Then worked up to using the Premack reinforcer when she could keep her wits about her. I should have aborted the project when my behavior was obviously stressing her out. But that’s my point. Premack is often recommended to beginners and to us non-professionals. And it can really backfire without some experienced eyes on what is happening. When I first started doing this years ago I had no idea why Summer’s sit was not more reliable. This method seemed to work for everybody else. To be perfectly frank, I read her body language as “sulky.” I thought she was being a bratty adolescent, moving slowly and giving me a dirty look because I didn’t let her out fast enough.

You might think that I would have run into a problem with her stress at the door just as badly if I had used food as the reinforcer for a calm sit. But using food defuses Summer’s overexcitement, and doesn’t feed into it. (Many trainers have noted that food tends to have a calming effect when training behaviors, as opposed to using tug or other high arousal activities.) She has practiced her frozen shutdown, then running out in a frenzy, for years now. But reinforcing a sit near the door with a high value food treat instead, and doing training sessions in this area of the house, are changing the potential reinforcement map in my favor. The excitement of the outdoors pales a little, which is good. She starts thinking of other ways she can earn the treat. Hmmm, how about reorienting to me after she goes through the door? Great!

Premack Successes

Let this post be a cautionary tale. But lest it appear that I am saying not to use Premack at all, let me mention some Premack reinforcers that have worked really well for me.

  • The two ball game: reinforcing Clara for releasing the ball by throwing another ball (this works with one ball, too, but was easier for me to teach with two)
  • Tug and flirt pole releases: reinforcing them with resumption of the game (I should mention that I don’t think I would have succeeded with this one without the help of my teacher, though)
  • Putting on the leash: gets reinforced by getting to go somewhere
  • Agility sequences: reinforced for Summer with play in the water hose
  • Loading into the car crate: getting to go somewhere
  • Getting and staying in a down when I walk in the room with something in my hands: gets reinforced by getting to sniff what is in my hands (guess who: Clara)
  • Walking nicely on leash: reinforced by opportunities for Zani to sniff
  • Most behaviors: reinforced by eating food treats. Gotcha! Eating is a behavior. So really, everything is Premack.

I’m always discovering hidden genius in the Training Levels. Sue Ailsby talks about using Premack or life rewards plenty. She seems personally to be a master at transitioning to life rewards. But she uses food first. Using doors as an example: Level 1 Sit, Step 4: Dog sits by an open door. A whole Step dedicated to using food treats to teach the dog self control around a door. Level 3 Zen: this whole Level behavior is entirely about self control around doors, and you don’t send the dog charging out as a reinforcer once! Using food can defuse the emotional potency of doors to the outside. It makes the door area just another training environment.

So now, almost 6 years into our relationship, Summer and I are spending a whole lot of time doing “silly dog tricks around doors.” To undo the problem I helped to create–with this particular dog–by trying to use the Premack principle first.

What about you all? Am I the only one who has made some poor Premack choices or implementations? And can anyone help me come up with a more general–or more specific–guideline for when Premack might not be the best idea? I don’t think I have ever seen this discussed online.

Thanks for reading!


Notes

1. *A technicality, but it’s important. Notice I haven’t said “Premack didn’t work.” That’s like saying that reinforcement didn’t work. Reinforcement is defined by its effects on future behavior. If the behavior didn’t increase, then there was no reinforcement. Likewise, you can’t say, “Premack didn’t work.” If what you tried didn’t reinforce the behavior, there was no Premack.

Get Out of My Face! Teaching an Incompatible Behavior


Ever since she arrived at my home at the age of 10 weeks, Clara has been a challenge.

One of her more problematic behaviors was her mugging of my face whenever it got within range. It happened all the time. How many times a day do you lean over your puppy, or lean over in her presence to pick up something off the floor? Most often something that she either dropped or shouldn’t have. Answer: a lot. Except not me, anymore, because she shaped me not to. If a strong, speedy puppy came barreling at your head every time you bent over, you might modify your behavior, too. So I do this embarrassing dance whenever I need to pick something up: distracting her, sneaking past, or trying to move REALLY FAST (which of course makes her all the more excited when she does catch me).

Young Clara mugging my face

I took a stab at modifying her behavior early on, but I didn’t pick a viable method. What I did was treat it rather like a combination of a desensitization exercise and proofing a stay. I would put her in a sit stay and move over her very gradually, treating each movement. Slight lean, treat. Slight knee bend, treat. I did lots of sessions of this. Way too many for the good I got out of it. And while it may have helped somewhat with her being comfortable with those movements or the proximity of my face, it didn’t even begin to address the problem. I still had a small, then medium sized (then large, I admit it) puppy coming for me at the speed of light when I bent over. Because she wasn’t already in a stay to begin with. Duh.

Also sometime during her puppyhood I had another not so bright idea. I thought, Premack! Premack’s Principle states that more probable behaviors (bumping my face) can reinforce less probable behaviors (performing a sit stay when my face is close by). If she so strongly wants to lick and nuzzle and bump my face, wouldn’t that be the ultimate reward for doing what I want first?

Does anyone see why this might not work, even if I could keep her from hurting me?

It was such a newbie error. I had never had a dog who got aroused this easily before. When your dog is excited, it is so easy to assume that she is happy. But the face licking is much more likely to be a stress and appeasement behavior.  I checked with my teacher, who knows Clara well and observed her. She said Clara did not look comfortable to her when doing the face seeking stuff. And that fits with the Clara I know, when I just stop to consider. She has a huge palette of appeasement behaviors and drops into those patterns at the drop of a hat.

So my idea was like saying to someone, “OK I see you bite your nails when you are nervous. Your reward after filling out this difficult form correctly is the opportunity to bite your nails.” OK, it might be just the thing. But a stress behavior like that has specific triggers, and is not always rewarding if those triggers aren’t there. After the form filling is done, the person may have no desire at all to bite their nails. In that case the chance to perform that behavior would not be reinforcing.

And that’s the reaction I got when I tried it with Clara. I got a good stay out of her, then knelt down and invited her to come lick my face. And got a big, fat “Huh?”

So the Premack experiment was short-lived. I should mention also that inviting a dog to come mug your face is, in many situations, not a good idea. Lots of dogs are bothered by proximity of faces, and lots of bite incidents happen to people who thought their dog was fine with that kind of thing. And in any case, even if it had worked it would have had the same problem as my desensitization approach. It didn’t address the problem directly because she was not already in a stay when my face approached.

So I quit and was basically living with it while I worked on things for which I got a better return on my time. One day I mentioned it to my teacher again while she was here at the house to work with Clara. I mentioned my gradual “stay” approach. She said she wouldn’t do it like that. Instead, why not make bending over a cue to go to her crate? And in four repetitions of “new cue/old cue” little Clara was running to her crate when Lisa bent over.

In operant learning this is called “Differential Reinforcement of an Incompatible Behavior,” or DRI. It’s a widely used technique to get an animal (including a person) to stop doing something by making an incompatible behavior pay off really, really well. Clara cannot go straight to her crate and stay there and simultaneously leap up and mug a face.

Yargh, why didn’t I think of that? I said some rude things out of frustration if I recall.

But even then it didn’t make it to the top of my priority list. I played with it a couple of times, considering making bending over be a cue for crate or go to mat, but it never got off the ground.

Clara still mugging my face

But I train Sue Ailsby’s Training Levels and one day there it was. Level 2 Down, Step 5. Teaching default cues. Is there a situation in which you would always like the dog automatically to lie down? Sue describes teaching a default down and stay when putting food dishes down, when meeting children or old people, or even when talking on the telephone.

Where do you need Level 2 Down? And the answer was obvious. Every time I lean over. I won’t always have a crate for her to go into, or a mat for her to get on. But by golly she can virtually always lie down. This finally gave me the incentive to do something about the behavior. So I used the New Cue/Old Cue method, as Lisa had done with the crate, and had the basic behavior in four iterations. (I think it went so quickly because it is much faster for a dog to go from a verbal to a body cue than the other way around.) After that it was just reminding her and expanding it into more difficult situations.

There are a few real life ramifications of my body cue for Clara’s down, and for once I may have thought them through. Mostly that if leaning over is a cue for down, I need to keep that in mind when practicing other behaviors, especially duration behaviors. If I have put her in a sit/stay and then lean over her, I have given her two conflicting cues. I can train her which one takes priority, but for now I’ll probably avoid that situation, while I’m strengthening the default down. If I were planning competition obedience with her or some other precise work where the difference between the two behaviors was crucial, I would need to choose another solution or else pay some keen attention to the discrimination/priority of the cues. But basically right now it is a very high priority to get her out of my face.

Anybody else have unusual cues for default behaviors? I’d love to hear about them.


Thanks for reading!

Visit eileenanddogs on YouTube

Copyright Eileen Anderson 2013
