Leaving the Scene: Clarifying the Science of Negative Reinforcement
Negative reinforcement is really, really easy to get mixed up about. Recently I read something that quite bothered me until I did a little research and figured it out. I’d like to share what I learned with you. What I’m talking about is this:
When I take my dog away from the thing he is concerned about, I am adding distance. Therefore this is positive reinforcement.
This is contrary to what you would read in most learning theory books, but it has its own seductive logic. We are primed to associate the word “add” with positive reinforcement (or positive punishment). I love puzzles and problems, so I decided to do my best to tease out what, exactly, the problem with this statement is.
I actually found two problems:
- One is the confusion between positive and negative reinforcement.
- The other is the focus on the word “distance” rather than the aversive thing itself.
Positive and Negative Reinforcement in Training
First, a review of definitions.
“In positive reinforcement, the consequence of a behavior is the appearance of, or an increase in the intensity of, a stimulus. This stimulus, called a positive reinforcer, is ordinarily something the individual seeks out.” –Paul Chance, Learning and Behavior, 7th edition
The stimulus can be lots of things: an object (food or a toy) or an event (a door opening). The dog learns that performing a certain behavior (e.g. sitting) makes this thing available. If it is a desirable thing in that context, the dog sits more often.
“In negative reinforcement, a behavior is strengthened by the removal, or a decrease in the intensity of, a stimulus. This stimulus, called a negative reinforcer, is ordinarily something the individual tries to escape or avoid.” –Paul Chance, Learning and Behavior, 7th edition
Again, it can be an object (snake) or event (fire alarm ringing). The dog learns that performing a certain behavior makes the thing stop or retreat, or lets him get away from it. If the thing is aversive for the dog in that context, the behavior that makes it go away will increase.
The positive and negative reinforcement processes are pretty different, but there are a handful of situations where it may be hard to tell one from the other. Usually you can sort it out by identifying the antecedent. If the antecedent is the presence of an aversive stimulus, and a behavior is increasing, you’ve got negative reinforcement.
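For readers who like to see the bookkeeping spelled out, here is a minimal sketch in Python of how the labels fall out of the definitions above. The names (`StimulusChange`, `BehaviorTrend`, `classify`) are made up for illustration and don't come from any textbook or library. The point is that the label depends on what happens to the stimulus itself and on what happens to the behavior afterward.

```python
from enum import Enum


class StimulusChange(Enum):
    # Hypothetical labels for what happens to the stimulus after the behavior.
    ADDED = "appears or increases in intensity"
    REMOVED = "is removed or decreases in intensity"


class BehaviorTrend(Enum):
    # Hypothetical labels for what happens to the behavior in the future.
    INCREASES = "behavior becomes more frequent"
    DECREASES = "behavior becomes less frequent"


def classify(stimulus_change: StimulusChange, behavior_trend: BehaviorTrend) -> str:
    """Name the process from the stimulus change and the behavior trend."""
    sign = "positive" if stimulus_change is StimulusChange.ADDED else "negative"
    process = "reinforcement" if behavior_trend is BehaviorTrend.INCREASES else "punishment"
    return f"{sign} {process}"


# Dog sits, food appears, sitting increases.
print(classify(StimulusChange.ADDED, BehaviorTrend.INCREASES))

# Dog moves away, the scary thing recedes, moving away increases.
print(classify(StimulusChange.REMOVED, BehaviorTrend.INCREASES))
```

The first call prints "positive reinforcement" and the second prints "negative reinforcement". Real situations are messier than a two-by-two lookup, of course, so treat this as a memory aid and not a diagnostic tool.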
Some people have claimed that there is not a large difference between the two types of reinforcement,** and others use that point of view to excuse the use of aversive stimuli in training, or do mental gymnastics to convert such use into positive reinforcement. Luckily, a well-known behaviorist has tackled those claims.
Here is one of the “strained” examples. It has been argued that turning off an electric shock in response to an animal’s behavior is actually “adding a shock-free environment” (and that the adding makes it positive reinforcement). That argument has been neatly dismantled by Dr. Murray Sidman.
He reminds us that a positive reinforcer must be something the animal is willing to work (perform behavior) to get. And if you take the bad thing (in this case, shock) out of the picture, a “shock free” environment is meaningless and can’t even be defined, much less worked for.
Let’s apply that method as a litmus test to some other examples. Let’s take out the “icky” thing and see what we have left.
These examples involve either negative or positive reinforcement or both.
• First, food. If you remove the drive of hunger (relieving the state of hunger can be negative reinforcement), is food a positive reinforcer? Yes. Anyone who owns a dog or eats dessert knows that. And I’ve written a whole post about it. You don’t have to be hungry to enjoy and be willing to work for food. Eating food can be both negative and positive reinforcement. (But as I pointed out in my previous post, experiments indicate that the positive reinforcement process is more powerful.)
• Now, how about using an umbrella to protect oneself from the rain? There is an unpleasant condition (getting rained on), you perform the behavior of obtaining an umbrella and opening it over your head, and you escape the rain. This is in most textbooks as a classic example of negative reinforcement. What happens if we say that using the umbrella is really positive reinforcement because you are adding the state of “freedom from rain” or even “dryness”? Let’s follow Dr. Sidman’s lead and take away the rain or other unpleasant weather. Would the behavior of opening an umbrella over your head get reinforced by the “addition” of a dry condition? No. Carrying around and opening an umbrella is a tiresome, expensive behavior. We wouldn’t do it to add something so ill-defined and meaningless.
• So now to tackle the scenario at hand: escaping scary things. Let’s say you are phobic of scorpions. If you accidentally get close to one, it is a great relief to leave the area and go somewhere that you believe is scorpion-free. You escape the scorpion. Now, remove scorpions from the picture. Completely. Not just that scorpion, or even scorpions in general, but the threat or mildest hint of scorpions. They don’t exist. What does a scorpion-free environment look like? Well, anything, right? As long as there are no scorpions. And is it a positive reinforcer? Well, for starters, we can’t even describe it. Anytime you start thinking of the environment you ran to as reinforcing, it’s because you are comparing it to one with scorpions or some other scary thing.
The Huge Variety of “Scorpion-Free” Environments
Your scorpion-free environment could range from freezing to firestorms to 200 mph winds to vacuum to a 70-degree Sunday afternoon, and all could be scorpion free. That makes the “scorpion-free” environment impossible to pin down or define.
Not only that, but Sidman points out that the environment could be changing. As long as it doesn’t have a scorpion in it, it qualifies as scorpion free. But reinforcers generally need to sit still. A piece of meat doesn’t usually morph into a paperclip, nor does a book turn into a clock. If they did, they would lose their reinforcing qualities for hungry people or bookworms respectively.
Reinforcers are definable and describable. I don’t believe a “scorpion-free environment” is. Compare that to the simple description of a scorpion (ick). That’s very concrete. And to the clear action of becoming aware of the scorpion and getting away from it. The scorpion is extremely well defined. And many people (including me) can’t get away from them fast enough.
Can Distance Itself be a Reinforcer or Punisher?
Here is the second problem. Even if you are in the camp that believes that any negative reinforcement situation can be equally argued to be positive, there is still a big problem. Most of us have learned that using the word “add” is an indicator that positive reinforcement (or punishment) is at play. Something is added to the environment after a behavior, and the behavior then increases (or decreases) in the future.
So at first reading, the idea of adding distance or space from something scary sounds at least similar to positive reinforcement. You are adding something (maybe). But let’s go back to Sidman’s exercise. If you take the aversive, scary thing out, what you are “adding” is nothing at all. If you are standing at the 50-yard line in a football field and move to the 20-yard line, what have you added? The distance or space is meaningless except as an escape from the aversive.
So here’s the important part. It is not distance that is being added or removed. Distance is an abstraction, not a stimulus. It is the aversive or reinforcer that is being added or removed. Distance is merely a description of the escape (or approach) process. The aversive is the scary monster. Only it can be removed or added. Controlling one’s distance from it is just a way of describing the mechanism of the appearance/disappearance of the aversive.
Also, it is a trick of wording. If you wanted, instead of “adding” distance you could say you were “removing” proximity. Focusing on distance is a red herring. And it neatly removes the real aversive from the picture.
I wrote the following in jest, but the more I think about it, the more I realize that what I have written for the other quadrants is exactly parallel to the claims about adding distance being positive reinforcement. It’s just that the other quadrants are easier to understand, so it’s easier to be aware of flawed logic.
These are the results if we treat distance as the focus. They are obviously untrue. That’s because the aversives and reinforcers are not “distance.” They are the scary monster, the cookie, and the stick. If you talk about “adding distance” what happens to the actual thing we are trying to get away from? It falls out of the equation.
So did this help at all? Does it make sense? Hope so!
** I should mention that there has been a movement for some time to do away with the distinctions between positive and negative reinforcement and positive and negative punishment. This is mostly because of the few challenging cases, and because the processes of reinforcement and punishment are difficult enough to discuss and explain without the plusses and minuses. But it is not usually argued that there is no difference between adding and subtracting at all. I’m working on a summary of that research, so stay tuned. In the meantime, you will still see the plusses and minuses in any learning theory book.