eileenanddogs

Category: Operant conditioning

The “Invention” of Cues in Training

Hat made out of folded newspaper

Once upon a time, there was a girl who decided to teach her dog some tricks. She figured out that if she gave her dog something he liked after he did something she liked, he was liable to do the thing again. So she taught him some simple tricks using food and play as reinforcement.   

As she went along, her dog started to find the training games lots of fun in and of themselves. But she still used food and play. He liked earning his “pay” and she liked giving it to him. She didn’t see any reason to stop.

This girl was unusual in that she didn’t try to tell her dog what to do in words. She realized what is not obvious to so many of us: he didn’t speak English. Things worked out just fine because he could generally discern from context and her gestures what she wanted to work on.

She used a little platform to teach him to pivot in a circle. He would put his front feet on the platform and walk around with his back feet and rotate. He got good at this and soon could spin in both directions. As soon as he saw the platform he would run over to it and start to pivot, although she could ask him to stop with a hand signal.

Corrections Are Punishment (If They Work)

Correction is a term used in certain segments of the dog training world. It commonly applies to jerking the dog’s leash (also called a “leash correction”). Sometimes “correction” refers to other physical things people might do to a dog.

Trainers who use corrections do such things when a dog is performing an undesirable behavior. For example, they will perform a “leash correction” when a dog is pulling on the leash, is in the wrong position, or is not focused on the handler. The magnitude of a leash correction can range from a twitch of the leash to jerking hard enough to lift the dog partially off the ground or knock him off balance.

A Quadrant by Any Other Name is Still a Cornerstone of Operant Learning

This 2003 edition book is $4.89 on Amazon. Contents: priceless.

There is a science that deals directly with how organisms learn and how to use that information to change the environment in order to change behavior. It’s called applied behavior analysis (ABA). It is the applied version of behavior analysis, which was referred to as the experimental analysis of behavior earlier in the 20th century. It is descended from the work of behaviorists such as Skinner and is a sub-discipline of psychology.

It is a rich field of study. Universities offer graduate degrees. At the same time, it is approachable. Many of the entry-level ABA college textbooks currently in use are readable to someone with a strong high school education and certainly to someone with a college education. They are generally self-contained, in that you can work through them without much previous exposure to the terminology. The books contain fascinating information about what makes us tick, why we do what we do, and how we might go about changing behavior if we needed to. They also teach skills in ethics and kindness.

Because they are written by experts in learning, the texts are generally well organized, interesting, and approachable. A sidebar in Paul Chance’s Learning and Behavior starts off, “What would you do if, while camping miles from the nearest hospital, you were bitten by a poisonous snake?” It goes on to discuss superstitious behavior. Other sidebars are titled “Punks and Skinheads,”  “Variable Ratio Harassment,” and “Learning from Lepers.” I’ll leave you to go find out the subject matter. This topic is a goldmine for the curious. It is relevant to everyday life and can teach knowledge and skills that are very practical. If you buy older editions of textbooks, as I usually do, the prices are quite reasonable. (For instance, here’s a link to Paul Chance’s Learning and Behavior, with the oldest editions first. You can scroll forward to newer editions as your pocketbook allows. The most recent edition is 2013.)

Like any field of study, ABA has its own terminology. When we first encounter it, two things typically happen. First, we think we know it already. Who doesn’t know what punishment is, right? Motivating operation—doesn’t sound too hard to figure out! Then we go a little deeper, and even though the words are familiar, the concepts may not be. Some are extremely unfamiliar. That can cause dismay. One of the problems in the dog training world is that a lot of people get stuck at that point.

Herrnstein’s Matching Law and Reinforcement Schedules

Chocolate cookies on a cookie sheet. The baker may do other activities while the cookies are baking as long as she shows up at the right time. Her behavior follows the matching law.
When we bake cookies, some reinforcement is on a variable interval schedule.

Have you heard trainers talking about the matching law? This post covers a bit of its history and the nuts and bolts of what it is about. I am providing this rather technical article because I want something to link to in some other written pieces about how the matching law has affected my own training of my dogs.

Speeding Tickets: Negative or Positive Punishment?

Speeding tickets are commonly used as an example in learning theory textbooks. But I’m going to disagree with the typical classification because of my own experience. Here’s a true story.

When I was about 20, I was driving in my hometown. I was home from college and driving down my own street. I think I was going about 45. I think the speed limit was 35. I don’t remember why I was speeding. I didn’t commonly drive fast. But that day I did.

Actually, I **Can** Get My Dogs’ Attention

I was thinking the other day about how and why I have a dream relationship with my dogs. They are cooperative. They are sweet. They are responsive and easy to live with. You know how I got there? Training and conditioning them with food and playing with them.

They weren’t the most difficult dogs in the world when they came to me, but they weren’t easy, either. Clara was a feral puppy who was growling at every human but me when she was 10 weeks old. Zani is so soft and sensitive that she would have been considered “untrainable” by many old-fashioned trainers. Plus she’s a hound, and you know you can’t get their attention when there is a scent around.

Yeah, actually you can.

It’s Not Painful. It’s Not Scary. It Just Gets the Dog’s Attention!

This is the short version of this post. Here is the longer version.

Some dog trainers who use tools such as shock, prong, or choke collars, or startle the dog with thrown objects or loud noises, claim that these things are done only for the purpose of “getting the dog’s attention.” They may further insist that the dog is not hurt, bothered, or scared.

Others, while well meaning, use a special sound or a “No!” to get their dogs to stop doing something. Not the worst thing in the world, but these people will try to argue you to the ground, insisting that the noise or word is “neutral.” They’ll say that it doesn’t carry any aversive effect, that it “just gets the dog’s attention.”

If only! This sounds like the Holy Grail of dog training. It’s the Magical Attention Signal! It can get your dog’s attention, get him to do something, or stop doing something, all rolled into one. You don’t have to use those pesky treats or toys, and it certainly doesn’t hurt or bother the dog!

Gosh, who wouldn’t want that? Life would be so easy with the Magical Attention Signal!

Unfortunately, the Magical Attention Signal is utter nonsense.

I have another version of this post in which I analyze the possibilities of the so-called Magical Attention Signal using learning theory. Feel free to check it out. Or read forward and get the story through some straightforward analogies.

Glumph

Imagine that you and I don’t share a common language or culture. But a friend in common has dropped you off to stay at my house for an afternoon.

You are looking around the house. You come into the bedroom and start looking through my jewelry box. I look up and casually say, “Glumph.” In my language, that means, “Please don’t bother my stuff; why don’t you go look around in the next room.” But you don’t know that. It was just a nonsense sound to you, so you keep looking through the jewelry. “Glumph” perhaps got your attention for a moment, but nothing else happened. It was a neutral stimulus. Now here’s where it gets interesting. What happens next?

Scenario 1: The “Neutral” Attention Signal

So what if nothing else happens besides my saying, “Glumph” every so often? If the jewelry (or my mail, or my wallet) is interesting, “Glumph” will not get your attention. In fact, the more I say it (staying in a neutral tone), the more it becomes part of the background. You habituate to it, and it loses even the tiny bit of attention-getting power it may have had at the beginning through novelty.

Outcome: “Glumph” is a neutral stimulus and doesn’t work to get attention.

Scenario 2: The Raised Voice

This is one of the likelier scenarios. After my first statement of “Glumph,” I say it again, but this time I raise my voice. I really need to interrupt you from going through my things! This time you are startled and you stop. Oops, the host is mad!

“Glumph” is now more effective. But how is it operating? It is interrupting you either because it is intrinsically startling, or because you know that yelling humans are more likely to harm you.

Outcome: “Glumph” is an interrupter operating through fear or threats.

Scenario 3: Taking Action

This is the most common scenario in dog training. What do I do after I say “Glumph” conversationally and you don’t stop what you are doing? I yell “Glumph,” jump up, and physically stop you from going through my jewelry. I might do this in a number of ways. Even though I’m upset, I might take you very gently away from my jewelry. Or I could do something less gentle. I could grab your hands or whack them. I could close the lid on your fingers. I could yell in your face. I could push you away. I could hit you.

So what does “Glumph” mean now? You will likely pay attention the next time I say or yell it. Because it means at the very least (the gentle scenario) you are going to lose access to the thing you are enjoying. But most likely you will have learned that my yelling “Glumph!” is a precursor to something unpleasant happening to you.

“Glumph” has become a punishment marker, and can operate as a threat.

A neutral stimulus by itself has no power, and the dog will habituate to it. If a word or noise works reliably to stop behaviors, it is not a neutral stimulus. It doesn’t just “get the dog’s attention” in a neutral way. It works because it is either intrinsically unpleasant or predicts unpleasantness.

Outcome: “Glumph” scares the dog or predicts something painful, scary, or otherwise unpleasant.

But Wait: There are Positive Interrupters!

Yes, thank goodness. There is a positive reinforcement based method for getting your dog to stop doing stuff. You can condition a positive interrupter.

Here’s a video by Emily Larlham that shows how to train a positive interrupter. Here’s a post about how I conditioned yelling at my dogs to be a positive thing for them—and it ended up having a similar effect.

But the thing is, the people who have conditioned a positive interrupter will tell you so. They can tell you the systematic process they went through to create it. They created it before they ever used it, not in the middle of difficult situations. They will emphatically not claim that their cue is a “neutral, attention-getting stimulus.” They know better. They implemented positive reinforcement.

No Magical Attention Signal

If someone says that Tool or Method A, B, or C is designed to “get the dog’s attention,” ask what happens next. Once they get the dog’s attention, how do they actually get the dog to do something or stop doing something? Also, ask them what happens if the first implementation of the tool fails to get the dog’s attention.

Many promoters of aversive methods in dog training don’t want to say that they hurt or scare or startle or nag or bully dogs. And our mythology about dogs is so strong that most of us want to believe them. Hence, the lure of the magic signal that works all by itself, with no other consequences. I hope this post will bolster your “nonsense detector.” Behavior is driven by consequences. If no change in consequences occurs, there is no reason for a behavior to change.

A woman with her back partially to the camera is sitting on a lawn. There is a wooden fence in the background. Three dogs are lying down nearby, all looking into her eyes.
Attention in the backyard, achieved with positive reinforcement

Copyright 2017, 2018 Eileen Anderson

Don’t Be Callous: How Punishment Can Go Wrong

This post includes discussion of animal experimentation from the 1950s and 1960s using shock. It is unpleasant to contemplate. But to me, it makes it even worse that the knowledge gained by those studies is not widely known. Studying that literature gives one a window on how punishment works. I hope you will read on.

The studies I cite are all included in current behavior science textbooks, and my descriptions are in accord with the textbooks’ conclusions. The conclusions are different from the common assumptions about punishment. 

Graph shows typical response to mild-to-moderate punishment. X axis represents sessions over time. Y axis is the suppression ratio. There is a drop in the behavior immediately after the aversive is applied, but the behavior gradually returns to its former level.
This is a typical response to application of a mild-to-moderate aversive. I created this graph because 1) I don’t have rights to the ones in textbooks, and 2) standard behavior change graphs are difficult to interpret if you are unfamiliar with them. I made a different type of graph, but what I have represented is the same response you see in the textbooks and research papers. The X-axis represents sessions over time. The Y-axis shows the ratio of behavioral decrease. The shape of the graph roughly correlates to the frequency of the behavior and shows that the suppression of behavior was only temporary.

I’ve written a lot about making humane choices in training and about the fallout that accompanies aversive methods. But the immediate risk of hurting, scaring, or bothering your dog is not the only problem with using aversives. It turns out that using positive punishment is tricky.

In the term positive punishment, positive doesn’t mean “good” or “upbeat.” In behavior science, it means the type of punishment in which something is added and a behavior decreases. The added thing is something the animal wants to avoid. If every time your dog sat you shocked her, played a painfully loud noise, or threw something at her, your dog would likely not sit as often.  Those things I mentioned would act as “aversive stimuli.” If the dog sat less after that, then punishment would have occurred.

There is another type of punishment called negative punishment. It consists of removing something the dog wants when they do something undesirable. I’m not discussing that type of punishment in this post. For the rest of the post, when I refer to punishment, I am referring to positive punishment.
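
For readers who like to see definitions spelled out mechanically, here is a minimal Python sketch of how these labels are assigned. The names `StimulusChange`, `BehaviorTrend`, and `classify` are made up for illustration; they are not terminology from any textbook or library. The two reinforcement labels are included only to complete the picture.

```python
from enum import Enum


class StimulusChange(Enum):
    ADDED = "added"      # something is added after the behavior
    REMOVED = "removed"  # something is taken away after the behavior


class BehaviorTrend(Enum):
    INCREASES = "increases"  # the behavior happens more often afterward
    DECREASES = "decreases"  # the behavior happens less often afterward


def classify(change: StimulusChange, trend: BehaviorTrend) -> str:
    """Name the operant process from what was done and what the behavior did.

    "Positive" and "negative" describe adding or removing a stimulus.
    "Reinforcement" and "punishment" describe the effect on the behavior.
    """
    sign = "positive" if change is StimulusChange.ADDED else "negative"
    process = "reinforcement" if trend is BehaviorTrend.INCREASES else "punishment"
    return f"{sign} {process}"


# Hypothetical example: an aversive is added when the dog sits, and she sits less.
print(classify(StimulusChange.ADDED, BehaviorTrend.DECREASES))    # positive punishment
# Hypothetical example: something the dog wants is removed, and the behavior drops.
print(classify(StimulusChange.REMOVED, BehaviorTrend.DECREASES))  # negative punishment
```

Note that the label depends on the observed change in behavior. If the behavior does not decrease, punishment has not occurred, whatever the human intended.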

The Punishment Callus

Some trainers and behavior professionals warn about something called the punishment callus. A punishment callus is not a physical callus. It is one name for the way that animals (including humans) can develop a tolerance for an aversive stimulus. When that tolerance is developed, that stimulus does not decrease behavior. It is not an effective punisher. The animal has become habituated to punishment.

This is not just a piece of folklore. It has been demonstrated repeatedly in studies, and it happens way more often than we realize in real life. I’m going to describe some of the research.

Reinforcement First

The first thing that happens in most punishment experiments is that the animal is taught a behavior using positive reinforcement. The pigeon learns to peck a disk to get some grain. The rat learns to press a lever or run down a chute to get food. There will be dozens, hundreds, or even thousands of repetitions. Then, after the behavior is strong, the researchers introduce punishment. This is usually in the form of shock. The shock is generally contingent on the animal touching the food or performing the behavior that gets access to the food.

At first glance, this seems weird, not to mention wildly unfair. Why would they be starting off a punishment study with reinforcement? Then why would they punish the same behavior?

Think about it a little and it makes sense. You can’t use punishment if you don’t have a behavior to punish. Reinforcement is what makes behaviors robust. You can’t measure the effects of unpleasant stimuli on a behavior unless you have a strong, consistent behavior to begin with.

In some studies, they cease the reinforcement after the punishment starts. In others, the reinforcement continues. In these experiments, the animals get shocked for trying to get their food in the same way they learned to get it through many repetitions of positive reinforcement.

But this is not at all unique to lab experiments. A hard lesson here is that we do the same thing when we set out to punish a behavior. Animals behave because they get something of value (or are able to escape something icky). The behavior that the dog is performing that annoys us is there because it has been reinforced. It didn’t just appear out of the blue. So if we start to punish it, the animal is going to go through the same experience that the lab animals did. “Wait! This used to get me good stuff. Now something bad happens!” And punishment and reinforcement may happen together in real life, just as in some of the studies.

How We Imagine Punishment to Work

I think most of us have an image of punishment that goes something like this:

The dog has developed a behavior we find annoying. Let’s say he’s knocking over the trash can and going through the trash. The next time Fido does that, we catch him in the act. We sternly tell him, “No! Bad dog!” Or we hit him or throw something. (I hope it’s obvious I’m not recommending this.) The next time he does it, we do the same thing. In our minds, we have addressed the problem. In our mental image, the dog doesn’t do it anymore.

But. It. Doesn’t. Work. That. Way.

Real life and science agree on this. It’s much harder than that to get rid of a reinforced behavior.

Punishment Intensity

Many studies show that the effectiveness of a punishing stimulus correlates with its intensity (Boe and Church 1967). The higher the intensity, the more the behavior decreases. Very high-intensity punishment correlates with long-term suppression.

Skinner was one of the first to discover that low-intensity punishment was ineffective. He taught rats to press a bar to get food. Then he discontinued the food and started to slap the rats’ paws when they pressed the bar. For about a day, the rats whose paws got slapped pressed the bar less than a control group. Then they caught up. Even though they were getting slapped, they pressed the bar just as often as the control rats (Skinner 1938). Other early punishment studies also used mild punishment, and for a while, it was assumed that all effects of punishment were very temporary (Skinner 1953). This was determined to be incorrect in later studies with higher intensity aversives.

Dog owners who try to use low-level punishment are faced with an immediate problem. Ironically, this situation usually comes from a desire to be kind. Many people do not feel comfortable doing anything to hurt or startle their dogs, but these are the methods they have been told to use. So they figure that they should start with a very low-intensity action. They’ll yell just loud enough to get the dog to stop. They’ll jerk the dog’s collar just enough to interrupt the pulling on leash. They’ll set the shock collar to the lowest setting.

But if a behavior is valuable enough to a dog (i.e., it gets reliably reinforced), a mild punishment will barely put a dent in it. It may interrupt the behavior at the moment and suppress it for a short time, and people are fooled into thinking it will continue to be effective. But it almost certainly won’t.

So the next thing the humans do when the dog performs the behavior is to raise the level of the punishment a bit. They yell louder, jerk harder, or turn up the dial on the shock collar.

Lather, rinse, repeat. If this pattern continues, the humans are successfully performing desensitization to punishment. The desensitization can continue up to extremely high levels of punishment. That is the punishment callus, and it has been excruciatingly well documented in the literature.

Miller’s Rats

In one study (Miller 1960), hungry rats were trained to run down a walled alleyway to get a moist pellet of food at the other end. The rats repeated this behavior many times as they got acclimated to the setup. Each rat’s speed of running down the alley was recorded as they gained fluency. The behavior of running down the alley was reinforced by access to food. This continued (without punishment) until the researchers determined that the rats had reached their maximum speed.

A shock mechanism was then initiated so the rats’ feet would get shocked when they touched the moist food. The rats were divided into two groups. They were referred to as the Gradual group and the Sudden group, indicating the way the shock was introduced. The Gradual group started with a shock of 125 Volts, which caused virtually no change in behavior. The shock was raised in each subsequent session. The rats’ speed slowed down somewhat each time the shock was raised. Then it recovered and leveled off as they got accustomed to the new intensity. The shock was raised in nine increments up to 335 Volts.

The rats in the Sudden group didn’t experience the gradual shocks. Their first introduction to the shock was at 335 Volts. Their movement down the alley slowed drastically. Often they would not touch the food.

In the last 140 trials (5 trials each for 28 rats total) the results were telling. Out of 70 trials at 335 Volts for the rats in the Gradual group, only 3 trials resulted in the rat not going all the way to the food. In the Sudden group at the same voltage, 43 trials, more than half, resulted in the rat not going all the way to the food.
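
To make the size of that difference concrete, here is a small Python sketch that turns the trial counts above into proportions. The function name `refusal_rate` is made up for illustration. The 70 trials for the Sudden group are inferred by subtracting the Gradual group's 70 from the 140 total.

```python
def refusal_rate(refusals: int, trials: int) -> float:
    """Fraction of trials in which the rat did not go all the way to the food."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    return refusals / trials


TOTAL_TRIALS = 28 * 5                           # 28 rats, 5 final trials each
GRADUAL_TRIALS = 70                             # stated in the text
SUDDEN_TRIALS = TOTAL_TRIALS - GRADUAL_TRIALS   # inferred: the remaining 70

gradual = refusal_rate(3, GRADUAL_TRIALS)
sudden = refusal_rate(43, SUDDEN_TRIALS)

print(f"Gradual group: {gradual:.1%} of trials stopped short of the food")  # 4.3%
print(f"Sudden group:  {sudden:.1%} of trials stopped short of the food")   # 61.4%
print(f"Sudden-group trials stopped short about {sudden / gradual:.0f} times as often")  # 14
```

At the same voltage, trials in the Sudden group ended short of the food roughly 14 times as often as trials in the Gradual group.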

To repeat: These two groups of rats responded differently to shocks of the same high voltage due to how the shock was introduced.

Now take careful note of the differences in their behavior:

The [subjects] in the Gradual group flinched and sometimes squealed but remained at the goal and continued to eat. Those in the Sudden group seemed much more disturbed, lurching violently back, running away and crouching a distance from the goal (Miller 1960).

There’s the clincher. At 335 Volts, some rats were still approaching the food and eating while getting shocked. In other words, those behaviors were not effectively punished. For the other rats, the behaviors were definitely punished–and the rats were traumatized.

So there you have it. Two of the most common outcomes of using punishment are:

  • a spiral of ever-increasing punishment intensity that the animal learns to tolerate; or
  • a shut-down animal.

This information has been available for 50 years. Yet aversive techniques are still casually recommended to pet owners with no education in behavior science, no exposure to the mechanical skills involved, and most important, no clue of the harm to the animal.

The Resilience of Behavior

One of the things I finally “got” about punishment as I studied the graphs in these studies is that complete cessation of a behavior is rare. Again, our mental image of the results of punishment is incorrect. In the Miller experiment, the traumatized rats in the Sudden group did sometimes approach and eat the food despite intense punishment. The rats in the Gradual group consistently did so.

The rats in the Gradual group correspond to dogs who are trained with gradually increasing punishment. They acclimate and the behavior continues. They get a punishment callus. The rats in the Sudden group probably resemble the heavily punished dogs I describe in my post Shut-Down Dogs, Part 2. 

One more thing about the graphs. When punishment is initiated or taken to a higher level, there is an immediate drop-off in behavior. It’s usually of short duration. The rate of behavior generally rises back up again.  This is what I modeled in the diagram above. You can see a bunch of these graphs in the Azrin study linked below.

Increasing the punishment intensity seems to have the same general effect as the initial addition of punishment. In both instances, the new punishment intensity produces a large suppression at the moment of changeover, with substantial recovery after continued exposure to this new intensity. Only at severe intensities of punishment has further increase failed to produce an abrupt decrease in responding (Azrin 1960).

One of the tragedies of this pattern in dog training is that the drop-off causes the human to believe the punishment is working. Raising the level of the punishment is reinforcing to the human.

The deliberate use of positive punishment as a training method is already ruled out of consideration for most positive reinforcement-based trainers. This is because of humane concerns and punishment’s known fallout. But I believe it is also important for us to know how difficult it would be to use effectively and that it does not work the way most of us imagine it to. We can see habituation to punishment all around us once we learn of its existence. My takeaway from the studies is how much better and more straightforward it is to build behavior in our pets than to try to squash it down.

Note: Please don’t quote this article to claim “punishment doesn’t work.” High-intensity punishment does work. But it has unacceptable side effects that can destroy our dogs’ happiness and wellbeing, not to mention their bonds with us.

References

Azrin, N. H. (1960). Effects of punishment intensity during variable-interval reinforcement. Journal of the Experimental Analysis of Behavior, 3(2), 123-142.

Boe, E. E., & Church, R. M. (1967). Permanent effects of punishment during extinction. Journal of Comparative and Physiological Psychology, 63(3), 486-492.

Miller, N. E. (1960). Learning resistance to pain and fear: Effects of overlearning, exposure, and rewarded exposure in context. Journal of Experimental Psychology, 60(3), 137-145.

Skinner, B. F. (1938). The behavior of organisms: An experimental analysis. New York: Appleton-Century.

Skinner, B. F. (1953). Science and human behavior. Simon and Schuster.

Copyright 2016 Eileen Anderson

Not All “Choices” Are Equal (Choice: Part 1)

Two paths diverging
Image credit: Wikimedia Commons

Shout-outs to Companion Animal Psychology for the post “The Right to Walk Away,” which covers the effects of offering that particular choice in animal experiments and encourages us to apply the concept to our animals’ lives. Also to Yvette Van Veen for her piece, “A” Sucks “B” Stinks What Kind of Choice is That?, which definitely has some “rant” commonalities with this post of mine.

This is part 1 of a 2-part series. Part 2 is: The Dog’s Choice.

We positive reinforcement-based trainers often point out that our dogs have the choice not to participate in a training session. I think giving the animal “the right to walk away” is a good and humane practice. I also believe it’s only the first step of consideration of our animals’ self-determination.

Trainers who exclusively use aversives to train employ the language of choice as well. Shock trainers will say that the dog “is in control of the shock” and that the dog has a choice. In that case the choice is to comply–or not. Neither of the choices yields positive reinforcement. But these trainers too can honestly claim their dogs have choices.

Most of us would say that theirs is a pretty strained use of the term, “choice.” It’s a very stacked deck, and even the best option–successful avoidance–is not a fun one for the dog. But using the definitions of learning theory, neither of those situations–the positive reinforcement-based trainer giving the dog the right to leave, nor the shock-only trainer–would qualify as giving the animal a “free choice.” 

I’m going to argue here that limiting choices is intrinsic to the process of training an animal, whatever method we use. It’s the nature of the process. And it’s actually not “choices” or “no choices” that define a method’s humaneness.  It’s what kinds of choices are available within the structure we set up that determines how humane it is. 

We all stack the deck.

When anyone talks about giving their animal choices, I believe we need to ask questions.

  • What can the animal choose between?
  • What processes of learning are involved?
  • Is an aversive stimulus a focal point of the choice making?
  • What choices are ruled out?
  • Will the choices broaden later in training?

Not all choice situations are equal, and I think we need to knock off the instant happy dances anytime a person mentions “choice” in reference to training. Instead, I think we should ask, “What are the choices?”

How Much Choice Are We Giving?

How many times have you read one of the following instructions in a positive reinforcement group or forum? They are often addressed to new trainers, or trainers with puppies.

  • Be sure and begin your training in an area of low distraction.
  • Control other possible reinforcers.
  • If you can’t get the dog’s attention, start in the bathroom with the door closed and wait him out.
  • Don’t let the dog practice undesirable behaviors.
  • Watch out for bootleg reinforcers!

All of those are about limiting choices by removing the availability of reinforcers. We need to acknowledge the ways in which we do that. But there is no contradiction here. As trainers using primarily positive reinforcement, we are in the best position to look at the ways that this kind of choice management affects our dogs’ lives, and examine the ways we can move forward to a more choice-rich environment for them.

The Desirability of Choice

Many experiments have shown that animals and humans prefer having multiple paths to a reinforcer, and of course options for different reinforcers as well.

This is from a webpage that describes one of the important experiments with animals regarding choice. The experiment introduced some interesting nomenclature.

The classic experiment on preference for free choice was done by A. Charles Catania and Terje Sagvolden and published in 1980 in the Journal of the Experimental Analysis of Behavior, “Preference for Free Choice Over Forced Choice in Pigeons.”

The design was simple. In the first stage of each trial, pigeons could peck one of two keys. One key produced a “free choice” situation in which the pigeon saw a row of four keys: three green and one red. Pecks on the other key produced a “forced-choice” situation in which the pigeon saw one green key and three red keys. In either situation, pecking a green key produced food. Pecking a red key produced nothing. The arrangement of the colors varied from trial to trial.

Even though all the pigeons reliably pecked a green key in either situation, always earning food, they selected the free-choice situation about 70% of time. This shows that just having a choice is reinforcing, even if the rate of the reinforcement in both situations is exactly the same.  –Behavior Analysis and Behaviorism Q & A

Another good article about the Catania experiments and other work on choice is “On Choice, Preference, and Preference for Choice” by Toby Martin et al.

(In no way can this short post cover all the nuanced research about choice. For instance, abundance of choice has a downside, especially for humans. I am sticking to the issues of choice that are most applicable to the situations our companion animals find themselves in.)

“Forced Choice”

Note the definition of “forced choice” in the description of the experiments above. Nothing happened when the pigeon pecked the red key. The bird was not shocked or otherwise hurt. Forced choice was defined as a situation where only one behavior led to positive reinforcement (more correctly, an appetitive stimulus), and another behavior or behaviors led nowhere.

Having more than one behavioral path (in this case, multiple green keys to press) to get to the goodie was defined as “free choice.”
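To make the distinction concrete, here is a minimal sketch of the two key layouts described in the experiment. The names (FREE_CHOICE, FORCED_CHOICE, reinforced_paths, food_rate_if_always_green) are my own illustrative assumptions, not anything from the paper. It only shows that the two situations differ in the number of reinforced paths, not in how often a bird that always pecks green gets fed.

```python
# Hypothetical model of the two key layouts described above.
FREE_CHOICE = ["green", "green", "green", "red"]
FORCED_CHOICE = ["green", "red", "red", "red"]


def reinforced_paths(keys):
    """Count the keys that produce food when pecked (the green ones)."""
    return sum(1 for key in keys if key == "green")


def food_rate_if_always_green(keys):
    """Fraction of trials ending in food for a bird that always pecks a green key."""
    return 1.0 if reinforced_paths(keys) > 0 else 0.0


for name, keys in [("free choice", FREE_CHOICE), ("forced choice", FORCED_CHOICE)]:
    print(name, reinforced_paths(keys), food_rate_if_always_green(keys))
```

Both layouts pay off every time for a bird that pecks green. Only the number of ways to get there changes: three in one case, one in the other.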

Now, think back to what we do in the early stages of training. Review my list above of the ways we remove “distractions,” i.e., other reinforcers. That type of training situation more closely resembles forced choice than free choice. The freedom to leave–especially in an environment that lacks other interesting stimuli–is not enough to designate a process as being free choice, at least in the nomenclature of this experiment and subsequent definitions in learning theory. But it’s a good first step.

Types of Choice

Here are the types of “choice” setups I see most commonly in dog training.

  1. Choice between different behaviors that lead to positive reinforcement. See examples below.
  2. Choice between handler-mediated positively reinforced behaviors and nothing in particular. This is the typical “they can walk away” type of positive reinforcement training session.
  3. Choice between different positively reinforced behaviors with an aversive present.  This can happen in exposure protocols if the trigger is close enough that it is at an aversive level. The proximity limits the value of positive reinforcement, and, if the aversive gets too close, eliminates it, because of the sympathetic “fight or flight” response.
  4. Choice between enduring an aversive stimulus and performing a behavior that allows escaping it. Most shock collar training exemplifies this, as do operant exposure protocols that put contingencies on escaping the trigger.
  5. Choice between behaviors that are positively reinforced and behaviors that are positively punished. A training situation such as “walk in heel position, get a cookie; surge forward, get a collar pop.”
  6. Choice between behaviors that are positively punished and behaviors that get nothing in particular. This would be across-the-board suppression of behavior.

In all that I listed, even #6, the dog can be said to have a choice. But none of them, with the exception of #1, would likely be called “free choice” in learning theory nomenclature.

Clara stops to smell the roses at the shopping mall

Now, about #1. The things I would tentatively put in the “free choice” bucket are:

  • Desensitization/counterconditioning with the trigger at a non-aversive level. The leash or other barrier prevents or controls the choice of movement towards the trigger, but there are no contingencies on behavior within the area and multiple reinforcers may be available.
  • Shaping, which can offer multiple choices of behavior along the path to a goal behavior.
  • Reinforcing offered behaviors in day-to-day life with an animal. (I’ll write about this in my followup post.)
  • Training techniques that allow the dog to leave in pursuit of another interest. However, these too tend to have another behavior as their final goal.

Note that we are not talking about using a variety of reinforcers. That’s easy to do in training. We are talking about different behaviors leading to reinforcement. When you are focused on a training goal, that one is a lot harder to include!

A Word About Preference

Preference is not the same as choice, though they are related.

From a review article about choice:

Preference is the relative strength of discriminated operants. Researchers often measure preference as a pattern of choosing.  –Martin, Toby L., et al. “On choice, preference, and preference for choice.” The Behavior Analyst Today 7.2 (2006): 234.

Pattern is a key word. I may not like my green tee-shirt very much, but I will choose it if my red ones are in the wash. It is only by observing my tee-shirt choice over time, noting circumstances and performing a bit of statistical analysis, that my choices will indicate my preference (red tee-shirts).
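As a rough illustration of the tee-shirt example, here is a minimal sketch that tallies choices while noting the circumstances. The observation log is invented for illustration, and preference_pattern is a hypothetical helper, not a method from the review article. The one assumption that matters is that a day with only one shirt available tells us nothing about preference, so it is left out of the tally.

```python
from collections import Counter

# Hypothetical log: (shirts available that morning, shirt chosen).
observations = [
    ({"red", "green"}, "red"),
    ({"red", "green"}, "red"),
    ({"green"}, "green"),  # red ones in the wash: no real choice
    ({"red", "green"}, "red"),
    ({"red", "green"}, "green"),
    ({"green"}, "green"),
]


def preference_pattern(log):
    """Share of choices per item, counting only occasions with more than one option."""
    counts = Counter(chosen for available, chosen in log if len(available) > 1)
    total = sum(counts.values())
    return {item: n / total for item, n in counts.items()} if total else {}


print(preference_pattern(observations))
```

In this made-up log, a raw count shows green and red each worn three times. Only when the no-choice days are set aside does the pattern (red on three of four real choices) point to the preference.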

Observing our pets’ preferences, and giving them their preferred items, is a good and thoughtful thing, but doing so does not necessarily involve their making a choice.

In addition, I’ve written about how determining an animal’s preferences in a formal way can be more difficult than it sounds. But scientists are developing ways to determine choice in animals. The following article covers some of these:  Using Preference, Motivation, and Aversion Tests to Ask Scientific Questions about Animals’ Feelings.

Acknowledging Limitations on Choice

I think that when we talk about giving dogs choices, or describe protocols that supposedly do this, we should consider two things. First, what are the choices? Are there multiple possibilities for positive reinforcement, or are there choices between positive reinforcement and nothing, or only crappy choices?

Second, we should consider how we are limiting choices. Are the limitations temporary or permanent? Can we give our dogs ways to express their preferences and make choices in their lives with us? Even in training?

There is no barb intended for positive reinforcement-based trainers in this post. Giving the animal the right to walk away is revolutionary in the recent training climate. We are the ones taking that step. Sometimes it’s the most control we can give them. But I believe we can do more.

Part 2 of this post will include my attempts–successful or not so–at giving my dogs choices in different situations.

How about you out there? In what ways do you give your animals choices–in day-to-day life or in training?


Eileenanddogs on YouTube

But I’ve Seen Stressed-Out Dogs During Positive Reinforcement Training Too!


Thank you to Jennifer Titus of CARE for Reactive Dogs for editorial advice. All errors and awkward moments are mine alone.

Citing “stressed-out R+ dogs” in an argument is an old chestnut that comes around regularly. The writer usually describes a training session he or she witnessed where a dog being trained with positive reinforcement was exhibiting fear or stress. The goal of sharing this description generally seems to be to blur the real differences between training that is based on positive reinforcement (R+) and training that is based on escape, avoidance, and punishment. Sometimes it is a feeble attempt to argue with the ranking of methods in assessments such as the Humane Hierarchy.

Cherry-picking a moment out of any dog’s life to support a general point about methods is tempting but is not an effective argument.

Summer over the threshold of stimulus aversiveness
My dog Summer showing stress during an R+ training session. What can we therefore conclude about the learning process called positive reinforcement? 

The “Stressed-Out” R+ Dog

So let’s consider the stressed-out dog in positive reinforcement training. What are some possible causes of stress in an R+ training session?

When using positive reinforcement, some metrics we use to assess the skill of the trainer and the effectiveness of the training are timing, criteria, and rate (or sometimes magnitude) of reinforcement. Let’s start our analysis there.

Bad timing can cause the dog some stress through lack of clarity. The trainer is marking and rewarding some incorrect behaviors while sometimes failing to reinforce some correct ones. If she cleans up her act and stops reinforcing the wrong stuff, the dog will go through an extinction process. Depending on the trainer’s skill, this can be stressful.

Raising criteria too fast means a higher failure rate. This can also cause some frustration. So while this is in an R+ training environment, what you have when you raise criteria too fast and the dog doesn’t do anything reinforceable is, again, an extinction problem.

If the rate of reinforcement is too low, you can actually put the desired behavior on extinction. So you may get a confused dog who starts throwing behaviors out of frustration, or a dog who will wander off and do something else more reinforcing, given the choice to do so.

Another stressor can be the use of negative punishment when the dog hasn’t learned the behavior. If the dog isn’t clear on how it can earn the reinforcer, it is frustrating to have it taken away contingently as it tries other things.

Note that none of the above errors is likely to hurt, scare, or startle the dog.

Two more types of stressors possible in an R+ training session are pressure of some type, and an accidental, momentary aversive. These two can indeed hurt, scare, or startle the dog, but are not linked to the positive reinforcement learning process.

  • What I’m calling pressure could consist of anything in the environment, setup, or even mannerisms of the trainer that the dog would like to escape from. Is something too loud? Is someone pressuring the dog with his or her body? Is the dog being kept too close to something she is scared of? This type of problem comes from the unwitting inclusion of an aversive stimulus.
  • Likewise, accidents happen, as they can in any training. A trainer might step on her dog’s tail during a stay, but again, this is an aversive accident, not an integral part of R+ training.

So our causes of stress are probably either technical mistakes on the trainer’s part or the presence of an unplanned or unrecognized aversive stimulus.  Are these problems unique to positive reinforcement training? Absolutely not. They can happen in training based on aversives just as easily.

A Fair Comparison

Let’s compare apples with apples. Rather than focusing on the stressors in faulty positive reinforcement training, let’s compare the net effect on the dog of R+ training vs. aversive-based training–with both done poorly. There is certainly no shortage of sloppy training done with aversive methods. I can find such a video on YouTube within a couple of minutes, and the trainer is often touting it as a success story.

So what happens to a dog being trained with escape/avoidance and punishment when the problems and errors I described above are present? Not only is the dog startled, hurt, intimidated, or at least irritated by the training itself, she will also be subjected to the additional stress resulting from trainer errors. Or she may experience aversives in addition to the ones the trainer is purposely using.

Here’s what it could look like.

  • Bad timing: Imagine popping a dog’s collar when she is heeling perfectly, in addition to popping her when she makes an error.
  • Changing criteria too fast: Imagine using duration shock to teach a dog to jump off a platform immediately after using it to teach her to jump on it.
  • Unplanned aversive stimulus: Imagine teaching stays using your hands to force a sound-sensitive dog to hold her position while a delivery truck with no muffler drives by.

Those make the possible stressors in R+ training look rather like small potatoes, don’t they?

A Real-Life Example of the Results of R+ Training with Errors

I will be the guinea pig. I have a video of my own training that demonstrates many of the stressors I listed above.

In this popular video of mine that demonstrates lumping, I raise criteria too fast for Zani. She gets visibly frustrated. You can see it around 2:25 in particular. She plants herself in front of me in a sit and makes what I call the “terrier frustration noise”: a sharp exhale through her nose. I don’t blame her.

In addition to the training errors that are the subject of the video, there are more. I often mark late. I mark and reinforce improper behaviors, both when she targets my bare hand instead of the tape and when she does a “drive-by” and doesn’t connect at all.

My rate of reinforcement is not bad, but there are a couple of times when Zani is going through extinction, trying other behaviors, where I might have interrupted her sooner, or marked something approaching the right behavior.

My reinforcement placement is not thoughtful. I am generally tossing the treat in order to reset Zani, but think how much faster she could have gotten to the wall if I had treated in that direction instead of away from it.

Another criterion issue is my poor choice of tape color. Gray, even metallic, is not a good contrast on a tan/yellow wall. Zani probably couldn’t see it well.

Interestingly, there is a subtle aversive stimulus in the session as well, and I think we can see the effects of it on Zani’s actions.  The tape on the wall is in a tight area.  I think her reluctance to enter that small area (in other words, an aversive setup) is one of the reasons she targets the desk multiple times instead of going for the tape. She is extremely pressure sensitive and I am asking her to go by me into a tight little space. She tries to avoid it.

So in one video, we have many of the problems I listed above.

Link to the Lumping video for email subscribers.

But even with the errors in the training and the slightly aversive setup, Zani hung in there with me and was wagging her tail in the last section. She successfully learned the behavior I was teaching and got 24 tasty food treats in the three minutes of training time shown. Not a bad rate at all, considering that there were two dry spells and also that she was spending a fair amount of time chasing down treats.
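For anyone who wants to put a number on that rate, here is a quick back-of-the-envelope sketch. The function name reinforcement_rate is just an illustrative assumption, and it treats the treats as evenly spread over the three minutes, which they weren't, given the two dry spells.

```python
def reinforcement_rate(treats, minutes):
    """Return (treats per minute, average seconds between treats)."""
    per_minute = treats / minutes
    return per_minute, 60 / per_minute


per_minute, gap = reinforcement_rate(treats=24, minutes=3)
print(f"{per_minute:.1f} treats/min, one every {gap:.1f} s on average")
```

That works out to 8 treats per minute, or one roughly every 7.5 seconds on average.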

So here is a thought experiment. Imagine that instead of what you saw in the video, I used aversive methods to get the targeting behavior from Zani. You can imagine a combination of physical manipulation and body pressure, or a shock collar. No food in the picture. (If you are imagining Zani falling to pieces, that’s about right.) Now add to that multiple errors of timing and criteria, and an unwise setup that creates a tight space. How is Zani doing now?

That is a much fairer comparison of the results of different training methods.

The Proper Rejoinder

Evoking the scenario of the stressed-out R+ dog in argument invites the following response:

It’s a good thing the dog was being trained with positive reinforcement then. Adding training errors and aversive situations to any protocol can cause stress. Think how much worse it would have been if the dog were being deliberately trained with aversives to start off with!

The real illogic of the comment in the title is that in most examples described it’s the addition of aversive stimuli that creates stress. Blaming stress that results from the accidental inclusion of aversive stimuli on the process of positive reinforcement training is not only illogical; it’s a cheap shot.

Conclusions from Examples

Drawing conclusions from examples is tricky, and can easily lead to the logical fallacy of “missing the point.”

A couple of the valid conclusions that can be drawn from the “stressed-out R+ dog” scenario are that some positive reinforcement trainers lack mechanical or observational skills, and that it is possible for other learning processes besides positive reinforcement to be going on when we are trying to train with R+.

What the scenario doesn’t support is the idea that there is some unknown dark side intrinsic to positive reinforcement training, or that there are characteristics of training methods that are immune to analysis through learning theory, or that stressors from lack of skill happen only in R+ training, or that training based on the use of aversive stimuli can make for a happier dog.


© Eileen Anderson 2015, eileenanddogs.com
