Month: February 2023

“The Negative Effects of Positive Reinforcement” by Michael Perone: Another Misrepresented Article

Three orange and red bags of Cheetos snacks are standing up in a row

Note: I have been working on this paper for 18 months. Today when I published it, I was unaware that Dr. Perone was the head of a recent task force that concluded that contingent electric skin shock of a population that could include people with developmental disabilities, emotional disorders, and autistic-like behaviors could be part of an “ethically sound treatment program.” It casts his paper in a different light. I’m leaving my writeup published for now because I think we need these answers to what is an often-quoted paper. Please don’t consider it in support of Dr. Perone in any way.

“The Negative Effects of Positive Reinforcement” by Dr. Michael Perone is a scholarly article some trainers like to use to muddy the waters about positive reinforcement training. They throw out Dr. Perone’s article title like a bogeyman and use it to defend aversive methods in dog training. That usually indicates they haven’t read it. It’s a thoughtful article and has some interesting things to consider, but it doesn’t say what they seem to think it does. Not even close.

I’m going to list here and summarize the effects of positive reinforcement mentioned in the article. I’ll summarize why they have almost nothing to do with well-executed dog training. They give us something to think about in our human lives. But they apply almost exclusively to humans and our lifestyles, and the ones that can apply to animals are easily avoided.

Positive Reinforcement Can Have Delayed Aversive Consequences

Perone attributes the first mention of these aversive consequences to Skinner and quotes him several times (1971, 1983).

Here’s what they are talking about. Let’s say I spend my whole weekend water-skiing. I may come home with a sunburn (but the sun felt so good!), sore or strained muscles (but every run was great!), and maybe even a hangover (gosh that socializing was the best!). Don’t drink and boat, folks, this is just an example. I may be so wrung out after my fun weekend that I won’t have enough energy to finish the report I was supposed to have completed by Monday. All the things I did were fun and reinforcing at the time and I kept doing them, to the detriment of my body.

These potential longer-term aversive effects are one category of “negative effects” Perone is talking about.

How much do they apply to positive reinforcement-based animal training? Hardly at all! We don’t choose training methods and activities with delayed aversive consequences. As animal guardians, we aim to protect our animals from such consequences in both training and the rest of their lives. For example, we don’t let dogs overdo playing in the water hose—we don’t want to risk obsession or water intoxication. We don’t let a dog with an injury play endless games of fetch, even if they beg us. We interrupt dogs playing with each other when they begin to ramp up into over-arousal. The equivalent of my water-skiing weekend shouldn’t happen.

Perone quotes Skinner about activities that are so reinforcing they exhaust him. Skinner wrote, “Fatigue is a ridiculous hangover from too much reinforcement” (1983). He was concerned that the attraction of highly reinforcing activities would prevent him from more important activities with less immediate reinforcement. This is a crucial concern for any human with control over their activity choices, and one many of us wrestle with for most of our lives. Should I do the immediate fun thing or the less fun thing that has good results over time?

But this is unlikely to be a concern for positive reinforcement-based animal trainers. On the contrary, well-executed positive reinforcement training is a highly reinforcing activity for both the human and animal. It also has delayed positive consequences for both parties.

Do I even need to point out that aversive methods often have long-term aversive consequences, even deadly consequences? There is just no comparison.

Positive Reinforcement Can Make People Vulnerable to Exploitation by Government and Business

This is true. Exploiters can use positive reinforcement (praise, social acceptance, money, tangible items) to draw people into dangerous or unfair situations from which they can’t escape. This happens on the large scale but also on the small, interpersonal scale. This danger, again, has very little application to training animals or to our lives with animals. We already have a ton of control over their lives, even those of us who do our best to give our animals freedom. We work hard to make even the onerous experiences of life fun for our animals: things such as some husbandry activities, taking meds, and physical therapy. And we use positive reinforcement to give the animal more choices, more opportunities, a wider world. Plus remember: it’s fun.

Some Reinforcing Activities Naturally Have Delayed Aversive Consequences

This is a reiteration of the first point, but Perone includes a list of “more mundane” activities for short-term pleasure here.

Positive reinforcement is implicated in eating junk food instead of a balanced meal, watching television instead of exercising, buying instead of saving, playing instead of working, or working instead of spending time with one’s family. Positive reinforcement underlies our propensity toward heart disease, cancer, and other diseases that are related more to maladaptive lifestyles than to purely physiological or anatomical weaknesses.

Perone, 2003, referencing Skinner, 1971

Of course!

Here is my own example: Let’s say I eat a whole bag of Cheetos because they are engineered to taste good and cause me to want more and more. The behaviors of reaching into the bag or the bowl and putting a piece in my mouth and all other behaviors that get those Cheetos ingested are immediately and powerfully reinforced. Delayed aversive consequences can include stomachache, bloating, poor nutrition, and that “ick” feeling. Oh yeah, and getting the orange stuff all over my fingers. (See big important note at the bottom of the post. I am not food- or body-shaming here.)

Again, this doesn’t apply to animal training or living with our pets. For instance, with both horses and dogs, we educate ourselves about bloat and do our best to prevent the circumstances that can cause it. And I’m pretty sure I don’t have a single positive reinforcement dog training friend who would let their dog eat a whole bag of Cheetos.

But once during an agility trial, I gave Zani too many rich treats over the course of the day. On our last run, she had diarrhea in the ring. Was my conclusion, “Welp, better stop using positive reinforcement”? Of course not. My conclusion was, “You asshole, you made your dog sick with that Braunschweiger. It could have even been worse; dogs can suffer or even die of pancreatitis from too much fatty food. Don’t do that again.”


Aspects of Positive Reinforcement Schedules Can Be Aversive

Top-down view of a pigeon pecking a yellow button in a Skinner box

Perone describes two studies identifying aspects of positive reinforcement schedules that can be aversive. Yes, in a controlled laboratory environment, we can test to see whether an animal will work to avoid a certain positive reinforcement schedule.

In the first study, the researchers studied the effects on pigeons of a change from a rich reinforcement schedule (Variable Interval 30 seconds) to a leaner one (VI 120 seconds). With some clever indicators to the pigeons of which schedule was in effect, they showed the leaner schedule was an aversive condition compared to the richer schedule and that indicators of the leaner schedule could act as conditioned punishers (Jwaideh & Mulvaney, 1976).
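If the schedule names are unfamiliar, here is a minimal sketch of what “rich” and “lean” mean in practice. It is not the procedure from the study. The bird that pecks steadily once per second, the exponentially distributed waits, and the function name `simulate_vi` are all assumptions made for illustration. On a variable-interval schedule, a reinforcer becomes available after an unpredictable wait that averages the stated number of seconds, and the first peck after that collects it.

```python
import random

def simulate_vi(mean_interval_s, session_s=3600, peck_every_s=1.0, seed=0):
    """Count reinforcers earned in one session on a variable-interval schedule.

    A reinforcer is "set up" after a random wait (exponential, with the given
    mean). The first peck after that moment collects it and restarts the timer.
    """
    rng = random.Random(seed)
    reinforcers = 0
    next_setup = rng.expovariate(1.0 / mean_interval_s)
    t = 0.0
    while t < session_s:
        t += peck_every_s          # hypothetical steady pecking
        if t >= next_setup:        # a reinforcer was waiting for this peck
            reinforcers += 1
            next_setup = t + rng.expovariate(1.0 / mean_interval_s)
    return reinforcers

rich = simulate_vi(30)     # VI 30 seconds
lean = simulate_vi(120)    # VI 120 seconds
print(f"VI 30 s:  {rich} reinforcers in an hour")
print(f"VI 120 s: {lean} reinforcers in an hour")
```

The same amount of pecking earns far fewer reinforcers on the VI 120 schedule, which is the sense in which it is “leaner.”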

In the second study, pigeons were taught to recognize predictors of changes in reinforcement schedules and reinforcer magnitude. They were given the option to “escape,” to peck a key that would stop the trial until they pecked it again. When the trial was stopped, the indicator lights changed, the “house-light” color and intensity changed, and no pecks on any keys were reinforced. It turned out that within a schedule, the pigeons were most likely to take a time-out just after being reinforced. During schedule transitions, the pigeons were most likely to take a time-out when the indicators told them they were switching from high magnitude reinforcers to lower magnitude reinforcers (Everly et al., 2014). These situations meet the criteria for aversiveness because the birds were opting to escape, to “quit the game” for a time.

These are valuable lessons. It’s important to note that these were “free operant” experiments, rather than the discrete trials we generally use in training. This post discusses the difference. In life, we should have very few situations in which we make large step-downs in reinforcer magnitude or frequency for the same behavior. But it can happen by accident or out of ignorance. If there is likely to be a step-down of this sort, we need to take action about it.

Sable dog trotting toward camera with her mouth open and tail up (looking happy)
Summer in a competitive rally run

The example that comes to mind is competitive obedience. I used to compete in rally obedience with my dog Summer. While learning and practicing, I generally reinforced (and reinforced well, with meat or cheese) every behavior. Then I carefully stepped down to every second or third behavior. This was OK with her, and she maintained her enthusiasm. But what would have happened if, at that point, I had suddenly taken her into an obedience ring and performed a minute-and-a-half-long run of 25 behaviors with no reinforcement until the end? Well, maybe nothing bad performance-wise the first time. Her behaviors were strong and resistant to extinction. But it wouldn’t have been kind, and over time (it doesn’t take much time at all!) she would have learned the trial environment predicted no goodies while in the ring. This happened to a lot of dogs before skilled positive reinforcement trainers entered the obedience world.

Thanks to modern dog training methods, we now know lots of ways to make the ring experience happier for the dog and not have that huge step-down in fun. These include using conditioned reinforcers and putting some thought into our reinforcement schedules. Luckily, I had good teachers. In practice, I gradually weaned Summer off intermittent treats during the run while teaching her she would get a mega-treat (a whole jar of chicken baby food) at the end of the run. We even practiced a fun “hurry from the ring to our crating area to get the treat” sequence as part of the routine when preparing. Believe me, this switch did not diminish her interest and happiness with rally at all! And I was able to do the same during trials, so trials didn’t predict a leaner schedule to her.

Conclusion

Please note what I have not said here. I have not said that training with positive reinforcement has no possible negative consequences. It can. When we humans hold access to all the good stuff, it takes a mindful approach to avoid coercion. But if we are positive reinforcement-based trainers, avoiding coercion is already a top goal. Schedule effects such as Perone describes are a very good thing for us to learn about to provide the best, happiest experience for our animals. Punitive schedule changes can be avoided.

In the meantime, keep in mind that the negative side effects of positive reinforcement training listed in this article by Perone are minimal in animal training. These effects are not at all comparable to the potential fallout from force-based training, which can ruin the lives of dogs and destroy relationships.

The title of the article causes some trainers who use highly aversive methods to hope it can work as a “gotcha” to support their stance. “Look, positive reinforcement is just as bad!” Except it doesn’t show that at all, and they would know if they had read it. Or they do know, and expect you not to read it. Next time you see it referenced, feel free to link to this post.

Training with positive reinforcement, even moderately well, is unlikely to have delayed aversive effects. It’s more likely to have both current and delayed beneficial effects.

A Note about Cheetos

I eat Cheetos and other snack foods. I’m aware they are engineered to be extremely tasty but not satisfying, so we eat more. I eat them anyway. I don’t food shame anybody. I don’t idealize thin body types. I hope everyone reading has the resources to treat themselves to plenty of their preferred pleasures in life, both short-term and long-term.

Further Reading

I find this article by Balsam and Bondy, The Negative Side Effects of Reward, a far better discussion of challenges we might encounter when doing positive reinforcement training. Before you get worried: this article is not at all damning of positive reinforcement-based animal training either. It gives some very practical information about challenges we already recognize. For instance, if you use a powerful food reinforcer, you may get more “food approaching” behavior than the behavior you are trying to capture and reinforce. (“My dog is distracted by the food!”) This is a fairly minor training challenge. The other points in the article are similar. Again, the “negative side effects” are not at all comparable to the fallout associated with force-based training.

Also, for advanced reading and more information about how to make positive reinforcement training the best it can possibly be, take a look at Nonlinear Contingency Analysis by Layng, Andronis, Codd, and Abdel-Jalil (2021).

Thank you to my well-qualified friend who looked over my post. All mistakes, of course, are my own.

References

Balsam, P. D., & Bondy, A. S. (1983). The negative side effects of reward. Journal of Applied Behavior Analysis, 16(3), 283-296.

Everly, J. B., Holtyn, A. F., & Perone, M. (2014). Behavioral functions of stimuli signaling transitions across rich and lean schedules of reinforcement. Journal of the Experimental Analysis of Behavior, 101(2), 201-214.

Jwaideh, A. R., & Mulvaney, D. E. (1976). Punishment of observing by a stimulus associated with the lower of two reinforcement frequencies. Learning and Motivation, 7, 211-222.

Layng, T. J., Andronis, P. T., Codd, R. T., & Abdel-Jalil, A. (2021). Nonlinear contingency analysis: Going beyond cognition and behavior in clinical practice. Routledge.

Perone, M. (2003). Negative effects of positive reinforcement. The Behavior Analyst, 26, 1-14.

Skinner, B. F. (1971). Beyond freedom and dignity. New York: Knopf.

Skinner, B. F. (1983). A matter of consequences. New York: Knopf.

Positive Punishment—With the Touch of a Cotton Ball

a white ball of cotton on a black background

I accidentally punished my dog’s behavior with a wad of cotton.

Lewis and I are participating in Dr. Mindy Waite’s husbandry study on ear cleaning. The goal of the study is to be able to wipe the dog’s ear with some cleaning solution on a cotton ball while he happily cooperates.

The protocol starts with training a chin rest. We completed the steps for the chin rest in my lap on a little towel (see below), then I proceeded to the steps of lifting his ear, bringing my other hand close to his ear while it was lifted, moving a dry cotton ball toward his ear, then touching the cotton to the inside of the ear flap.

For each of these steps, he needed to maintain his position while I performed the task and not chain in other undesirable behaviors before putting his chin down, such as pawing at me or vocalizing. (Yes, both happened.)

When I performed the step where I touched the cotton ball to the inside of his ear, he started to dodge away after the touch. I was able to reinforce a few, but most didn’t meet criteria. I did two brief sessions of this as the rate of reinforcement dropped, then I stopped and regrouped with Dr. Waite.

We agreed I would back up a step, to a step that didn’t involve touching his ear. In this step, I brought the cotton ball to within one inch of his ear without touching it.

But the previous sessions had created a behavior change through an aversive event. Here is the fallout.

I sat down in the chair I always use, readied the cotton, and put the small towel that functions as a chin target in my lap.

• Lewis approached the towel and paused
• He put his chin down
• When I reached to lift his ear, he flinched away
• He lowered his chin again, I lifted his ear, and brought the cotton ball to about 1 inch from his ear
• I clicked and treated as he pulled quickly away (oops, not good timing!)
• He started to lower his chin, then dodged away as I reached for his ear with one hand and the cotton ball with the other
• He put his chin down once more, then quickly lifted it away 
• Then, while still standing in front of me, he turned his head 90 degrees to his right
• Then he turned his head 90 degrees to his left
• He looked at me
• He turned to his right again
• He walked away
• He walked around, mostly off camera, for 24 seconds
• He put his paws up on a table, I cued him to get down, and he did
• He sniffed around for 15 more seconds, then headed back to me
• He put his chin on the towel but immediately lifted his head
• He looked hard to the right and walked away again
• He returned, put his chin down, immediately picked it up, licked his lips, and looked away
• He sniffed and licked his anus
• He sniffed and licked his anus from the other side
• He walked a few steps away and did another anal check
• He came over to me and nuzzled my hand holding the cotton (thanks for doing that right after licking your butt, Lewis)
• He put his chin down on the towel, I lifted his ear, brought the cotton ball close, and clicked/treated
• I ended the session (something I should have done much earlier)

Negative Reinforcement

The list above is a roll call of escape and avoidance behaviors. These behaviors got negatively reinforced because they released him from my approach to his ear. Also, this avoidance was highly unusual for Lewis. I’ve never seen him leave a training session before. The touch of the cotton the previous day was so aversive he couldn’t even tolerate the proximity of it during this session.

But there was a more subtle, even more important result.

Positive Punishment

A mostly white dog stands still with his head in the lap of a woman who is sitting on a low table. The woman is lifting the dog's right ear.

Lewis’ chin rest was positively punished by the soft touch of the cotton ball to the inside of his ear. How do I know that? Here’s my analysis.

This preceded the big-time avoidance session I describe above.

Antecedent: I put the towel in my lap, which is the cue for him to put his chin on it
Behavior: Lewis put his chin on the towel
Consequence: I lifted his ear and touched the inside with a dry cotton ball

I would normally put a prediction here. I’m not because I didn’t make one at the time. But I sure found out.

Lewis’ rate of offering the chin rest after that experience dramatically decreased. Before this step, Lewis had been fluently performing repeated chin rests for days in our sessions. And during this “aftermath” session I got only two, with high latency and the fidgety, avoidant behaviors I described.

Some of you will be wondering about another possibility. Good thinking! Positive punishment is not the only mechanism by which a behavior might decrease. There are two other well-known possibilities (and some less common ones). The well-known ones are negative punishment and extinction.

I can’t think of a way that negative punishment could be the process involved here. I was not contingently removing an appetitive stimulus. (Check out this article if you need a brush-up on the processes of operant learning.) But we need to consider extinction, because the rate of reinforcement did drastically drop.

Remember above I mentioned that part of the criteria for this husbandry task was for Lewis not to chain in undesirable behaviors like pawing me or vocalizing? Those started off as extinction-related behaviors. They happened when he was frustrated that I had raised criteria too fast and he wasn’t accessing reinforcement at the former rate.

I know what extinction-related behavior looks like from Lewis, and this wasn’t it.

That Innocent Little Cotton Ball

a closeup of a white cotton ball on a black background

I didn’t start out to be misleading about the cotton, but I ended up burying the lede about it a little. Here is the lede: Touching a cotton ball directly on the opening of the ear canal is LOUD. Cotton balls look soft and innocuous, and touching them to lots of places on the body could be unremarkable. But the sound of the cotton touching his ear was startling and unpleasant to Lewis.

Dr. Waite kept that in mind as she gave us some new steps to perform. They separated out the sound from the touch. We went pretty far back in the process, because this strong avoidant response was so unusual for Lewis. I’m pleased to report that we have worked back up to touching his ear with the cotton ball and he has decided that’s OK!

Watch the Dog and Film Your Reps

There’s a lesson here. You don’t need a shock or prong or choke collar for positive punishment. I think we concentrate way too much on tools sometimes. I don’t use any such collars, I don’t jerk on the leash, nor do I use front-attach harnesses or head halters, but frankly, those make up a pretty low bar. It’s a baseline. It’s not challenging or difficult not to use tools that hurt. What I find much more challenging is noticing other stimuli that are unexpectedly aversive, such as something very soft applied to the wrong place the wrong way.

Lewis was not hurt, harmed, or traumatized by the cotton ball, as he would have been with a harsher method. But he was unhappy with the situation, which is the opposite of what I want. If I hadn’t been filming, I might never have known just how unhappy he was. And if I hadn’t been tracking reps, I might not have noticed that the chin rest got punished. I’m glad I did know, even if a little late, because I could have made the situation much worse had I persevered.
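For anyone who wants to track reps in a spreadsheet or a few lines of code, here is one possible sketch, offered as an illustration rather than the method used in the study. The `Session` fields, the `flag_drop` helper, and the 50 percent threshold are hypothetical choices. The counts are invented, apart from the two chin rests in the “aftermath” session.

```python
from dataclasses import dataclass

@dataclass
class Session:
    label: str
    minutes: float
    chin_rests: int     # times the behavior was offered
    reinforced: int     # reps that met criteria and earned a treat

def per_minute(count, minutes):
    """Simple rate: events per minute of session time."""
    return count / minutes

def flag_drop(before, after, threshold=0.5):
    """True if the behavior's rate fell below `threshold` times the earlier rate."""
    earlier = per_minute(before.chin_rests, before.minutes)
    later = per_minute(after.chin_rests, after.minutes)
    return later < threshold * earlier

# Hypothetical numbers for illustration only, not real session data.
baseline = Session("before ear touch", minutes=3.0, chin_rests=15, reinforced=15)
aftermath = Session("after ear touch", minutes=3.0, chin_rests=2, reinforced=1)

for s in (baseline, aftermath):
    print(f"{s.label}: {per_minute(s.chin_rests, s.minutes):.1f} chin rests/min, "
          f"{per_minute(s.reinforced, s.minutes):.1f} treats/min")

dropped = flag_drop(baseline, aftermath)
print("Behavior rate dropped sharply:", dropped)
```

A flag like this only says that the behavior decreased. Deciding whether punishment or extinction is responsible still takes the kind of analysis described above, plus watching the video.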

A woman trims the nails of a white and brown dog while he licks peanut butter from a flexible toy that is suctioned to the wall
This photo is from January 2022, and I still trim his nails this way

A wise friend once described the type of training positive reinforcement trainers aim for as “training that is fun for the learner.” It may sound simplistic, but that description is an elegant and direct description of my goals and beliefs as a trainer. We can say Lewis’ chin rest was positively punished. We can also say Lewis obviously wasn’t having fun. And let me be clear: I want Lewis to be having fun because of the training, not despite it.

Lewis is a challenging dog for husbandry tasks. I am still, after more than a year, clipping his nails while he licks peanut butter. When I dry him with a towel, I must parcel out kibble with one hand while I do it, because otherwise, he descends into overwrought wiggling and mouthing. I plan to train these husbandry tasks instead of managing them with distractions, but I haven’t gotten there.

Lewis’ general discomfort with husbandry was a large part of what prompted me to join Dr. Waite’s study. It is incredibly helpful to me to have a structured approach and her expert counsel. I hope we don’t blow the bell curve!

Copyright 2023 Eileen Anderson

Copyright 2021 Eileen Anderson. All Rights Reserved. By accessing this site you agree to the Terms of Service.
Terms of Service: You may view and link to this content. You may share it by posting the URL. Scraping and/or copying and pasting content from this site on other sites or publications without written permission is forbidden.