Have you ever had an epiphany? Wherein all of a sudden some information you had been turning over and over in your mind fell into place and created an entirely new picture? It has happened to me a handful of times in my life, and in each case the result was that I changed some basic beliefs.
Trainers who have switched to positive-reinforcement based training from more aversive-inclusive methods often refer to that process as “crossing over.” I have written about crossing over in many bits and pieces over the years, plus in a couple of longer articles (I’ve linked some at the bottom of this post).
Lots of us in the dog community read journal articles and scholarly books to learn more about the science behind behavior, even if our academic credentials lie elsewhere. And sooner or later we want to share what we’ve learned, out of the goodness of our hearts (grin), or more likely to try to win an argument, er, persuade someone of our position.
Some say you shouldn’t even cite research if you don’t have credentials in that field. I think that’s true to some extent, but I also think it is beneficial to read and try to assess research even if you don’t have those credentials. Delving into scholarly journals isn’t always easy, but it’s one of the best ways to expand your knowledge and learn about the dialectic nature of science. But you have to keep front and center in your mind that if you are reading about a discipline that you don’t have academic expertise in, you are at a huge disadvantage compared to the people who have a longstanding background in that area.
One of the first rules of citing research is that you must understand the context, both for your own benefit and to save your ass from embarrassment. And if you don’t know much of the context, you’d be well advised to start studying.
Let’s say you run across a quote that refers to some research. It supports a position that might be a little controversial or a minority view, but you are excited since you hold that view yourself. You are delighted and ready to quote it, both to impress your friends and show the other camp a thing or two. What should you do?
As someone whose credentials are in fields other than psychology or animal behavior, here are some guidelines I have developed.
What to Do Before You Quote the Article
Cherry picking is a rhetorical fallacy
Find the original source. If you read about the study in Newsweek or The New Yorker, get the author’s name and track down the original research article. An editorial mention is not peer-reviewed research. You may have to pay for the original piece or order it through a library if you don’t have university access. Another option is to send an email to the author. You’d be surprised how many times they’ll just send it to you. Be sure and thank them politely!
Read the article. The first time, don’t worry too much about all the stuff you don’t understand. Try to forge ahead and get a sense of the whole thing.
Read the article again.
Study the charts and graphics. What are they measuring? What’s on the x-axis and what’s on the y-axis of the charts? What statistical methods did they use?
Now look up the terms you don’t understand. Give yourself a crash course if you need to.
If there are still big sections that you don’t get, consult an expert in the field if you can.
Read the article again. Are you beginning to understand it?
If not, and if you have no way of doing so, stop right there. Don’t bother to quote it. If you think you understand it moderately well, proceed.
Find the quote that got you started in the first place.
Study the part in the article just before it. How was the experiment or problem set up?
Study the part just after it. Did they qualify the statement at all? If so, you are ethically bound to include that part if you plan to quote the study. “The new XYZ method works 95% of the time (YAY!), but only with orphaned voles raised with chipmunks and no other rodents (oh).”
Study the results section and the discussion section. These sections are where the authors summarize their results and make the case for their findings. But they are also bound to announce the limitations, and we should be just as attentive to those.
Think hard about applicability. If it is about behavior, are there big behavioral differences between the subject species and the one you want to apply it to? Is one a prey animal and another a predator? Have the researchers done something spectacular in the controlled condition of the lab that can’t possibly be replicated in real life? Or conversely, have they found a problem that rarely shows up in the real world because of the ways that good trainers know how to help animals generalize and practice behaviors? Tread carefully. Think it through. You’ll look silly if you announce a problem that real world experts have been aware of for ages and already know how to avoid.
Find out how many times the article has been cited. Google Scholar will give you a rough idea. Unless the article is brand new, few citations generally means the work made very few ripples in the scientific world (usually a bad sign). If it has lots, keep that in mind when you get to the step below about finding the work’s major opponents.
Start reading the citations. Did they show further research that replicated the results? Or did they yield different results and argue against the first conclusion? Sometimes you can tell from just the abstracts, but sometimes you’ll need to get the full text of those articles too. You may run across a review article of the whole topic. Read it!
Take note of the date of the article. If it was from 1975 and the thread of research continues through 1980, 1983, 1988, and 1992, you’d better read to the end. You’ll either bolster your case or save yourself some embarrassment.
Find a ranking for the journal that published the article. Here’s a journal-ranking site. Collection Development Librarians can also help you assess the comparative merit and ranking of journals and academic publishers. This is another area where you may save yourself some embarrassment. If the ranking is abysmal and the only other publications citing the article are from the same journal–you have a problem. And be careful about the open access “pay and publish” journals; they require even more careful assessment. Some are responsible. Others not so much.
Search through the citations and find the major opponents of the work if there are any. Get the cheerleader out of your head and address the article critically. What do the opponents of the work say? What are the opposing hypotheses and results? Do they make sense? How many citations do they have? (Being heavily cited only shows that people paid attention to the article. A good start. But it might be because a bunch of future studies demolished the findings.)
Take a deep breath. Does your quote have merit? Is it a fair claim, given what else you have learned? Is it from a good source? Has it stood the test of time? Does it apply to your own topic? If so, go for it. Write your post, make your claim, but qualify it appropriately. Cite your source and be careful about Fair Use guidelines: give complete credit so that anybody could go find the very article and quote you are citing, but don’t quote huge chunks.
What does Chance say?
What To Expect Afterwards
Your friends will be proud of you. People who disagree may be irritated or outraged. But here is what to be ready for. There are virtually always people with better knowledge and credentials than you in a given field. If you are already in the hierarchy of academia, you are keenly aware of this.
So, those people may have something to say about what you wrote. Here are the main possible reactions:
They address you with criticism of your piece from the benefit of their broader knowledge. They may ask if you considered Joe Schmoe’s experiment from 2004. They may advise you that you made a beginner’s error and you forgot to account for the “Verporeg Effect.” They may tell you that you really need to start over because of the discrepancy between the metrics being used in the different studies. Make no mistake: This is a GREAT response to get from experts. Even if you personally feel ripped to shreds and devastated, get ahold of yourself. They took you seriously enough to make suggestions. They took time out of their day. Thank them (publicly if their critique was public) and go do as they suggested.
They argue in opposition to your piece. Now you have lots more work to do. They have an advantage. They know the field. They are probably right. But you can make lemonade. Go study their points. You wanted to learn about this, right? Now you have a chance to learn some more. This is still hard on the ego, but again, you got taken at least somewhat seriously, and you have an opportunity to learn. And if/when you find that they are probably right, be gracious.
But the worst: they ignore it. They took a look and decided that gracing it with a response would be a complete waste of their time. So you can either puff up your ego and decide that no one recognizes your genius, or go back on your own and study some more. Maybe you are that lone polymath who has connected the dots between some interdisciplinary stuff and people will recognize your genius later. More likely, you were just out of your depth. The people who make radical, startling discoveries are usually immersed in the field in which they make the discovery, or a closely related one.
But hey. You did your best. You probably learned a lot. Whatever the response to your claim, you must forever be ready to delve more deeply if someone comes up with a well-supported opposing point of view. Be a good sport. That’s how science works.
And by the way: I write from experience. I’ve made a variety of mistakes in citing resources and making claims. I thank the people who kindly helped me improve my understanding and make corrections.
More Training Errors: Cautionary Tales (I seem to have an abundance of these)
Photo credits: Clara with mud on face and Summer “reading,” Eileen Anderson. Cherries, Wikimedia Commons. The circle and slash added by Eileen Anderson.
If your dog is really hungry, what learning process will be involved here?
Yes we can. This question was pretty well answered in the 1950s.
Note: You need a basic understanding of the processes (often called quadrants) of operant learning for the discussion in this post to be meaningful. You can read my post, Operant Learning Illustrated by Examples, to get the basics.
Lately I’ve been reading a lot of remarks that seek to minimize the difference between positive and negative reinforcement. Some people claim that you can’t determine which process is at work in any reinforcement scenario (continuum fallacy, anyone?), and it’s even been argued that the terms “positive” and “negative” should be abolished. More on that in a later post, but if you pick up any new edition of a learning theory textbook, guess what? That point of view may be mentioned but the terms are still in use.
I got curious about the food argument so I gathered up some articles I had and did a little poking around in the psych journals.
To review:
Positive reinforcement: Something is added after a behavior, which results in the behavior happening more often.
Negative reinforcement: Something is removed after a behavior, which results in the behavior happening more often.
Both processes increase behaviors, but most trainers feel there is a big difference between the two methods. Negative reinforcement involves an aversive (the thing that gets removed), and most trainers in the positive reinforcement oriented community want to avoid that. But there seems to be a growing minority that either truly doesn’t perceive a difference or is focused on some gray areas and generalizing from those.
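For readers who think in code, the definitions above can be sketched as a tiny two-by-two lookup. This is only an illustration: the names `StimulusChange`, `BehaviorChange`, and `classify_process` are made up for this sketch, and the two punishment processes are filled in from the standard textbook definitions rather than from the review above. It also assumes you already know what was added or removed and whether the behavior went up or down, which is exactly the part that gets hard in the gray areas.

```python
from enum import Enum


class StimulusChange(Enum):
    """What happened to the stimulus right after the behavior (hypothetical names)."""
    ADDED = "added"
    REMOVED = "removed"


class BehaviorChange(Enum):
    """What happened to the behavior over time (hypothetical names)."""
    INCREASES = "increases"
    DECREASES = "decreases"


def classify_process(stimulus: StimulusChange, behavior: BehaviorChange) -> str:
    """Name the operant learning process for one stimulus/behavior combination."""
    # "Positive" and "negative" describe adding or removing, not good or bad.
    sign = "positive" if stimulus is StimulusChange.ADDED else "negative"
    # Reinforcement means the behavior happens more often; punishment, less often.
    kind = "reinforcement" if behavior is BehaviorChange.INCREASES else "punishment"
    return f"{sign} {kind}"


if __name__ == "__main__":
    # Print all four combinations of the two-by-two.
    for stimulus in StimulusChange:
        for behavior in BehaviorChange:
            print(f"{stimulus.value:8} + behavior {behavior.value:10}"
                  f" -> {classify_process(stimulus, behavior)}")
```

Notice that nothing in the lookup says whether the dog liked what happened. That judgment still has to come from what the thing added or removed actually is, and from the dog.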
Indeterminate Situations
There are indeed some situations where it is very difficult or impossible to determine the major process. The classic example is that of thermostat adjustment. If you are a little bit cold and go turn your thermostat up 2 degrees and the furnace comes on, are you adding heat or reducing cold? Most people would say the latter (negative reinforcement), but that heat can sure feel pleasurable and luxurious in its own right when it comes on. The situations that don’t fit well into one process or another generally have in common that they deal with a continuum of states and not discrete things that are added or removed.
So yeah, there are some scenarios where the primary process can’t easily be determined.
But food intake is not one of them.
Here’s how the argument is usually presented:
People who train with food are employing a negative reinforcement protocol because food removes the aversive state of hunger.
I have seen this claimed by force trainers but also a few people in the positive reinforcement (oops! maybe they wouldn’t call it that) community. It is often played like some sort of trump card, but it is as out of date as dominance theory.
What I haven’t seen anybody point out is that multiple experiments performed in the 1950s successfully separated out the positive reinforcement qualities of food that were independent of assuaging hunger. So here we go.
The Studies
In 1950 Sheffield and Roby published “Reward value of a non-nutritive sweet taste.” The abstract from APA PsycNET:
After showing that hungry rats will ingest significantly more of a 1.3 gram/liter saccharine solution than satiated control animals, the hungry rats were trained on a single unit T-maze with a saccharine solution reward. Learning was rapid and showed a high positive relation between correct choice and speed on the one hand and rate of ingestion of the saccharine reward on the other. After discussing the implications of these results for various learning theories, the authors conclude by suggesting “that elicitation of the consummatory response appears to be a more critical primary reinforcing factor in instrumental learning than the drive reduction subsequently achieved.”
“Drive reduction” corresponds to negative reinforcement effects, the satiation of hunger in this case. They removed these effects by using a substance that had no nutritional value. The study showed that while the rats gained no calories, eating a tasty but nutritionally empty substance could be an effective reinforcer.
In 1952 Miller and Kessen published “Reward effects of food via stomach fistula compared with those of food via mouth.” Here is the abstract:
Rats prepared with stomach fistulas were trained in a simple T-maze under hunger motivation and with rewards of milk for correct choices and isotonic saline for incorrect choices. Different groups received milk in a dish or milk injected directly into the stomach. While both groups reduced errors and time significantly, the “milk-by-mouth” group learned more rapidly. “These results show that milk injected directly into the stomach serves as a reward to produce learning, but that milk taken normally by mouth serves as a stronger reward to produce faster learning.”
This says that even for an animal that is food deprived (“under hunger motivation”) there is a reinforcement effect beyond satiation of hunger. What’s left? Positive reinforcement. It’s not controversial to say that eating food is pleasurable. What’s interesting is that this experiment shows that it is pleasurable aside from satiation.
If eating were exclusively about satisfying hunger, most of us would skip dessert–photo credit Wikimedia Commons
Makes Sense to Me
Actually, we know this intuitively. We have taste buds. Preferred foods taste good to us. It’s a major pleasure. How many of us will eat something we really like even after we are full? Some of us could easily do that three times a day!
I don’t know if all animals have taste buds, but most organisms must have a method for discriminating for appropriate foods. Doesn’t it make sense that organisms would have a survival advantage if they got immediate feedback about this, rather than having to wait until they have digested the food?
And it doesn’t take much observation to conclude that dogs enjoy food, does it?
There has been a lot of research on this topic and I have yet to find an experiment that had an opposing result. The same effects have been shown with water consumption and even sexual activity (Sheffield, Wulff, and Backer, 1951), both primary reinforcers. And a personal note: I don’t enjoy reading about these unfortunate experimental animals, whose lives were almost certainly short and painful. But since the knowledge has already been gained from the studies, the least we can do is pay attention to it.
The good stuff: dark meat chicken chunks for agility training
So back to food. Even if your dog is extremely hungry and you expect that negative reinforcement will be involved if you use food to train, the studies say that there is also positive reinforcement even in that extreme situation. Taste buds probably don’t turn off when animals are famished.
There’s another aspect that I haven’t seen mentioned in a study but that a smart Facebook friend pointed out. Putting a treat in your mouth and getting pleasure from tasting and swallowing the food are perfectly and immediately correlated. But there is not a one-to-one correlation between each piece of food and a perceptible reduction in hunger. At least in humans, satiation is delayed. If you have a perfect meal in front of you with the exact volume of food and number of calories to fill you up, and you divide it up into 30 bites, your hunger is not assuaged by 1/30th with every bite. The relationship is just not going to be that salient. The diminishment of hunger is probably delayed, and less linear. The positive reinforcement component would almost have to be stronger, which is what the studies with rats found.
Maximizing Positive Reinforcement
If you are still worried about a negative reinforcement component when training with food, it’s easy to address. To minimize (and possibly eliminate) the presence of negative reinforcement effects, how about this: don’t train your dog on an empty stomach. Meaning the dog’s stomach, silly! But it’d probably be good if you have eaten something too. You want to be at your best when training!
Use appropriately sized pieces of good treats and you can be fairly confident that you are training with virtually entirely positive reinforcement (if the behavior increases, of course). If calories are a concern, cut down the next meal. Also, recent research indicates that dogs, just like people, probably learn better when their stomachs are not empty. Makes me feel good that I have almost always given my dogs some of their meal ahead of time to take the edge off before training.
I’ll be writing more about the movement to eliminate the terms “positive” and “negative” from the technical descriptions of the processes of operant learning, and the nature of reinforcement in general. It’s a pretty interesting topic. In the meantime, you folks who are training with food: you can sleep well at night.
Every so often, in the midst of a discussion about operant learning, someone will write,
The quadrants* don’t matter. Talking about the quadrants just confuses people and makes them pay more attention to theory than what is going on in front of them. To be truly humane we need to pay attention to the individual dogs and how they react to each teaching method.
I really wonder about that.
Of course the dog is the arbiter of what is pleasant and what is aversive and to what degree. But how good are most people at reading dogs, really? How many pictures and videos can you find on the Internet in 30 seconds that look roughly like the one below?
Public Domain Image of Unhappy Dog Being Held Too Close By a Human
[If you are new to the blog or new to the concepts or nomenclature of operant learning, you may just want to skip down to the movie. It is an example of what I will be discussing here: using theory to inform our practice. It is particularly geared towards folks to whom the theory is new. Or if you really want to go for it, here is a whole post, including a video, that gives examples of all the processes of operant learning.]
Setting up “the quadrants” and observation of the dog as exclusive from one another is a false dichotomy. That is a rhetorical fallacy that implies that there are only two choices when there are actually more, or implies that two choices are mutually exclusive. It doesn’t have to be either/or, folks, and I will put forth that that attitude can be harmful.
Learning theory and dog body language observation inform each other. Why encourage people to depend on just one and not the other? Why leave a gap in people’s understanding about the processes of learning, certain ones of which have been shown repeatedly in research and real life to have undesirable side effects?
I know that many pet owners who have hired trainers look at them like they have two heads if they start speaking about learning theory. I get that. Clients often just want a method to solve the problem. But when someone is eager to learn and serious about working with their dog, I think it’s a disservice not to share the nuts and bolts of how animals and humans learn. And it’s a disservice to discourage Internet discussions about the processes of learning. Yes, I know they can be tiresome. But just like with any other aspect of humane training, there are always people new to the subject who can benefit.
The more I see people objecting to “the quadrants”, the more I notice that most of them are attempting to veil their use of less humane techniques.
Generalization of a behavior is one of the steps to fluency. One of our ongoing goals with dogs is to help them generalize. So I hope that trainers and teachers would want to help their human students generalize as well. With the humans it’s not only about generalizing behaviors, but about learning concepts and generalizing them as well. Studying the processes of learning and recognizing and naming them helps with this. If negative punishment in one situation stressed my dog out, wouldn’t I want to keep a special eye out for other negative punishment scenarios? Why would I not want that conceptual assistance?
I also know from painful personal experience, and observation of, like, all of YouTube, that reading the body language of a dog and getting past one’s own assumptions is a difficult and time-consuming task. It’s easy for an experienced trainer to say, “Just look at the dog.” But can all students really do that and perceive what the dog is saying? I don’t think my observation skills are below average, but I gotta say, it took me lots less time to get the basics of operant learning processes than it did to learn to read dogs well. I’m still working very hard on that. Being informed by the theory about what kinds of situations to look out for can really make a difference for anyone who is learning to observe and learning the language.
There was a video making semi-viral rounds in the clicker training community recently, often accompanied by comments like, “The power of positive reinforcement!” The video has an adorable, tiny young Yorkie with a bow in her hair doing all sorts of tricks. I saw it posted on a list of thousands of people, and not one person spoke up to discuss the stress signals the dog seemed to be throwing. (Not to mention that some of the tricks might have been physically too demanding for a pup.) Perhaps people were just being polite. I didn’t say anything myself because I had dealt with enough controversy that week, sigh. But the dog did not appear delighted with the training interaction at all.
I’m not linking to the video here, but will send a link privately to any curious folk who make a request through email (sidebar) or a comment.
It sure confirms my doubts about the advisability of just having everybody depend on “watching the dog.”
Examples
The following two stories are true. They both happened to me. One tells how my observation of dog body language led me to analyze and classify the reason for my dog’s stress. After the classification I could be alert to other similarly stressful situations. The other example tells how being informed of what quadrant/process I was using made me question a decision I had made, gain more empathy for my dog, and change my behavior. My (beginner) knowledge of learning processes helped in both cases.
Example #1: From Body Language To Learning and Generalization
My puppy Clara has always loved doing stuff with me and has great attention and a great work ethic. However, I have noticed that shaping can be quite stressful for her. I even wrote a post about shaping and stress. I started thinking about why it might be so. I realized that with my imperfect skills, the changing of criteria was hard on her. Riding the little extinction trails where one version of something ceases to be marked and reinforced and another behavior is desired was quite hard for her.
In the photos, Clara is doing a fast counterclockwise circle, which is a default stress behavior for her. Ironically, the behavior we were working on was, “Relax.” (We’re doing a lot better with that now.)
Clara circling during a shaping session (1)
Clara circling during a shaping session (2)
I have since learned more about shaping and know that if it’s done with careful manipulation of the environment like Skinner suggested and the great trainers can do, there can be much less of this type of stress. I like to think that my skills have improved. Clara has also grown up a little, and doesn’t think the world will end when she doesn’t get clicked.
But my realization that extinction in shaping was hard on Clara made me both more empathetic to her situation and also proactive in avoiding extinction in other scenarios. Her stressed body language made me analyze the cause of the stress, and being able to put a term to it allowed me to learn more about it and look for similar problems in other situations.
(Extinction is not one of the four operant learning processes that people call the quadrants. Extinction is when a behavior that has been previously reinforced ceases to get reinforcement. It is a process that can happen with both operantly learned and classically conditioned behavior. What’s important for my point is that it is a learning method that is often under my control and that I can choose whether or not to employ, and one that can definitely be stressful.)
Example #2: From Quadrants to Empathy
Early in my life with my dog Zani I picked her up and carried her into a public place. She is very friendly and immediately started to struggle to get me to put her down. Since I was just learning about reinforcement, and had learned that what you reinforce is what you get, I decided to hold onto her until she stopped. I didn’t want to positively reinforce her struggling by giving her immediately what she wanted. She finally stopped, I waited a few seconds, then put her down.
I mentioned this episode to my teacher, who said, ah yes, you used negative reinforcement instead! Up until that moment it had not trickled into my head that I had been using a mild aversive. Zani did not want to be held. She was struggling to get away. I had not only hung onto her but had tightened my grip until she figured out that struggling wouldn’t work. (There could be an element of positive punishment in here as well. But the duration of the tight grip, and the requirement for Zani to come up with a different behavior to escape it, even if the behavior was relaxing her body–these indicate that the major process was that of negative reinforcement.)
I grew up spending a lot of time in the country and was around a fair number of small animals and farm animals. Holding or holding down a struggling animal with force was just something you took for granted. You had to do it sometimes “for their own good” and it was something I was absolutely comfortable with. I was 50 years old before I realized that there are things you can do to help prevent you and your animal from getting into this situation in the first place, and ways you can give them more of a say about things. And it was in part because my teacher reminded me of the different processes of operant learning. That reminder gave me empathy for Zani, and it led me not only to work on that specific situation but also to be more aware of negative reinforcement moments in the future.
Education about Learning Theory
Here is an example of the kind of thing that I believe can be helpful. This is a video I made that demonstrates what negative reinforcement can look like, and shows the same behavior trained with negative reinforcement vs positive reinforcement. It is a modest attempt at linking the theory, practice, and dog body language.
I’d be interested to know what the rest of you think. Can we train humanely without knowing learning theory? For me, the theory definitely helps.
Four quadrants of operant conditioning
*NOTE “The quadrants” is not optimal nomenclature in learning theory. I use the term throughout this piece because that is how the argument is almost always stated, and people might not know what I was talking about otherwise. Better nomenclature is “the processes of operant learning.” “The quadrants” is just a description of the shape of the diagram they fit in, as Dr. Susan Friedman points out.
Does Summer’s weave behavior get reinforced by a spray of water?
In the terminology of behavior science, positive does not always mean good. Actually it never means good. Likewise, negative does not mean bad. Also, reinforcement is not always about giving the dog something she wants. And punishment is not always about hurting, intimidating or confining her.