Is it or isn't it punishment?
You're training "leave it." You drop a bit of food, the dog lunges toward it, and you cover it with your foot. Are you just managing the environment, or is this negative punishment, taking away something desired?
The implication is that if it's negative punishment, then a good clicker trainer shouldn't need to use it. And that raises another question: "How can you train 'leave it' without doing this?" What an interesting discussion!
I have been writing about punishment and about extinction curves in my new book (Reaching the Animal Mind, due out next fall), so I'm happy to address the issue.
All punishers are aversives, but not all aversives are punishers. As I've noted before, Kay Laurence has written amusingly about aversives in the daily lives of her Gordon setters. They fall off the bed, run into door posts, go to the wrong side of a tree and get caught up by the leash accidentally, stub their toes, and so on. Do the dogs learn from these episodes? Not so you'd notice. So your stepping on the food may just be one of life's aversives for your dog, as if the food had fallen through a crack in the floorboards. Oops. Oh well.
A matter of opinion
However, punishment, like beauty, is in the eye of the beholder. Was it a punisher? It depends. There are two tests. The behaviorist's test is, "Did the behavior subsequently become less frequent or go away?" You may not know the answer without numerous repetitions and, as with many punishers, you may not get the outcome you predicted. After a few experiences, instead of giving up lunging, the dog might lunge even faster, to try to beat your foot to the food.
The ethologist's test of whether stepping on the food is an aversive or a punisher in the eyes of the dog is the dog's own behavior: "Did the dog cringe or draw back or, if the dog just hesitated, was the facial expression one of anxiety?" If so, then you know the dog experienced a punisher.
Shaping with variable ratio schedules
Is experiencing punishment an inevitable part of the shaping experience? I don't think so. In shaping, you need to put the organism on a variable ratio schedule so that you can reinforce selectively, choosing one behavior over another. You must build at least a little resistance to extinction, because you don't want the animal to quit every time you fail to reinforce a behavior just once or twice. (That is what was meant by "twofers" way back in the early days—a step in creating tolerance for a shaping schedule. Somehow it got exploded into a fixed ratio schedule as some kind of absolutely necessary maintenance tool, which of course it is not.)
In an organism's very first experience of shaping, even a very small increase in the ratio—a single experience of going from 1:1 to 1:2—can put a naïve learner into an extinction curve. The learner finds it very punishing, too. Think of the fish video where my cichlid faints the first time he fails—when he was sure he would succeed. I've had the same experience with a naïve dolphin (both experiences are included in the new book).
By the time we start training "leave it," the dog is probably way past being horrified at having to try something twice to get paid once; a dog's life is full of variable outcomes and dogs become resilient pretty quickly. So, most of us do our shaping by playing within the bounds of a low ratio variable schedule. We raise criteria slowly enough so that the learner has a good chance of success most of the time.
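For readers who like to see the arithmetic, a low-ratio variable schedule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the cap of three tries per click, and the fixed random seed are all hypothetical choices, and real shaping is driven by criteria rather than by counting tries.

```python
import random


def variable_ratio_schedule(n_reinforcers, max_ratio=3, seed=0):
    """Return one boolean per try: True means that try earns a click.

    Each click arrives after a random number of tries between 1 and
    max_ratio, so the learner never has to try more than max_ratio times
    to get paid once. (max_ratio=3 is an assumed, illustrative value.)
    """
    rng = random.Random(seed)
    tries = []
    for _ in range(n_reinforcers):
        required = rng.randint(1, max_ratio)  # tries needed for this click
        tries.extend([False] * (required - 1))
        tries.append(True)
    return tries


def longest_dry_run(tries):
    """Length of the longest streak of unreinforced tries."""
    longest = current = 0
    for reinforced in tries:
        current = 0 if reinforced else current + 1
        longest = max(longest, current)
    return longest


schedule = variable_ratio_schedule(n_reinforcers=10)
print("tries:", len(schedule))
print("share of tries that paid off:", sum(schedule) / len(schedule))
print("longest run without a click:", longest_dry_run(schedule))
```

With a cap of three, the learner is paid for at least one try in three and never goes more than two tries without a click, which is the sense in which such a schedule stays tolerable.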
On Virginia Broitman's newest DVD, there's an interesting sequence on shaping a papillon to stand on a box (the video clip is also in the Karen Pryor Academy Web Lesson on shaping). Both the trainer and the dog are skilled at the game. The dog is on a 4:1 ratio schedule. He tries something new, gets clicked for it a few times, and then Virginia raises the criteria. Oops, the next try doesn't work, so the dog intensifies or varies the behavior. Yay! Success again for the next few tries, and so on. It's not an arbitrary schedule, of course. Virginia is not counting her clicks. She reinforces everything that meets her criteria, raises her criteria quickly, and the dog wins a good bit of the time, but the behavior keeps changing in the direction of the ultimate goal.
This dog is NOT on an extinction curve. Virginia is NOT letting things get to that point. Indeed, unreinforced behavior drops out and reinforced behavior increases, but with the dog's conscious participation. There's no punishment involved; the dog is having fun.
So, IMHO, it is possible to shape, to shift the parameters of the behavior, within the context of variable reinforcement schedules that are tolerable (the Baileys simply call this a "shaping schedule"). This can be done without shifting into deprivation and punishment or putting the animal into an extinction curve.
Tolerance varies—long schedules versus extinction and stress
You can build into your dog or other learner a tolerance for very long schedules, where it takes many tries to find out how to get a click. Once again, there's a related story in my new book. I was in England, giving a seminar for John Fisher's associates, and I asked if someone wanted to demonstrate the box game on stage. A woman came up with her nice Border collie. The audience chose the behavior—going under a chair. After getting clicked once for interacting with the chair, the dog tried a dozen or more different behaviors related to the chair, to the point where the audience was crying out, "Stop it, the poor dog, that's enough." But the dog seemed calm, and the owner knew its capacity; the decision was hers. Next, the dog poked its nose under the chair, got a click, peered under the chair, click, understood the behavior, skipped its treat, scooted under the chair, and came out with all flags flying! The ratio was challenging, but not too hard for that experienced dog. Every single thing it tried that did not get clicked, it abandoned at once. What we saw was NOT extinction, but a decision, a shift in the search pattern, if you like. The actual behavior—looking for that click—stayed very strong.
You can, of course, make the schedule so demanding that your animal does indeed show stress, even though it is making progress. I have read comments by non-trainers saying that they see people shaping with the dog in a state of real anxiety, lips drawn back, tongue out, stress lines in the cheeks, etc. The simple explanation would be that the criteria have been raised much too fast and the animal is, indeed, on an extinction curve, but I don't know. I've never actually witnessed this myself. When I first read an article by another trainer saying that his shaping consisted of "surfing the extinction bursts," I was shocked. Yes, you could shape that way, but ouch, you'd need to build a very hardened dog. It seemed sad to me that someone had figured out a way to weave punishment into the exciting experience of shaping.
A plan to follow
So how do you draw the line? Let's go back to "leave it." If the animal is lunging for dropped food, then you have raised criteria too fast. Start with dropping boring objects, those that would be investigated only to see what they are. Poker chips and dead leaves are good examples. Here you can easily get a smiling "Oh, okay" drawback from the object in a few clicks. (You shape the drawback before you start saying "leave it," of course. Click for looking at it but not moving toward it, for example.) Proceed to slightly interesting objects, and then to toys. Vary your behavior, the location, the time of day, indoors versus outdoors, and so on, and build a good reinforcement history for that cue. Then go to boring food (lettuce, carrots), ending up with a recall through scattered bits of steak or whatever your ultimate test item is. At what point do you introduce the cue? I don't know. Maybe when the response is well established with the first objects; that's a shaping decision.
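The progression above can be written down as a small sketch in Python. The stage names come from the plan itself; everything else is an assumption made for illustration, including the class name `LeaveItPlan`, the five-try window, the 80 percent success threshold, and the rule of dropping back one stage after a lunge. A trainer reads the dog, not a counter.

```python
from collections import deque

# Stages taken from the plan in the text, easiest first.
STAGES = [
    "boring objects (poker chips, dead leaves)",
    "slightly interesting objects",
    "toys",
    "boring food (lettuce, carrots)",
    "ultimate test item (scattered steak)",
]


class LeaveItPlan:
    """Hypothetical tracker: advance a stage only when recent tries go well."""

    def __init__(self, window=5, advance_at=0.8):
        self.stage = 0
        self.recent = deque(maxlen=window)  # outcomes of the latest tries
        self.advance_at = advance_at        # assumed success threshold

    def record(self, success, lunged=False):
        """Log one try and return the stage to work on next."""
        if lunged:
            # A lunge means criteria went up too fast. Dropping back one
            # stage is an assumed rule, not something the text prescribes.
            self.stage = max(0, self.stage - 1)
            self.recent.clear()
            return STAGES[self.stage]
        self.recent.append(success)
        window_full = len(self.recent) == self.recent.maxlen
        if window_full and sum(self.recent) / len(self.recent) >= self.advance_at:
            if self.stage < len(STAGES) - 1:
                self.stage += 1
            self.recent.clear()
        return STAGES[self.stage]


plan = LeaveItPlan()
for _ in range(5):
    current = plan.record(success=True)
print(current)                                   # moved on after five good tries
print(plan.record(success=False, lunged=True))   # a lunge sends the plan back
```

The sketch leaves out the parts of the plan that do not reduce to a number, such as varying the location and time of day and deciding when to introduce the cue.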
The constructive click
Earning a click is always more powerful than just scarfing up food. I'm sure many of you do the same demo I have often done at ClickerExpo with excitable shelter dogs: clicking the dog for walking at my side, or some other behavior, at a very high rate at first, and tossing the food to the floor sometimes. In five minutes, the dog may be totally focused on making me click, and on the treats that come from the click, though it is wading through dropped hotdogs and bits it missed, without a thought of hoovering.
So, if you have to manage the dropped food by stepping on it, it IS just that, a temporary management tool. If you are doing it repeatedly, to prevent the dog from doing the wrong behavior, you are introducing correction into your process. The animal may view it as an accidental mishap (in which case the behavior may persist), or as information: "Oh, leave it, right, I remember now." Or it may be viewed as punishment; the animal's demeanor will help you decide. Punishment has fallout. One kind of fallout is that if you have punishment in your toolkit, you are much more likely to reach for it again (this is a quote from Kay). Punishment is for stopping behavior, not for building it. Be constructive instead.