Random or variable reinforcement is a useful procedure in making a given behavior resistant to extinction, for example in the shaping process, when one wants to raise criteria. To go from reinforcing every response to selectively reinforcing stronger responses, you need to develop enough resistance to extinction so that the animal neither changes the behavior instantly upon going unreinforced once or twice, nor quits altogether. Resistance to extinction is also important in maintaining long duration behaviors, as in searches, field trialing, and so on, and can be developed gradually. Bob and Marian Bailey might consider this simply another example of a shaping schedule.
You have asked if reinforcing on a random schedule would help maintain better performance in utility obedience work. The UD (utility dog) work, however, is not a single long-duration behavior. It is a series of behavior chains. In a behavior chain, the behaviors are linked by KNOWN CUES that are, in themselves, reinforcers, mini-clicks, if you will. These cues work as reinforcers for two reasons:
A) they lead to a behavior that is known to earn reinforcement, and
B) they occur (or should occur) during or at the end of the previous behavior, thus reinforcing that behavior.
Each behavior is reinforced by the cue for the next behavior, until you get to the final behavior... (Finish)...and the final, final behavior...(wait at heel in Finish position)...until the judge releases you, when you release your dog and reinforce the whole chain with a big affection display.
To maintain behavior CHAINS you have to be scrupulous about reinforcing the end of the chain continuously, every time you get there. You also have to be scrupulous about maintaining those internal clicks, the cues for the next behavior, as positive reinforcers. How to do that? Be sure to take each unit or pair of units out of the chain and "refresh" their cues' value as reinforcers during practice. Here's where the randomness comes in. Sometimes you could reinforce "Go out, turn, sit" not with a cue for a jump but with a click and a jackpot of food delivered to the dog where he is. Sometimes you might do the directed jump only, placing the dog in the sit and cuing the jump, and clicking and making a big fuss just for the jump, not waiting until the Finish and everything after it is done. Sometimes you'll do the whole chain and then jackpot with affection and toys, and sometimes you'll do the whole chain and jackpot with a neat treat and quit working. So, you might reinforce a wide variety of behaviors and parts of chains, but to keep the dog working you would make sure you ALWAYS reinforce at the end. Here's where a continuous schedule, not a variable schedule, is needed, to maintain performance.
People can wreck an internal cue in a chain, and thus bring about a deterioration in performance, without even knowing they've done it. Here's one way: tell the dog to do something, then if he makes a mistake, instead of just asking again without clicking, scold or punish the dog for doing it wrong. The cue becomes a punishment opportunity, which takes all the pleasantness away from that cue, even if it sometimes also leads to treats. And the dog's performance dwindles, not just on that behavior but on any previous behavior that used to be reinforced by that once-positive cue.
I suspect from the description that this dog really isn't quite sure what it's supposed to be doing, at all times, in the UD ring. The depressed attitude suggests further that it expects trouble—correction—as soon as the rate of reinforcement drops. Reinforcing even less is not a solution, nor is this a problem suited to randomization, since chains are involved.
I'd suggest:
- Increase your rate of reinforcement in practice, by breaking chains into various pieces and being very imaginative and bold with the treats.
- Make sure that the praise, the reinforcer you CAN give in the ring, is in fact reinforcing to the dog. (Pair it with food and games, out of the ring. Some dogs don't like petting, and some don't love praise; even so, a dog can learn that these are conditioned reinforcers. Sometimes a favorite learned behavior can be a ring reinforcer: jumping into your arms, spinning, something. Study the dog.)
- Listen to what the dog is telling you: does it truly understand each cue? Do you withhold reinforcement as a way of rebuking mistakes in what you think the dog "should know"? If the dog is making mistakes, there's something the dog is still confused about. You might want to retrace your training steps to the place where the dog doesn't make mistakes, and then work on clarifying the cue (by giving lots of chances for success).
- Have someone videotape you in practice, in a practice run-through, and in the ring. Is there ANYTHING you do in practice that you don't do in the ring? Anything you do in the ring that you don't do in practice? For example: encouraging chatter in practice; illegal signals and body English in practice; looking away from the dog, or failing to reinforce at least with eye contact, in performance; frowns, complaints to onlookers, signs of disapproval, in practice or performance.
The clearer the information, the more likely the dog will be to do what it's supposed to, and to enjoy doing it, under any circumstance. Laggardly ways during performance suggest a need for more reinforcement, not less, in preparation, and for more clarity, not more ambiguity. Most of all, make sure those cues are clear, trusted, and always good news. Then how much or how little actual physical hotdog appears won't matter.