Does the Carrot Need the Stick


A dissertation in support of a BA.Hons degree in psychology with biology. Awarded by the University of Luton, 1997

Does the Carrot Need the Stick? Are aversive stimuli an obligatory component in the training and maintaining of behaviours in animals?

John D Dineley 


INTRODUCTION

Historical Overview

THE PRINCIPLES OF OPERANT CONDITIONING

PUNISHMENT

Punishment in the social world of animals 

Punishment as a tool in behaviour modification 
Time-out 
Food deprivation

POSITIVE REINFORCEMENT

Positive reinforcement as an alternative to punishment

THE ETHICS OF ANIMAL TRAINING

DISCUSSION

REFERENCES
 



INTRODUCTION

"We are astonished if we are offered a carrot that is not backed up by a stick" (Murray Sidman, 1989, p. 1)

Humans have interacted with animals and modified their behaviour over many thousands of years. Many species have been domesticated and used for food, clothing, labour, research, education, companionship and entertainment (Soulsby, 1986). Likewise, wild animals have been maintained under human care for many of the above-cited reasons. Fernando (1990) states that Indian elephants (Elephas maximus) were being captured and trained as early as 7 BC. These human-animal relationships have always required some degree of training or behaviour modification of the animals concerned.

In recent times, the issue of animal training and the techniques used (particularly with wild, captive animals) has become controversial due to growing concern for animal welfare and the ideology of animal rights. Claims continue to be made that the training of circus animals must involve "cruelty" (RSPCA, 1990). Regarding the exhibition and handling of animals within zoos and public aquaria, McKenna (1992) cites allegations by former animal trainers that aversive methods such as food deprivation and social isolation were used with animals such as dolphins to motivate compliance.

This dissertation explores whether aversive methods, such as punishment and deprivation, need be applied and what consequences could result from their application. Specifically, are aversive stimuli an obligatory component in the training and maintaining of behaviours in animals, particularly marine mammals such as dolphins? In short, does the carrot need the stick?

Historical Overview 

Until recent times, theories of animal training were not evaluated at a scientific level. However, with the development of the behaviourist school of experimental psychology, the protocols needed to shape and maintain trained behaviours have become well defined. Many people trained in psychology are now actively involved in developing and supervising programmes of animal behaviour modification. This is particularly true in the management and research of marine mammals in public aquaria.

The development of the psychological school of behaviourism can be traced back to the early work of the Russian physiologist Ivan Petrovich Pavlov (1849-1936). Whilst engaged in experiments on the physiology of digestion, Pavlov discovered that his experimental dogs had become conditioned to salivate to a neutral stimulus, the sound of a bell, that had been coincidentally paired with the presentation of food. Moreover, the dogs salivated when they heard this sound in isolation, regardless of the presence of food. What Pavlov had believed to be an autonomic response could therefore be elicited through the simple pairing of food with a neutral stimulus. He further discovered that the conditioned response could generalise to other stimuli, and that the closer the new stimulus was to the original conditioned stimulus, the greater the conditioned response (in this case, salivation). Pavlov referred to this phenomenon as a conditioned reflex because of the effect this learning had on an autonomic process of the dogs (Pavlov, 1927).

John B. Watson (1876-1958) is generally recognised as the founder of the behaviourist school of psychology. Watson led a move away from the introspective techniques developed by workers such as Freud towards the examination of observable data, and modelled his methodology of investigation on the sciences of physics and biology. Further, he considered that psychological investigations of internal mental events were related more to philosophy than psychology and were nothing more than speculation. He believed that only by examining measurable events could one understand behaviour and that unobservable mental events could not be objectively assessed. His research into learning led him to formulate a Law of Exercise, which proposed that an association between a stimulus and a response becomes fixed through repetition (Toates and Slack, 1990).

Connected to the above work of Pavlov and Watson were the studies by Edward Thorndike (1874-1949) on trial-and-error learning in cats and his formulation of the Law of Effect. Thorndike placed cats in puzzle boxes that were constructed to release the cats when they chanced upon pulling a string placed within their reach outside the box. Thorndike noted that when cats were subjected to a series of trials within the box, the time required to pull the string and escape fell on each successive trial. Plotted on a graph, this revealed what has become known as a learning curve. Thorndike proposed in his Law of Effect that an action that results in a pleasurable experience is more likely to be repeated by an organism - in the case of Thorndike's cats, escape from the confines of the experimental box (Toates and Slack, 1990).

B. F. Skinner (1904-1990) expanded upon earlier work on animal learning, such as Thorndike's. Skinner was interested in understanding different aspects of learnt behaviour and how they were established and maintained. He undertook a considerable amount of experimental work, mainly with rats and pigeons, and designed an apparatus called the Skinner box. This allowed accurate measures of stimulus, reward delivery and subject performance. Behaviour within the box could be monitored by instrumentation such as a graph plotter. He termed the interaction between his animal subjects and the environment operant conditioning. The term is derived from the nature of the behaviour observed: animals were operating on their environment under the control of various types and schedules of reinforcement (Skinner, 1938). Kazdin (1989) has described the process as:

"A type of learning in which behaviours are altered primarily by regulating the consequences that follow them. The frequency of operant behaviours is altered by the consequences that they produce" (Kazdin, 1989, p. 343). 

Skinner experimented with different methods of reward delivery and schedules of reinforcement, and with how these affected the behaviour of animals under various conditions. Skinner was a prolific writer and extended his laboratory research into animal behaviour and learning into the development of book-based teaching programmes, which included study guides to his own areas of research (Holland and Skinner, 1961).

Moreover, his theories of learning and behaviour were promoted as being the answer to various social problems (1972). Skinner even wrote a novel called Walden 2 (Skinner, 1961) to illustrate how operant conditioning techniques could facilitate a Utopian society of the future. He further suggested that operant conditioning was responsible for human language acquisition (Skinner, 1957); this claim became the focus of considerable controversy, particularly from linguists such as Chomsky (1959).

Nevertheless, Skinner has remained a leading figure in the history of psychology. He is the only psychologist to date to have had a complete edition of the American Psychologist devoted to his life and research (Dinsmoor, 1992). His techniques of behaviour modification continue to be used in areas from clinical psychology to animal training, the area of interest in this dissertation.

The use of Skinner's operant techniques in behaviour analysis and modification in animals, for reasons other than laboratory research, can be credited to two of Skinner's students, Keller and Marian Breland (Breland and Breland, 1966). They were involved in the practical application of animal training and eventually formed a company called Animal Behaviour Enterprises (ABE). They trained animals and staff for various situations, including some of the seminal developmental work on training protocols for dolphins for the US Navy. This work covered open ocean research, in which animals were trained to be released into the open ocean environment, co-operate in experiments and return to captive care (Bailey, 1965).

Breland and Breland's work was expanded by the opening of the large public aquariums, whose trained dolphin displays incorporated staff and operant techniques developed by ABE (Bailey and Bailey, 1997). These initial successes led zoos and public aquaria to actively seek out and employ graduates from the psychological disciplines. These individuals continued to apply behaviourist techniques in these facilities in areas of public display, husbandry and research (Turner, 1964; Defran and Pryor, 1980).

Operant conditioning was applied to the training of many animal species but was found particularly useful when working with animals such as dolphins. Pryor (1975), a leading promoter of positive reinforcement in animal training, pointed out that whilst terrestrial animals may be trained using negative reinforcement, restraint and coercion, this is not possible with a dolphin, which can simply move away from a trainer into the open water of its pool. Operant techniques, with their reliance on positive reinforcement, were therefore adopted as the method of choice with these animals.

From these pragmatic beginnings, techniques free of aversive negative reinforcement and punishments have been promoted, extended and developed and are now being adopted for training not only dolphins and whales (cetaceans) but other wild and domestic terrestrial animals and birds (Pryor, 1981). These techniques have not only been used for public displays but for environmental enrichment and husbandry procedures such as routine veterinary examinations.

THE PRINCIPLES OF OPERANT CONDITIONING

It is important to understand the basic principles of Skinner's operant conditioning and its terminology when assessing whether aversive stimuli (for example, punishment) are an obligatory component in the training and maintaining of behaviours in animals.

In Pavlov's experiments with classical conditioning, cited above, he presented food, an unconditioned stimulus (UCS), and paired this with a neutral stimulus, e.g., a bell. The conditioning process resulted in the bell becoming a conditioned stimulus (CS). In operant conditioning, the situation and terminology are somewhat different. The key point is that in classical conditioning the animal has no control over the conditioning process: the bell and food appear whether the animal responds or not. In operant conditioning, the reinforcement (e.g. food) is only presented when the animal responds; the correct response is the condition for receiving a reward (Greene and Hicks, 1984). The animal operates on its environment to receive reinforcement.

Reinforcement can be defined in operant terms as a consequence that increases the frequency of a response. It can be positive, such as the presentation of food, or negative, such as the ending of an electric shock. Punishment is not negative reinforcement, as it is used to decrease the frequency of a response.

Sidman (1989) considered that the manipulation of negative and positive reinforcers can act as punishment. The removal of a positive reinforcer is a punisher that can cause a decrease in the target behaviour. The application of a negative reinforcer, such as an electric shock, can be viewed as a punisher if no actions are available to the individual to end the experience, hence the decrease in the target behaviour. If the shock were to be used as negative reinforcement, its application would be designed to give an opportunity for an increase in a target behaviour, e.g. escape from the situation. Moreover, Scarpuzzi, Lacinak, Turner, Tompkins and Force (1991) draw a clear distinction between these two types of punisher, labelling them negative and positive: a negative punisher is the removal of a positive event after a response, and a positive punisher is the presentation of an aversive event after a response.
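The four contingencies distinguished in the two paragraphs above can be summarised in a short sketch. The Python below is illustrative only: the function name classify_contingency and its argument values are assumptions introduced here, not terminology taken from the cited authors.

```python
def classify_contingency(stimulus_change, behaviour_change):
    """Label an operant contingency (illustrative helper, not from the sources).

    stimulus_change:  "presented" or "removed" (what follows the response)
    behaviour_change: "increases" or "decreases" (effect on response frequency)
    """
    if stimulus_change not in ("presented", "removed"):
        raise ValueError("stimulus_change must be 'presented' or 'removed'")
    if behaviour_change not in ("increases", "decreases"):
        raise ValueError("behaviour_change must be 'increases' or 'decreases'")

    # Reinforcement raises response frequency; punishment lowers it.
    kind = "reinforcement" if behaviour_change == "increases" else "punishment"
    # "Positive" means something is presented; "negative" means something is removed.
    sign = "positive" if stimulus_change == "presented" else "negative"
    return f"{sign} {kind}"


# A shock presented after a response that then becomes less frequent:
print(classify_contingency("presented", "decreases"))
```

On this reading, time-out (the removal of a positive event, followed by a decrease in the behaviour) falls under negative punishment.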

Primary reinforcers are elements that do not require conditioning, such as food or water. Secondary reinforcers are reinforcers that have gained this status through conditioning by pairing with a primary reinforcer. Secondary reinforcers, in the form of tones or whistles, have been used extensively in dolphin training as an aid to shaping behaviours when animals are being trained to execute behaviours at a distance.

How does this work in practice? The conditioning of a rat to press a lever for a food reward in a Skinner box will serve as an example. In this instance, food can be termed an unconditioned reinforcer.

When the rat is placed in the Skinner box, the desired goal may be to condition it to press a lever for a food reward. By chance the rat may push the lever and be rewarded with a food pellet, thus setting the conditioning process in motion. However, this behaviour can also be shaped by successive approximation. An operator achieves this by rewarding the rat first for approaching the lever, then for touching it. Each step is a closer approximation to the target behaviour, pushing the lever, hence the term successive approximation. Eventually this will lead the rat to press the lever for a food reward. It can then be said that the rat has learned to operate on a component of its immediate environment for reward (positive reinforcement).
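The logic of successive approximation can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the step names, the STEPS list and the shape function are hypothetical, and the criterion is raised after a single reinforced trial, whereas in practice a trainer would normally wait for several successes before asking for more.

```python
# Hypothetical approximation steps, ordered from easiest to the target behaviour.
STEPS = ["approach lever", "touch lever", "press lever"]


def shape(observed, steps=STEPS):
    """Return (behaviour, reinforced) pairs for a sequence of observed behaviours.

    The criterion starts at the first step. Whenever the animal meets or
    exceeds the current criterion it is reinforced and the criterion moves
    one step closer to the target, so earlier approximations stop earning
    a reward.
    """
    criterion = 0
    log = []
    for behaviour in observed:
        # Behaviours outside the list count as below every criterion.
        level = steps.index(behaviour) if behaviour in steps else -1
        reinforced = level >= criterion
        log.append((behaviour, reinforced))
        if reinforced:
            criterion = min(criterion + 1, len(steps) - 1)
    return log


session = ["sniff wall", "approach lever", "approach lever",
           "touch lever", "press lever", "touch lever"]
for behaviour, reinforced in shape(session):
    print(f"{behaviour}: {'reward' if reinforced else 'no reward'}")
```

Note that the second "approach lever" is not rewarded: once the rat has been reinforced for approaching, only touching or pressing the lever meets the new criterion.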

A contingency that employs negative reinforcement would use, for example, a small electric shock. This would be presented when lever pushing is not undertaken; with either positive or negative reinforcement, the target behaviour, lever pushing, increases. Punishment can be used to stop trained lever pushing by presenting an electric shock when the lever is pushed; the net result is that lever pushing decreases and the behaviour undergoes extinction. Trained lever pushing can also be stopped by the consistent non-reinforcement of the previously reinforced behaviour; the behaviour will undergo extinction due to non-reward.

As cited above, Skinner also experimented with the frequency of reinforcement, delivering rewards for correct actions on fixed or variable schedules. A reward after every correct response was termed a fixed mode of reinforcement. Under a variable schedule of reinforcement, the reward for a correct response is delivered on a random or varied basis. This has been shown experimentally to increase motivation beyond that found with a fixed schedule of reinforcement (Martin and Pear, 1996). Pryor (1985) suggests this may be a component in people's desire to continue playing coin-operated gaming machines, which clearly work on a random or varied schedule of reinforcement.
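The difference between the two modes of reinforcement described above can be illustrated with a minimal sketch. The function names, the mean_ratio parameter and the choice of a uniform random draw are all assumptions made for illustration; they are not taken from Skinner or from the other authors cited.

```python
import random


def fixed_schedule(n_responses):
    """Reward every correct response (the fixed mode described above)."""
    return [True] * n_responses


def variable_schedule(n_responses, mean_ratio=3, seed=0):
    """Reward after a randomly varying number of correct responses.

    The number of responses required for each reward is drawn uniformly
    from 1 to 2 * mean_ratio - 1, so it averages mean_ratio (an assumed
    rule chosen for simplicity).
    """
    rng = random.Random(seed)  # seeded so that a run can be repeated
    rewards = []
    required = rng.randint(1, 2 * mean_ratio - 1)
    count = 0
    for _ in range(n_responses):
        count += 1
        if count >= required:
            rewards.append(True)
            count = 0
            required = rng.randint(1, 2 * mean_ratio - 1)
        else:
            rewards.append(False)
    return rewards


print(sum(fixed_schedule(30)), "rewards for 30 responses on the fixed schedule")
print(sum(variable_schedule(30)), "rewards for 30 responses on the variable schedule")
```

Under the variable schedule the animal cannot predict which response will be rewarded, which is the property Pryor (1985) likens to a gaming machine.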

Another important applied component of animal training derived from Skinner's initial work was the conditioning and presentation of a secondary reinforcer, sometimes referred to as a "bridging stimulus". It was the work of the Brelands that developed this technique in the applied setting (Bailey and Bailey, 1997). The bridging stimulus was a conditioned secondary reinforcer that preceded a non-conditioned or primary reinforcement during the training process or in maintaining a trained behaviour. In the case of a dolphin, the animal would be presented with a stimulus, generally a whistle, which was paired and conditioned with a primary reinforcer, such as a food reward. Once the conditioning process was complete, an animal's behaviour that was being trained or shaped could be precisely reinforced by sounding the whistle; this cue acted as an immediate discriminating indicator of the desired response. The animal would then understand that it had performed the desired response and would return for its non-conditioned primary reinforcer, such as a food reward. This technique allowed for the very fine discrimination of tasks required of the animal in the training and shaping process. It is particularly helpful when behaviours have to be reinforced at a distance from the trainer.

One other important aspect of the training and shaping process was the need to bring trained behaviours under some form of stimulus control; this involves the training of a specific environmental cue that will elicit a trained behaviour. The discriminative stimulus, or SD as it is sometimes termed, is an event or stimulus that signals a certain behaviour will be reinforced. These cues can be anything from a hand gesture to an underwater tone of a specific frequency. Behaviours are brought under stimulus control by the presentation of a chosen specific signal during or just after the desired behaviour. After a number of trials, the SD becomes conditioned and its presentation will elicit the desired behaviour.

PUNISHMENT 

Punishment in the social world of animals

When examining aspects of behaviour control and modification it is perhaps useful to review information derived from research on animals' behavioural interactions in nature. As regards the uses of punishment for behavioural control, Clutton-Brock and Parker (1995) reviewed current theories of punishment as a means of behaviour modification in animal societies. They point out that while much work has been undertaken to study reciprocal altruism and aggression in these groups, little attention has been paid to negative reciprocity, e.g. punishment. They suggest that these activities evolved alongside behaviours such as reciprocal altruism because they increase fitness, defined in evolutionary biology as lifetime reproductive success relative to other members of the same population.

In these social interactions of animal societies, they list the following contexts where punishing behaviour is undertaken: the establishment and maintenance of dominance relationships; theft, parasitism and predation; the establishment of mating bonds; parent/offspring conflict; and the enforcement of co-operative behaviour.

One example of the above-cited behaviour is the establishing and maintaining of dominance within a social group. This is prevalent in social primates such as chimpanzees (Pan troglodytes). Here social hierarchies are dynamic and dominant or alpha males can be subject to periodic challenges by subordinates. The group's social dynamics can be affected by attacks directed not only at the alpha males but also at these individuals' coalition members. Clutton-Brock and Parker cite de Waal's (1989) research, which demonstrated how a young male chimpanzee challenged for and won dominance of a group by regularly threatening and undertaking punishing attacks on his rivals' female supporters. Ultimately this led to these females transferring their support to the challenger. On achieving his goal of dominance, the new alpha male discontinued his punishing, aggressive behaviour towards his new female supporters.

Unfortunately, it can be seen from this evidence that behaviour modification techniques used within the social dynamics of wild animal groups would have very limited value when developing appropriate strategies for behaviour in the applied setting of animals being trained in captive care. Here an important issue is the relationship between the handler and animal, which is very different from many of those found between a wild animal and its conspecifics.

Punishment as a tool in behaviour modification

Kazdin (1989) points out that punishment can often result in undesirable side effects. One key effect is escape or avoidance behaviour with the individual acting to avoid the aversive stimuli. If avoidance or escape is achieved, such behaviours are maintained due to negative reinforcement; the successful escapee removes the punishing agent and hence is rewarded by its removal.

Moreover, a drawback with aversive stimuli, such as physical punishment, is that they can lead to aggression. Hutchinson (1977) demonstrated that aggression could be a by-product of aversive control. He observed punished animals direct aggressive actions at other proximate animals or at the punishing agent, which, in most cases, is the individual administering the punishment.

Physical punishment can also have the undesired effect of negatively reinforcing the individual who initiates the punishment by apparently resolving a perceived problem, thus leading to the perpetuation of such behaviours. It has been suggested by workers such as Timberlake (cited in Kazdin, 1989) that in human society parents who use harsh punishment on their children may also increase the possibility of their children using similar tactics themselves to resolve problems.

Turner and Tompkins (1990) point out that punishment can cause an animal to adopt behaviours that will disguise precursor behaviours that signal a possible aggressive interaction is about to take place. Such precursor behaviours play an important role in animal societies, as they will give an indication of intentions to take aggressive actions against a targeted individual. In many instances, these precursor behaviours may result in animals disengaging from an actual attack on each other. 

These phenomena can be considered a product of animal adaptation and based on the concept of animals assessing the cost and benefit of such actions. In the instance of aggression, an animal that is about to undertake an aggressive attack must assess that whilst its actions may lead to the resolving of a perceived problem, for example, escape, it may also end with the attacking animal sustaining an injury that may become life-threatening. 

Many conflicts in the natural world involve stepped rituals of aggressive interaction that precede actual attack. This has been studied in a number of species, for example in the sexual conflicts of red deer during the breeding season or rut (Clutton-Brock, Albon, Gibson and Guinness, 1979). These authors noted that male deer would go through a series of precursor behaviours, such as roaring and parallel walking with an opponent, before ultimately attacking and sparring with antlers. Further, a number of attacks may not transpire and will be terminated by one of the animals during the above-cited displays of roaring or parallel walking. Such behaviours could be considered important in assessing the cost and benefit of continuing a certain behaviour and the risks of personal injury if actions are allowed to continue to actual conflict.

In the specific use of physical punishment and negative reinforcement with animals such as dolphins, Pryor (1975) maintains that such actions are unworkable at even a basic practical level. Caldwell and Caldwell (1972) state that positive reinforcement (using food reward) appears to be the only method used to successfully train dolphins. In the only case of which they are aware of a dolphin being heavily punished (actually beaten), the animal became very aggressive. Whilst they point out that these animals must be subject to negative reinforcement in the wild, there is some evidence that when applied in a training situation this could lead to neurosis.

Time-out

As cited above, Pryor (1975) states that aversive methods of training are virtually unavailable when training animals such as dolphins, with physical restraint or forms of corporal punishment being inappropriate not only on ethical but also on practical grounds. Quite simply, unlike many other animals in a training situation, dolphins and whales have the ability to move away into the middle of their pools, out of reach of the trainer.

Therefore, if any form of punishment is applied to marine mammals, the most widely used method is time-out. The principle involved in time-out is the removal of positive reinforcement. In the case of animal training, this may involve the trainer removing themselves from direct association with the animal. Time-out is one of the most useful non-aversive tools that can be used as a form of punishment, and it has been successfully applied more widely in other animal species, including humans (Martin and Pear, 1996). Kazdin (1995) refers to it as time-out from reinforcement, the removal of a positive reinforcement for a period of time.

Nonetheless, Sidman (1989) sounds a note of caution in the application of even this punishment, particularly when time-out involves an individual being physically removed from the positive reinforcing environment to a supposed non-reinforcing environment. He observes that the attention given during the move to the new non-reinforcing environment, or the new environment itself, may be equally or more reinforcing. This will result in the time-out becoming a positive reinforcer. Further, McKenna (1992) cites the allegations of a former animal trainer of the apparently consistent misuse of time-out through the isolating of uncooperative animals (dolphins) in holding-pens as a form of punishment.

Scarpuzzi et al. (1991) detail a technique developed at the Sea World Parks in the United States entitled LRS (Least Reinforcing Stimulus). The authors developed this method of training due to their firm belief that any form of punishment will result in the above-cited dangers and drawbacks, such as avoidance and escape behaviours, frustration and aggression in the animals being trained.

The technique of LRS involves presenting an animal that performs an incorrect behaviour with a passive, calm two to three-second pause at the point the trainer would normally deliver a primary or secondary reinforcement. The methodology was formulated to develop the extinction of undesired behaviour without recourse to either negative reinforcement or punishment and their resulting side effects. 

Food deprivation

McKenna (1992) cites allegations by two former animal trainers that food deprivation was a common form of punishment for non-compliance in trained dolphins in marine parks and aquaria.

However, other published reports and accounts on this subject do not seem to support these allegations. Breland and Breland (1966) state that:

"One of the most significant traits that the porpoise shares with the cow in terms of training potential concerns the relation between food deprivation and vigour or intensity of behaviour. It has been repeatedly observed at such aquaria as Marineland of Florida and Marineland of the Pacific, where extremely vigorous or intense behaviour is sometimes required of the porpoise - such as a 16-foot-high jump out of the water - that marked increases in food deprivation do not markedly increase the vigour or intensity of the response" (Breland and Breland, 1966, p. 81).

Wood (1973) maintained that reducing or withholding food from dolphins only had a limited effect on their response to trained commands, and Pryor (1975) points out that food deprivation is unnecessary when training dolphins. 

Specific research into the use of food deprivation and its effects on motivation to perform has been undertaken by Nachtigall (1976) using three bottlenose dolphins (Tursiops truncatus). The research studied the amount of food consumed and the rate of response to a trained task when various food deprivation schedules were applied to these three animals. The animals being studied were trained to press a paddle for a food reward. 

It was found that food deprivation of one, two and four days' duration did not affect the experimental animals' level of paddle pressing. Further experimentation did, however, reveal that paddle-press response rates increased following a drop in the animals' body weight. Nachtigall concluded that the hunger drive in a dolphin could only be successfully manipulated by changes in body weight.

It should be further noted that the use of food as a (primary) reinforcer is not an obligatory tool in the training and behaviour modification process. Pryor (1973) observed that reinforcement other than food can be used and cites access to favourite toys and stroking as alternatives.

The use of non-food reinforcement in the training of dolphins and other marine mammals has been further developed by the introduction of a schedule of reinforcement named Random and Interrupted Reinforcement (RIR) (Brill, 1981). One of the components of this technique allows the use of additional non-food reinforcements. Van der Toorn (1987) has pointed out that a training protocol involving variable reinforcement, such as tactile stimulation, vocalisations, play or even trained behaviours, is particularly useful when wishing to examine animals that may be unwell and do not respond to food reward due to illness-induced anorexic behaviour.

POSITIVE REINFORCEMENT

Positive reinforcement as an alternative to punishment

Reward or positive reinforcement has been habitually used in animal training and general behaviour modification for many years. Kiley-Worthington (1990) noted in her study of the behaviour and welfare of circus animals that even "traditional" methods of animal training and behaviour shaping relied heavily on components that involved positive reinforcement in taming and training the animals. She cites the use of food rewards in all aspects of the taming and training process of animals as diverse as lions and elephants.

Nonetheless, the introduction and application of the laboratory derived research into learning using operant conditioning has its contemporary origins in the work of the company formed in the United States in the nineteen-fifties called Animal Behaviour Enterprises. As cited above, this company was developed by two psychology graduates and former students of B.F. Skinner, Keller and Marian Breland. 

Breland and Breland began to apply training techniques derived from behaviour and learning research undertaken by Skinner and others in the laboratory with animals such as rats and pigeons. They were also keen to promote the concept of behaviour modification by positive reinforcement and not punishment. Their company had considerable success and they claimed that this was the result of being able to train feats that traditional coercive training methods could not achieve. They had a client base that extended from animal training for advertising and entertainment to the provision of staff and animals for zoological displays and behaviour research (Breland and Breland, 1954). By the early nineteen-sixties, they had trained over 6000 individual animals from thirty-eight animal species.

The Brelands' involvement with the US Navy in training dolphins was one of the avenues through which operant conditioning and positive reinforcement techniques found particular application in the training of cetaceans in both public zoological exhibits and research. Norris reviews the history of the research and display of these animals and their involvement as zoo exhibits and research subjects in areas as diverse as bioacoustics and cognition (Norris, 1991).

The most recent published promoter of the use of positive reinforcement for the training of animals is the zoologist Karen Pryor. With her former spouse, Tap Pryor, she founded the public aquarium Sea Life Park in Hawaii in 1963. The park was set up not only as a place of public recreation and entertainment but also as a scientific resource. Pryor became the park's first director of animal training, a role that involved the behaviour modification of various species of dolphin and whale. The ethos for the development of training protocols using Skinnerian operant conditioning came from an initial training manual by the psychology graduate Ron Turner. Turner had been involved in training dolphins to co-operate in research with the biologist Dr Ken Norris (Pryor, 1975; Norris, 1974).

Both Breland and Breland (1954, 1966) and Pryor (1975, 1985) have published basic protocols for training using positive reinforcement for shaping and maintaining behaviour. They have also promoted the use of positive reinforcement, rather than punishment, to modify misbehaviour. Pryor (1995) cites the technique of training an incompatible behaviour that competes with the target misbehaviour. One example she mentions is that of a dolphin that began to hassle a female swimmer during a public demonstration in a marine park. The problem was solved by asking the animal to push a lever for reward whilst the swimmer was in the water. The animal could not perform both behaviours at the same time and found the lever pressing more reinforcing, so the harassment ceased.

Kazdin (1989) cites variations of the above techniques that have been successfully employed: Differential Reinforcement of Incompatible behaviour (DRI), the reinforcing of a behaviour that competes with the unwanted behaviour, as detailed above by Pryor (1995); and Differential Reinforcement of Other behaviour (DRO), the reinforcing of any response except the unwanted behaviour. In both cases the unwanted behaviours, because they are no longer reinforced, will cease over time, undergoing the process labelled extinction.

Finally, the use of positive reinforcement in behaviour shaping and training has led to some very interesting work on the generation of creative and spontaneous behaviours in dolphins. Pryor, Haag and O'Reilly (1969) trained two rough-toothed dolphins (Steno bredanensis) to emit novel responses that were not developed by standard shaping protocols. The animals were reinforced for producing a different novel response to the same trained cue or discriminative stimulus (SD) in a series of training sessions. The animals further demonstrated behaviours that had not previously been seen in this species.

THE ETHICS OF ANIMAL TRAINING

The use of animals by humans has been an area of some controversy. The training of animals, placing them under control, is anathema to many people, particularly on the grounds that the use of some training methods is perceived to compromise the welfare of the animal.

Kiley-Worthington (1990) explored this area of the training and handling of animals whilst commissioned to research the welfare of animals trained in circuses and zoos. She believed there are many positive aspects of the relationship that can be formed between empathic trainers and their animals. Even in the traditional environment of the circus, she found that many trainers had a clear understanding of the important role of positive reinforcement in eliciting co-operation from their animals. She also noted that many realised the limited role of negative reinforcement and of the use of dominance in training, with its inherent dangers to both animals and trainer. She states:

"...there is no reason why circus training, any more than any other animal training, of its nature causes suffering and distress to the animals, or should be considered ethically unacceptable" (Kiley-Worthington, 1990, p. 142).

One of Kiley-Worthington's most interesting arguments for the training of animals is the role of training as occupational therapy for animals in captivity. The philosophy behind this argument is that many animals in captive environments are restricted from undertaking a number of behaviours such as foraging for food, which could involve hunting in the case of predators. Training could help to stimulate animals not only physically but also intellectually.

This is not necessarily a new idea and has been proposed by workers involved with zoo animal biology and welfare, such as Hediger (1959), since the early nineteen-fifties. The concept has been further developed through the design of zoo exhibits, and of protocols on farms and in laboratories, that involve "environmental enrichment" (Monaghan and Wood-Gush, 1990).

Environmental enrichment design can range from providing orang-utans with puzzle feeders that require the use of tools (lengths of bamboo) to remove food items from puzzle boxes (Seymour and Shepherdson, 1990), to the training of farm animals in operant techniques to adjust their immediate environment, such as pigs choosing between access to concrete or earthen floors (van Rooijen, 1983). Giving animals choices in the selection of their preferred environment is one of the important criteria used by Dawkins (1980) in researching ways of evaluating animal welfare.

Laule (1992) specifically addresses the issue of training as a tool in environmental enrichment. She considers that it is important to the psychological well-being of animals not only in the public trained display, where it will increase the animals' overall activity level and provide daily diversity, but also in the training of behaviour to assist in husbandry and veterinary procedures, such as blood testing and other intrusive procedures that may otherwise involve the risks of physical restraint or anaesthesia. Further, it offers the chance of effective therapeutic intervention for animals that have developed problematic, aggressive, social, neurotic and stereotypic behaviours, a benefit that should not be ignored.

Nonetheless, some critics may still maintain that training methods are themselves unnatural and for this reason remain ethically unacceptable. However, it should be recalled that the techniques developed to train animals using operant conditioning were derived from investigations into learning and how natural behaviour is shaped. This research was an effort to describe and codify processes that were already taking place within society and the natural world. Therefore, the use of operant conditioning as a learning process is the application of behaviour modification techniques that are consistently used in nature when animals interact with their wild environments. Further, Sidman (1989) points out that the natural world is very coercive and employs considerable forces to motivate survival behaviour in animals via positive and, in many cases, negative reinforcement and punishment; an animal that cannot adapt to such measures will not survive. Animals maintained in captive environments, particularly those that apply the non-coercive operant techniques cited in this dissertation, are exposed to such low levels of stress and discomfort that there is no real comparison to the rigours they would have to bear in the natural environment.

DISCUSSION

As has been mentioned, humans have been involved in modifying the behaviour of animals for centuries. In recent years, the use of animals by humans has become more controversial due to concerns for animal welfare and the growth of animal-rights ideologies. The use of wild animals has become a particular area of debate. Training animals and modifying their behaviour has also been of concern to a number of groups and individuals concerned with animal welfare. It has also been commonly asserted that some form of aversive coercion, such as punishment or negative reinforcement, must have been used to "make" animals comply with their trainer's wishes.

Criticism is particularly high when training methodologies are applied to animals in entertainment in circuses or public training displays in zoos and marine parks. Ironically, it was the development of marine aquaria with displays of trained marine mammals, particularly dolphins, and the employment of psychologists skilled in behaviour analysis, that produced training protocols that were for the first time actually based on laboratory observation of learning and behaviour modification. With these developments, it became possible to evaluate whether aversive stimuli, such as punishment, are an obligatory component in the training and maintaining of behaviours in animals.

There is, of course, no doubt that aversive stimuli, such as punishment and negative reinforcement, are employed with various degrees of success in the wild. The analysis of the various functions of punishment in wild animal societies, as reviewed by Clutton-Brock and Parker (1995), has been presented, and it is clear that animals do use various forms of aversive and punishing behaviours within their social lives.

However, although it is often maintained that animals in the care of humans, particularly wild animals, should be provided with an environment that meets the psychological and physiological needs of their wild conspecifics (Hediger, 1970), the application of such naturally occurring aversive strategies to the behaviour modification of animals in the training situation of a zoo or marine park would appear to be unwise and inappropriate.

If one examines the above-cited review of social punishment in the wild by Clutton-Brock and Parker (1995), it should be noted that these are aggressive, punishing behaviours directed from dominant animals to subordinates. These behaviours, in turn, set in motion a dominance hierarchy. In the natural world this is a dynamic situation. Over time, any current dominant animal will face the threat of physical displacement or attack, either by direct means or through the undermining of its supporters. Eventually the dominant animal's position will be taken from it, a process that can involve punishing aggression and violence (Manning and Stamp-Dawkins, 1992). Such a precarious and transient situation would clearly not be the basis of the long-term empathic relationship that is required in a training situation. As pointed out by Turner and Tompkins (1990), the application of punishing contingencies by a trainer may suppress the behavioural cues that an animal would normally use to signal that it is about to initiate an attack. In short, the use of punishment in this context would appear to place trainers working with large and powerful animals in great danger of attack and of serious or fatal injury.

Further, even if a trainer could maintain dominance over the animal, punishment also has other behavioural implications due to the creation of unwanted avoidance and escape behaviours. In addition, "emotional" responses such as fear, frustration and ultimately aggression can be produced.

Another major problem with the reliance on punishment to control behaviour is that it is a non-productive technique. Whilst it may remove an undesirable target behaviour, it will not address the development of alternative and more acceptable behaviours. Therefore, it can be concluded that punishment, at best, appears only to terminate (or reduce) an undesired behaviour, and its use in modifying and extending behavioural repertoires is limited. Further, any modifications of behaviour that may take place as a result of punishment, beyond the target behaviour, may also have serious and dangerous repercussions, as outlined above in the discussion of the dangers of creating a dominance hierarchy within the training context.

Moreover, punishment applied without being supplemented by the training of a benign or acceptable behaviour can result in one unwanted and inappropriate behaviour being replaced by another unwanted and inappropriate behaviour. 

Clearly, punishment in isolation can present as many problems as it is supposed to solve. Therefore, one has to conclude that whilst punishment may bring results in suppressing unwanted behaviours in the short term, there are serious long-term repercussions for both the animal and trainer.

Does this totally rule out the use of punishment or negative reinforcement? Not completely. Methods such as time-out can be used with some degree of success if they are low-key and mild. In the instance of misbehaviour from a dolphin, turning one's back for a brief moment on the misbehaving animal can be very effective. The development of techniques such as LRS (Least Reinforcing Stimulus) shows a promising extension of this technique (Scarpuzzi et al., 1991). However, the physical removal and isolation of animals from reinforcement appear to be counter-productive. Moreover, as Sidman (1989) has pointed out, such removal in itself may inadvertently become a positive reinforcement rather than the desired punishment.

Nevertheless, and as mentioned previously, it is important to recognise that punishments need to incorporate some form of guidance toward appropriate behaviour, which may involve the further shaping of an alternative behaviour by positive reinforcement. Strategies other than punishment, such as the training of an incompatible behaviour that competes with the target misbehaviour, appear to have successful applications as alternatives to punishment.

One thorny issue that has been detailed is the use of food deprivation as an aversive stimulus to promote motivation and aid compliance. This is of particular concern when training animals like dolphins, as this form of aversive stimulus can easily be applied even within the logistics of their aquatic life-style, a situation that makes some other forms of punishment difficult to apply. However, although allegations of food deprivation in training have been made (McKenna, 1992), a number of workers (Breland and Breland, 1966; Wood, 1973; Pryor, 1975) have observed that these animals do not appear to be affected by, or indeed to need, such coercion. Nachtigall (1976) demonstrated experimentally that up to four days' food deprivation does not appear to motivate an increase in the performance of a trained task. He did, however, find that body-weight parameters seem to be more important when manipulating rates of motivation.

This leaves open the possibility that a dolphin's motivation could be controlled by body-weight manipulation. Therefore, it could be suggested that animals could be maintained at unnaturally low weights to improve motivation. However, even among opponents of keeping dolphins in zoos or aquaria there seems to be a dispute over this issue, with some suggesting that captive dolphins are overweight due to the restriction of their environment (O'Sullivan, 1992). Such comments may in turn suggest that training programmes should be implemented to ensure the animals obtain adequate exercise.

As can be seen above, aversive stimuli such as punishment and negative reinforcement can be used to modify the behaviour of animals both within animal societies in the wild and as a tool for behaviour modification. However, there are a number of problems with many of the methods that have been cited, and in most, if not all, cases their application should be discouraged on both practical and ethical grounds.

It has been shown that, with the intelligent use of operant conditioning, it is possible to produce behaviour modification and training techniques that appear to be free of aversive coercion.

An interesting example is the cited work of Pryor et al. (1969) on the development of novel and creative behaviours in dolphins. It is hard to imagine how such behaviour could have been developed using aversive methods, with the behavioural restriction they impose. As has been pointed out, punishment tends to be very restricted in its scope to develop any form of new behaviour. Its prime purpose is to decrease and eliminate undesired target behaviour. As has been noted above, when it appears that punishment, even in a very mild form, has to be used, it is recommended that it be supported by also training an alternative desired behaviour.

Finally, evidence has been presented that training animals, when they are maintained in the care of humans, can be of benefit to their physiological and psychological well-being (Kiley-Worthington, 1990; Laule, 1992). The development of environmental enrichment techniques in zoos, farms and other establishments is an on-going subject of both research and application (Monaghan and Wood-Gush, 1990). Moreover, when given the opportunity, animals can be trained to exercise choice in their environmental conditions (van Rooijen, 1983).

Training in handling and husbandry can provide effective ways of veterinary and health inspection without the need for physical or chemical restraint.

It is clear that the intelligent and empathic use of training techniques that owe much to their beginnings in the psychological laboratories of Skinner and his colleagues can be of benefit to the animals being trained, to the humans that train them and to the general public. In the public domain, opportunities to use animals in public demonstrations in zoos and parks to promote educational, conservation and environmental information are being developed but have yet to reach their full potential.

The opportunities for further research in animal training and behaviour modification are vast. For example, with imaginative designs, farm and other animals could be trained to answer questions of preference regarding temperature, accommodation and other environmental parameters. Many zoos and wildlife parks have a commitment to conservation. This may require the reintroduction of endangered species to wild habitats, which will in turn require the intelligent application of training and behaviour modification to the candidate animals to ensure their best chances of rehabilitation and survival.

In conclusion, the question originally asked was: does the carrot need the stick? It seems that the answer should be no. The research presented here demonstrates that enough information and protocols currently exist for there to be no excuse for using aversive and abusive techniques on animals to obtain their co-operation when they are asked to interact with humans.

REFERENCES

Bailey, R. (1965). US Navy Technical Publication 3838: Training open ocean release of an Atlantic bottlenose dolphin Tursiops truncatus (Montagu). Hot Springs, Arkansas: Animal Behaviour Enterprises, Inc.

Bailey, R. and Bailey, M. (1997) Personal communication.

Brill, R. (1981). R.I.R. in use at Brookfield Zoo: Random and Interrupted Reinforcement redefined in perspective. In J. Barry and R. Brill (Eds) Proceedings of the 1981 International Marine Animal Trainers Association Conference. San Diego: International Marine Animal Trainers Association.

Breland, K. and Breland, M. (1954). The new animal psychology. The National Humane Review. March.

Breland, K. and Breland, M. (1961). The misbehaviour of organisms. American Psychologist. 16. 681-684.

Breland, K. and Breland, M. (1966). Animal Behavior. London: Macmillan Co.

Caldwell, M.C. and Caldwell, D.K. (1971). Behaviour of marine mammals. In S. H. Ridgway (Ed) Mammals of the sea: Biology and medicine. Springfield, Illinois: Charles C. Thomas Publishers.

Chomsky, N. (1959). Review of Skinner's "verbal behaviour". Language, 35, 26-58.

Clutton-Brock, T.H. and Parker, G.A. (1995). Punishment in animal societies. Nature, 373, 209-216.

Clutton-Brock, T.H., Albon, S.D., Gibson, R.M. and Guinness, A.B. (1979). The logical stag: adaptive aspects of fighting in red deer (Cervus elaphus). Animal Behaviour, 27, 211-225.

Dawkins, M.S. (1980). Animal Suffering. London: Chapman and Hall.

Defran, R.H. and Pryor, K. (1980). Social behaviour and training of eleven species of cetacean in captivity. In L. Herman (Ed), Cetacean behaviour: mechanisms and functions. New York: Wiley-Interscience.

Dinsmoor, J.A. (1992). Setting the record straight: the social view of B. F. Skinner. Special issue: reflections of B. F. Skinner and psychology. American Psychologist, 47, 11.

Fernando, S.B.U. (1990). Training working elephants. In T.P. Poole (Ed) Animal training: A review and commentary on current practice. Potters Bar: Universities Federation for Animal Welfare.

Green, J. and Hicks, C. (1984). Basic cognitive processes. Milton Keynes: Open University Press.

Hediger, H. (1959). Psychology of animal in zoos and circuses. London: Butterworth.

Hediger, H. (1970). Man and animal in the zoo: Zoo biology. London: Routledge and Kegan Paul.

Holland, J.G. and Skinner, B.F. (1961). The analysis of behaviour. New York: McGraw-Hill.

Kazdin, A.E. (1989). Behaviour modification in applied settings. Pacific Grove, California: Brooks/Cole Publishing.

Kiley-Worthington, M. (1990). Animals in circuses and zoos: Chiron's world? Pitsea: Little Eco Farms Publishing.

Laule, G. (1992). Addressing psychological well being: Training as enrichment. The Shape of Enrichment, 1, 2, 11-12.

Martin, G. and Pear, J. (1996). Behaviour modification: What it is and how to use it - fifth edition. London: Prentice-Hall International.

Manning, A. and Stamp-Dawkins, M. (1992). An introduction to animal behaviour. Cambridge: Cambridge University Press.

McKenna, V. (1992). Into The Blue. London: Harper-Collins.

Monaghan, P. and Wood-Gush, D. (1990). Managing the behaviour of animals. London: Chapman and Hall.

Nachtigall, P.E. (1976). Food-intake and food-rewarded instrumental performance in dolphins as a function of feeding schedule. Unpublished dissertation: University of Hawaii.

Norris, K. S. (1974). The porpoise watcher. London: John Murray.

Norris, K.S. (1991). Looking at captive dolphins. In K. Pryor and K.S. Norris (Eds) Dolphin societies: Discoveries and puzzles. Oxford: University of California Press.

O'Sullivan, M. (1992). Into The Blue response. Turks and Caicos Free Press, October 10.

Pavlov, I.P. (1927). Conditioned reflexes: an investigation of the physiological activity of the cerebral cortex. New York: Dover.

Poole, T.P. (Ed) (1990). Animal training: A review and commentary on current practice. Potters Bar: Universities Federation for Animal Welfare.

Pryor, K., Haag, R. and O'Reilly, J. (1969). The creative porpoise: training of novel behaviours. Journal of the Experimental Analysis of Behaviour, 12, 653-661.

Pryor, K. (1973). Behaviour and learning in porpoises and whales. Naturwissenschaften, 60, 412-420.

Pryor, K. (1975). Lads before the wind: Adventures in porpoise training. New York: Harper and Row.

Pryor, K. (1981). The rhino likes violets. Psychology Today. April, 92-98.

Pryor, K. (1985). Don't shoot the dog: The new art of teaching and training. New York: Bantam.

Rooijen, J. van (1983). Operant preference tests with pigs. Applied Animal Ethology, 9, 87-88.

RSPCA (1986). Animals in circuses. Horsham: Royal Society for the Prevention of Cruelty to Animals.

Scarpuzzi, M.R., Lacinak, C.T., Turner, T.N., Tompkins, C.D. and Force, D.L. (1991). Decreasing the frequency of behaviour through extinction: An application for the training of marine mammals. In S. Allen (Ed) Proceedings of the 1991 International Marine Animal Trainers Association Conference. San Diego: International Marine Animal Trainers Association.

Seymour, S. and Shepherdson, D. (1990). Environmental enrichment report no. 3: Puzzle feeder for orang-utans. Potters Bar: Universities Federation for Animal Welfare and Zoological Society of London.

Sidman, M. (1989). Coercion and its fallout. Boston, MA: Authors Co-operative Inc.

Skinner, B.F. (1938). The Behavior of organisms. New York: Appleton-Century.

Skinner, B.F. (1957). Verbal Behavior. New York: Appleton-Century-Crofts.

Skinner, B.F. (1961). Walden Two. New York: Appleton-Century-Crofts.

Skinner, B.F. (1971). Beyond freedom and dignity. Harmondsworth: Penguin.

Soulsby, E.J.L. (1986). Animals in society: A veterinary viewpoint. Potters Bar: Universities Federation for Animal Welfare.

Toates, F. and Slack, I. (1990). Behaviourism and its consequences. In I. Roth (Ed.) Introduction to psychology. Milton Keynes/Hove: Open University/Lawrence Erlbaum.

van der Toorn, J. (1987). The importance of variable reinforcement training for dolphin husbandry. In F. Krajniak and M. S. Bryant (Eds) Proceedings of the 1987 International Marine Animal Trainers Association Conference. San Diego: International Marine Animal Trainers Association.

Turner, R. N. (1964). Methodological problems in the study of behaviour. In W.N. Tavolga (Ed), Marine bioacoustics. Oxford: Pergamon Press.

Turner, T.N. and Tompkins, C. (1990). Aggression: Exploring the causes and possible reduction techniques. Soundings. 15, 2.

de Waal, F.B.M. (1989). Peace making among primates. Cambridge: Harvard University Press.

Watson, J.B. (1924). Behaviourism. Chicago: University of Chicago Press.

Wood, F.G. (1973). Marine mammals and man. New York: Robert B. Luce Inc.



This website and its content is copyright of John Dineley © John Dineley 2010. All rights reserved.