Forensic "Lie Detection": Procedures Without Scientific Basis


The following copyrighted article is reproduced here by permission of The Haworth Press. The citation for this article is Iacono, William G. "Forensic 'Lie Detection': Procedures Without Scientific Basis," Journal of Forensic Psychology Practice, Vol. 1 (2001), No. 1, pp. 75-86. Page numbers are indicated in curly braces for citation purposes.

{75}

Forensic "Lie Detection":
Procedures Without Scientific Basis

William G. Iacono, PhD

ABSTRACT. This paper provides a critical overview of the scientific status of the control question test (CQT), the type of polygraph test most likely to be used in forensic settings. The CQT is based on an implausible set of assumptions that makes it biased against innocent individuals and easy for guilty persons to defeat using countermeasures. Due to serious methodological problems that characterize research on CQT validity, it is not possible to use the existing literature to provide a satisfactory error rate estimate. Scientists, including members of the Society for Psychophysiological Research and APA Fellows, hold negative views about the CQT. They do not believe that it is based on sound theory, that it has adequate psychometric properties, or that it should be used as evidence in court. [Article copies available for a fee from The Haworth Document Delivery Service: 1-800-342-9678. E-mail address: <getinfo@haworthpressinc.com> Website: http://www.HaworthPress.com]
KEYWORDS. Forensic, lie detection, Control Question Test, polygraph

Although there are several types of "lie detector" tests, psychologists are most likely to encounter use of the control question test {76} (CQT) in a forensic setting. The CQT is not so much a standardized test as it is a collection of procedures that combine interview techniques with physiological recording. These procedures vary in the nature of the interview, how questions are developed and ordered, and how the results are scored and interpreted. All CQTs, however, whether referred to as zone of comparison tests, modified general question tests, or directed lie tests, have certain elements in common that make it possible to evaluate them as a group. The aim of this article is to familiarize readers with this type of polygraphic interrogation and to critically evaluate it. Readers interested in a more comprehensive treatment of this subject matter, including coverage of other types of polygraph tests and detailed critiques of the scientific literature pertinent to the evaluation of polygraphy, will find ample coverage of these topics in other recent publications (Ben-Shakhar & Furedy, 1990; Iacono, 1991; Iacono & Lykken, 1997a; Iacono & Lykken, 1999; Iacono & Patrick, 1997; Iacono & Patrick, 1998; Lykken, 1998; Saxe, 1994).

NATURE OF THE CQT

Because there is no characteristic physiological response associated with lying, it is not possible to ask a person to answer a relevant question about an alleged misdeed (e.g., "Did you stab John?"), record nervous system reactions, and make a determination of truthfulness. Polygraphy has attempted to circumvent this problem by including, in addition to a relevant question, a comparison question that is also used to elicit physiological reactions (typically electrodermal activity, blood pressure, and respiration). The first polygraph tests used as comparisons what have been called irrelevant questions, items dealing with unimportant facts known to both examiner and suspect (e.g., "Are you sitting down?", "Is today Tuesday?"). If a suspect responds more strongly to the relevant question, guilt is indicated, while similar-sized responses to both types of questions signify innocence. This relevant/irrelevant test (RIT) format has been found wanting even by proponents of polygraphy because the irrelevant items do not provide an adequate control for the emotional impact of simply being presented with the accusatory relevant question. Because the relevant question is more likely to be physiologically arousing than the irrelevant question even to an innocent person, it is perhaps not surprising that the RIT is widely believed to be strongly biased against innocent {77} persons (Horowitz, Kircher, Honts, & Raskin, 1997). Hence, it is seldom used in forensic investigations.

The CQT represents an effort to circumvent the problems inherent to the RIT by introducing a so-called "control" question, the response to which is compared with the response to the relevant question. Control questions are intended to elicit a lie or at least concern by posing a vague question covering possible minor misdeeds from a person's past. Examples of control questions include "Have you ever hurt someone to get revenge?" or "Have you ever lied to a person in a position of authority?" The theory of the CQT assumes that it is highly likely that you have hurt someone or lied to an authority figure, so these questions will provide an example of what your physiological reactions to lies look like. CQT theory further assumes that innocent people, because they are being truthful when they answer relevant questions "no," will be relatively unaroused by these questions. Instead, they are expected to be worried about their response to the control questions, so these items should produce larger reactions. Guilty individuals, even though presumably lying to both questions, are expected to respond more strongly to the relevant question because it carries greater significance.

Hence, for the CQT to be valid, two assumptions must hold. The first requires innocent individuals to be more responsive to control than relevant questions. The second requires guilty persons to respond more intensely to relevant than control questions. The plausibility of both of these assumptions can be easily challenged.
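The comparison logic behind these two assumptions can be sketched in a few lines of Python. This is a minimal illustration, not an actual scoring procedure: the function name `cqt_verdict`, the arbitrary response units, and the `inconclusive_margin` parameter are all assumptions introduced here for the example.

```python
def cqt_verdict(relevant_response, control_response, inconclusive_margin=1.0):
    """Toy version of the CQT comparison (hypothetical; units are arbitrary).

    A larger reaction to the relevant question is read as deception, and a
    larger reaction to the control question is read as truthfulness.
    """
    difference = relevant_response - control_response
    if difference > inconclusive_margin:
        return "deception indicated"
    if difference < -inconclusive_margin:
        return "truthful"
    return "inconclusive"


# Assumption 1: an innocent person reacts more to the control question.
print(cqt_verdict(relevant_response=3.0, control_response=6.0))
# Assumption 2: a guilty person reacts more to the relevant question.
print(cqt_verdict(relevant_response=7.0, control_response=4.0))
# An innocent person alarmed by the accusation gets the same verdict as a
# guilty one, because the rule sees only the two response magnitudes.
print(cqt_verdict(relevant_response=8.0, control_response=5.0))
```

The sketch makes one point explicit: the verdict depends only on which response is larger, so the test is only as good as the two assumptions about who reacts more strongly to which question.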

A major limitation of the CQT is that, as was the case for the RIT, the comparison questions do not offer an adequate psychological control for the emotional impact of being asked the accusatory relevant question. Most innocent people are savvy enough to understand that whether they pass or fail the test will depend on how they respond to the relevant questions rather than to the trivial issues covered by the control questions. That is, the relevant questions are the most important questions on the test for both innocent and guilty people. One reason why we might be tempted to believe that lie detection works stems from the common knowledge that lying is often associated with anxiety and accompanied by physiological arousal. However, we are likely to respond similarly when confronted with a false accusation, even when truthfully denied. It is for this reason that the CQT, like the RIT, is also biased against innocent individuals.

{78} The notion that guilty suspects will necessarily respond more strongly to relevant than control questions is unlikely for several reasons. The work of Honts and colleagues (Honts, Raskin, & Kircher, 1994) indicates that the guilty can beat a CQT by artificially augmenting their responses to control questions. This can be accomplished using simple countermeasures such as curling the toes, lightly biting the tongue, or performing mental arithmetic when control questions are asked. Because information regarding how to use these types of countermeasures is readily available in bookstores, libraries, and on the World Wide Web, examiners are unlikely to know when a subject possesses countermeasure knowledge. Moreover, because the use of these procedures is invisible to the examiner, it is not possible to determine when they are employed (Honts et al., 1994).
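A toy calculation shows why augmenting control responses is enough to reverse the comparison. The response values, the `boost` factor, and both function names below are hypothetical and chosen only for illustration; they are not drawn from the studies cited.

```python
def relevant_minus_control(relevant, control):
    """Mean relevant response minus mean control response (arbitrary units)."""
    return sum(relevant) / len(relevant) - sum(control) / len(control)


def apply_countermeasure(control, boost):
    """Scale each control-question response, as a physical or mental
    countermeasure is assumed to do in this toy model."""
    return [response * boost for response in control]


# Hypothetical responses of a guilty subject across three question pairs.
relevant = [7.0, 8.0, 6.0]
control = [4.0, 5.0, 3.0]

without_cm = relevant_minus_control(relevant, control)
with_cm = relevant_minus_control(relevant, apply_countermeasure(control, boost=2.0))

print(without_cm)  # positive: relevant responses dominate, read as deceptive
print(with_cm)     # negative: control responses dominate, read as truthful
```

Nothing about the subject's reaction to the relevant questions changes in this sketch; only the comparison baseline moves, which is all the scoring rule can see.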

The work of Honts et al. shows that with less than 30 minutes of instruction, during which subjects are taught to recognize control questions and are told to use these countermeasures in response to them, 50% or more of guilty individuals subsequently defeat the CQT. Although polygraph proponents assert that typical polygraph subjects may not be able to figure out how to use countermeasures on their own, this hypothesis has not been adequately evaluated. There are studies that have addressed this issue by having introductory psychology course students, guilty by virtue of carrying out a mock crime, try to defeat the CQT, but these studies have done little to adequately motivate subjects to succeed. For instance, in one experiment where subjects were specifically taught countermeasures (Honts, Hodes, & Raskin, 1985), although students were offered double course credit if they could beat the CQT, over 20% of the subjects admitted not trying countermeasures, and it is not known how many others failed to comply with the study protocol.

A possible advantage of the typical CQT is that when questions are introduced to subjects, no distinction is made between control and relevant questions. Hence, some unsophisticated individuals may not readily understand how important it is to respond more strongly to control questions if they expect to pass the test, or that passing the test would be assured if they artificially augmented their responses to control questions. A similar advantage is not conferred by a variant of the CQT that is gaining more widespread use: the directed lie test (DLT). With the DLT, the control questions are replaced with directed lie questions that the subject is told are designed to elicit a response to {79} a known lie. Examples of directed lies include "Have you ever told even one lie?" and "Have you ever made even one mistake?" Subjects are told to think of specific incidents of lies or mistakes when answering these questions "no" to deliberately provoke a reaction to a lie. Given the transparent purpose of directed lies, even psychologically naïve guilty subjects are likely to comprehend the value of an augmented response to directed lie control questions, making this CQT variant especially vulnerable to countermeasures.

Another factor undermining the CQT's validity with guilty suspects concerns habituation to the charges covered by the relevant question. Often subjects are given lie detector tests long after a crime was committed and only after repeatedly denying the accusations covered by relevant questions. With repeated exposure to emotionally charged material, autonomic responses habituate over time. Hence, the issues raised by relevant questions on a CQT are likely to lose their potency over time, eventually eliciting weak autonomic nervous system reactions. Control questions, by contrast, represent novel issues that are unlikely to have come up prior to the polygraph test, and therefore can be expected to elicit relatively strong responses in some individuals. These factors further reduce the likelihood that guilty individuals will respond more strongly to relevant than control questions.

CQT ERROR RATE

Unfortunately, it is not possible to use the existing polygraphy literature to accurately estimate the validity of the CQT. This state of affairs persists for a variety of reasons (Iacono & Lykken, 1997a), the most important of which is that it is very difficult to conduct research on polygraph test validity that provides an accurate estimate of how well the CQT works in real life.

Laboratory Studies. Two types of validity study are possible. One relies on laboratory investigations where volunteer subjects, often undergraduate students seeking course credit, are asked to simulate a crime. These mock crime studies are too unlike real life to offer any realistic insight into how polygraph tests work in the field. For instance, the consequences of failing a test are trivial; the privacy-invading control questions are apt to be more disturbing to innocent subjects than relevant questions that carry little psychological significance; and {80} guilty individuals are tested immediately after the simulated crime, minimizing the likelihood of habituation to the issues raised in relevant questions. In addition, because the stakes are so low, it matters little if a guilty person fails a laboratory-based CQT, so there is no natural incentive to develop or learn about countermeasure strategies.

Field Studies. Field studies of real-life cases provide the best opportunity for estimating CQT error rates. However, studies based on such cases are severely limited by methodological problems that are generally ignored by CQT proponents. The main advantage of laboratory studies, that one can be certain of ground truth (who is guilty and who is innocent), is the major disadvantage of field studies. Typically ground truth is established by using confessions to identify the guilty and exculpate co-suspects in the same case. Once ground truth is established, the polygraph charts of these individuals can be blindly scored and hit rates for verified guilty and innocent individuals can be determined. About a dozen field studies have been carried out using confessions to determine ground truth (for critiques of these studies, see Iacono & Lykken, 1997a; Iacono & Lykken, 1999; Iacono & Patrick, 1987). These studies vary widely in their methodological sophistication and in reported hit rates, which range from about 50% accuracy for innocent people to nearly 100% for guilty suspects. However, as explained below, these confession-based investigations have one serious flaw in common: the ground truth criterion is not independent of the outcome of the polygraph test.

A major goal of polygraph testing is to solve crimes by extracting occasional confessions from those who fail the tests. Indeed, it is this benefit of polygraph tests that justifies their use in the absence of compelling validity data. Law enforcement agencies tend to administer polygraph tests only in certain cases (Iacono, 1991; Patrick & Iacono, 1991). For most of these cases, investigative efforts have failed to yield compelling incriminating evidence, and it is likely the case will go unsolved unless a suspect confesses. It is at this point in an investigation that suspects are likely to be asked to take a polygraph test. In a case with multiple suspects, only one of whom could be guilty, suspects are tested until one fails. This individual is then presented with the results of the test, and subsequently interrogated in an effort to extract a confession. Even if no confession is obtained, the case will be considered resolved, with the individual who failed the test identified as the suspect most likely to be guilty. If there are other {81} suspects in the same case who have not yet been polygraphed, it is unlikely that they will be tested because: (a) polygraph examiners believe the CQT to be highly accurate, so they are comfortable with the conclusion that the individual with the "deception indicated" polygraph verdict is in fact guilty; and (b) polygraph testing involves an expenditure of a limited resource that could best be applied to other cases which, unlike the one in question, are still in need of resolution.

It is against this backdrop that scientists doing field research on polygraph accuracy must select cases, relying only on those that elicited a confession from one of the suspects. Once such cases have been selected, the standard procedure is to have the polygraph charts blindly rescored by an examiner with no knowledge of the case facts, and to compare the verdicts from these rescored charts to confession verified ground truth in the particular case. Because polygraph test scoring is highly reliable (e.g., when 267 blindly rescored polygraph charts were compared to the scores of original police examiners, the interscorer reliability was .93; Patrick & Iacono, 1991), the blindly rescored charts will almost always match the scoring of the original examiners. There are a number of reasons why this field study research methodology will overestimate polygraph test accuracy.

Selecting cases because they generated a confession systematically eliminates from study all cases where the original examiner made an error. If an innocent person failed a polygraph test, this error would go undetected because there would be no confession. Absent this ground truth criterion, the case would not be included in the field validity study. If a guilty person passed the test, again there would be no confession, and the case would not be selected for study. In fact the only cases selected for study would be those where the original examiner was both correct and obtained a verifying confession. For these selected, unrepresentative cases, the original examiner must be correct 100% of the time. Because chart scoring is so reliable, the blind rescoring of the charts is likely to match the original examiner's scoring, and is thus likely to match ground truth. Hence, the validity estimate derived from the degree to which the blindly rescored charts match ground truth will yield a greatly inflated, misleading estimate of polygraph test accuracy.
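The selection effect described above can be made concrete with a small simulation. This is a sketch under stated assumptions, not a model of any actual study: each case has exactly one guilty suspect, suspects are tested until one fails, only the person who failed is interrogated, no one confesses falsely, and the test is equally accurate for guilty and innocent suspects. The function name and every parameter value are hypothetical.

```python
import random


def simulate_confession_selected_accuracy(true_accuracy, n_cases=10_000,
                                          suspects_per_case=3,
                                          confession_rate=0.5, seed=0):
    """Accuracy of original verdicts among confession-verified cases.

    Hypothetical model: one guilty suspect per case, testing stops at the
    first failed test, and only a guilty person who failed can confess.
    """
    rng = random.Random(seed)
    correct = total = 0
    for _ in range(n_cases):
        guilty_index = rng.randrange(suspects_per_case)
        verdicts = []  # (is_guilty, failed_test) for each suspect tested
        for suspect in range(suspects_per_case):
            is_guilty = suspect == guilty_index
            test_is_right = rng.random() < true_accuracy
            failed = is_guilty if test_is_right else not is_guilty
            verdicts.append((is_guilty, failed))
            if failed:
                break  # testing stops once someone fails
        last_guilty, last_failed = verdicts[-1]
        confessed = last_failed and last_guilty and rng.random() < confession_rate
        if not confessed:
            continue  # no confession, so the case never enters the study
        for is_guilty, failed in verdicts:
            total += 1
            correct += (failed == is_guilty)
    return correct / total if total else float("nan")


for accuracy in (0.6, 0.75, 0.9):
    print(accuracy, simulate_confession_selected_accuracy(accuracy))
```

Under these assumptions the function returns 1.0 for every value of `true_accuracy`, because every case containing an examiner error is filtered out before accuracy is computed. The apparent hit rate in the selected sample therefore says nothing about how the test performs on the cases that were never verified.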

A scientifically sound field validity study must remove the confound that exists when cases are selected only when a confession followed a test scored deceptive. To date, there has been only one {82} study that has attempted to circumvent this methodological problem (Patrick & Iacono, 1991). In this study, which was carried out with the Royal Canadian Mounted Police, cases were selected because no confession was obtained following the administration of polygraph tests. The researchers examined evidence subsequently collected, as police investigations of the cases continued, to identify those cases that ultimately ended with a confession. These confessions, which were not dependent on polygraph test outcome, were used to establish ground truth and were compared to the results of blindly rescored polygraph tests administered in the case. Two interesting results emerged. First, the accuracy of the CQT with confession verified innocent suspects was only 57%. Second, it was not possible to estimate accuracy for guilty subjects because once a suspect failed a polygraph test, even with no confession, the police generally stopped investigating the case, so no new evidence was developed to establish ground truth. The cases that the police continued to investigate after suspects were polygraphed were generally those that ended without strong polygraphic indications of guilt, and some of these cases turned up suspects who confessed but who did not take a polygraph test. Hence, their confessions verified that those who took the polygraph test were innocent, and it was the inclusion of cases like these that made it possible to estimate polygraph test accuracy for innocent individuals.

Friendly Tests. Defense attorneys representing defendants who have passed a CQT often ask the court to admit the test results. These types of tests have been characterized as "friendly" because the outcome of the test is protected by attorney-client privilege: results are released only if a truthful verdict is obtained. Under the circumstances, defendants have little to lose by taking the test, and the fear of being identified as guilty that exists when the police administer an adversarial CQT is absent with a friendly test. Unfortunately, field studies of the CQT are based on the outcomes of adversarial tests. There are no investigations regarding the validity of friendly tests. Given that the fear of detection likely to exist with friendly tests must be substantially lower than that for adversarial tests, it is likely that the false negative error rate associated with friendly tests is higher than that for adversarial tests.

{83} GENERAL ACCEPTANCE IN THE SCIENTIFIC COMMUNITY

Because the existing literature on the CQT cannot be used to provide a satisfactory accuracy estimate and experts are sharply divided on this issue, the opinions of unbiased scientists capable of evaluating the soundness of the CQT and the claims of polygraph proponents are of considerable interest. To obtain this information, David Lykken and I (Iacono & Lykken, 1997b) carried out two surveys of the members of two different scientific organizations varying in the type of expertise and perspective they bring to the evaluation of polygraphy. The first survey was of members of the Society for Psychophysiological Research (SPR; 91% response rate) and the second was of Fellows of the American Psychological Association's Division of General Psychology (APA-DGP; 74% response rate).

Two earlier surveys, neither of which was published in a scientific journal, were carried out with SPR members. Responses to a single question, repeated across both surveys, have been used to argue that scientists have a favorable view of forensic polygraphy because about 61% of respondents to both surveys agreed that "polygraph test interpretation" constituted "a useful diagnostic tool when considered with other available information." We found it difficult to interpret this survey result because the question dealt neither with the CQT nor its use in court. In addition, the question does not specifically address validity, and no questions were asked regarding either the soundness of the psychological foundation of the CQT or the claims of high accuracy made by the pro-polygraph community. To remedy these deficiencies, we made clear to respondents that we were querying them about the CQT and included a description of the test and its theoretical basis from one of the CQT's leading proponents (Raskin, 1986). We also asked a variety of questions relevant to the scientific controversies surrounding the CQT.

The results of our survey indicated that the members of both scientific organizations had very similar opinions about the CQT. Only about a third of those surveyed believed the CQT was scientifically sound, and only about a quarter thought polygraph evidence should be admissible in court. Few opined that the CQT was a standardized test (20%) or objective (10%). The great majority held that the CQT could be defeated by countermeasures that were easily learned, and they {84} expressed substantial skepticism about the validity of friendly CQTs and the accuracy claims of polygraph proponents. When members of both organizations were divided into subgroups according to their level of self-professed expertise on CQT validity, there was little difference in their opinions as a function of how informed they were on the topic, with the vast majority of both more and less informed respondents from both organizations possessing a strongly negative view of the CQT.

CONCLUSION

Although the CQT may be useful as an investigative aid and tool to induce confessions, it does not pass muster as a scientifically credible test. CQT theory is based on naive, implausible assumptions indicating (a) that it is biased against innocent individuals and (b) that it can be beaten simply by artificially augmenting responses to control questions. Although it is not possible to adequately assess the error rate of the CQT, both of these conclusions are supported by published research findings in the best social science journals (Honts et al., 1994; Horvath, 1977; Kleinmuntz & Szucko, 1984; Patrick & Iacono, 1991). Although defense attorneys often attempt to have the results of friendly CQTs admitted as evidence in court, there is no evidence supporting their validity and ample reason to doubt it. Members of scientific organizations who have the requisite background to evaluate the CQT are overwhelmingly skeptical of the claims made by polygraph proponents.

AUTHOR'S NOTES

William G. Iacono is a Distinguished McKnight University Professor and Director, Clinical Science and Psychopathology Research Training Program at the University of Minnesota. He is also a Past President, Society for Psychophysiological Research and recipient of Early Career Awards for Distinguished Scientific Contributions from the Society for Psychophysiological Research and the American Psychological Association. Author of over 150 publications, he has been a consultant regarding polygraph testing to the CIA, Department of Defense Polygraph Institute, the U.S. Congress' Office of Technology Assessment, and the Clinton Administration's Joint Security Commission. An expert witness in over two dozen trials in state and federal courts, his work was cited in the U.S. Supreme Court's decision upholding the ban of polygraph evidence in military trials (U.S. v. Scheffer). His research interests include psychophysiology, psychopathology, and behavior genetics.

{85} Address correspondence to: William G. Iacono, PhD, Department of Psychology, University of Minnesota, 75 East River Road, Minneapolis, MN 55455 (E-mail: wiacono@tfs.psych.umn.edu).

REFERENCES

Ben-Shakhar, G., & Furedy, J. J. (1990). Theories and applications in the detection of deception. New York: Springer-Verlag.

Honts, C. R., Hodes, R. L., & Raskin, D. C. (1985). Effects of physical countermeasures on the physiological detection of deception. Journal of Applied Psychology, 70, 177-187.

Honts, C. R., Raskin, D. C., & Kircher, J. C. (1994). Mental and physical countermeasures reduce the accuracy of polygraph tests. Journal of Applied Psychology, 79, 252-259.

Horowitz, S. W., Kircher, J. C., Honts, C. R., & Raskin, D. C. (1997). The role of comparison questions in physiological detection of deception. Psychophysiology, 34, 108-115.

Horvath, F. (1977). The effect of selected variables on the interpretation of polygraph records. Journal of Applied Psychology, 62, 127-136.

Iacono, W. G. (1991). Can we determine the accuracy of polygraph tests? In J. R. Jennings, P. K. Ackles, & M. G. H. Coles (Eds.), Advances in psychophysiology (Vol. 4, pp. 201-207). London: Jessica Kingsley Publishers.

Iacono, W. G., & Lykken, D. T. (1997a). The scientific status of research on polygraph techniques: The case against polygraph tests. In D. L. Faigman, D. Kaye, M. J. Saks, & J. Sanders (Eds.), Modern scientific evidence: The law and science of expert testimony (pp. 582-618). St. Paul, MN: West Publishing.

Iacono, W. G., & Lykken, D. T. (1997b). The validity of the lie detector: Two surveys of scientific opinion. Journal of Applied Psychology, 82, 426-433.

Iacono, W. G., & Lykken, D. T. (1999). The scientific status of research on polygraph techniques: The case against polygraph tests. In D. L. Faigman, D. H. Kaye, M. J. Saks, & J. Sanders (Eds.), Modern scientific evidence: The law and science of expert testimony. 1999 Pocket Part. (Vol. 1, pp. 174-184). St. Paul, MN: West.

Iacono, W. G., & Patrick, C. J. (1987). What psychologists should know about lie detection. In A. K. Hess & I. B. Weiner (Eds.), Handbook of forensic psychology. New York: John Wiley.

Iacono, W. G., & Patrick, C. J. (1997). Polygraphy and integrity testing. In R. Rogers (Ed.), Clinical assessment of malingering and deception (2nd ed., pp. 252-281). New York: Guilford.

Iacono, W. G., & Patrick, C. J. (1998). Polygraph ("lie detector") testing: The state of the art. In A. K. Hess & I. B. Weiner (Eds.), Handbook of forensic psychology (2nd ed., pp. 440-473). New York: John Wiley.

Kleinmuntz, B., & Szucko, J. J. (1984). A field study of the fallibility of polygraphic lie detection. Nature, 308, 449-450.

Lykken, D. T. (1998). A tremor in the blood: Uses and abuses of the lie detector. (2nd ed.). New York: Plenum.

Patrick, C. J., & Iacono, W. G. (1991). Validity of the control question polygraph test: The problem of sampling bias. Journal of Applied Psychology, 76, 229-238.

{86}

Raskin, D. (1986). The polygraph in 1986: Scientific, professional and legal issues surrounding applications and acceptance of polygraph evidence. Utah Law Review, 1, 29-74.

Saxe, L. (1994). Detection of deception: Polygraph and integrity tests. Current Directions in Psychological Science, 3, 69-73.
