Journal of Applied Research in Memory and Cognition 1 (2012) 171–178
Age-progressed images may harm recognition of missing children by increasing the number of plausible targets
Steve D. Charman∗, Rolando N. Carol
Department of Psychology, Florida International University, United States
Article info
Article history: Received 7 February 2012; Received in revised form 10 July 2012; Accepted 11 July 2012; Available online 20 July 2012
Keywords: Age progression; Missing children; Multidimensional face space
Abstract
Age progression, often used to help find missing children, is a technique whereby an outdated photograph of an individual is used to generate an updated image of that individual. Despite its importance, few empirical psychological studies have tested the utility of age progression. The current studies had two purposes: (1) to empirically test the effectiveness of a computerized age-progression system (APRILage); and (2) to examine how the presentation of an age-progressed image changes observers’ decision-making strategies. Presenting participants with an age-progressed image in addition to an outdated image resulted in fewer target recognitions and more mistaken non-target ‘recognitions’ (Study 1; N = 135), led participants to assign fewer confidence points to the actual target and more confidence points to non-targets (Study 2; N = 231), and increased, rather than decreased, the number of plausible targets (Study 3; N = 88). Results are explained within a multidimensional face-space conceptualization of facial recognition.
1. Introduction When attempting to locate a child who has been missing for a substantial period of time, it is beneficial to predict accurately that child’s current appearance. This is purportedly accomplished via a forensic technique known as “age progression,” in which an old photograph of the missing person is used to produce an image that estimates that person’s current appearance (Allen, 1990; McQueen, 1989). It is claimed that age progression has helped to recover one out of every seven children reported missing to the National Center for Missing and Exploited Children (Thompson & Black, 2006), and that in almost every case in which age progression is used, new leads are generated (Office of Juvenile Justice and Delinquency Prevention, 2006). Despite these anecdotal reports, there has been little research devoted to developing and testing age-progression systems. Most data come from non-psychologists – such as computer vision scientists – whose goals are to produce algorithms that predict how faces age (e.g., Gibson, Scandrett, Solomon, Maylin, & Wilkinson, 2009; Ramanathan & Chellappa, 2006; Scandrett, Solomon, & Gibson, 2007). The accuracy of their algorithms is assessed via the mathematical similarity between the age-progressed image and the target individual. Although well-suited for their purposes, this measure is insufficient to establish the effectiveness of age-progression techniques in the real world, which requires assessment by human subjects.
∗ Corresponding author at: Department of Psychology, DM 256, Florida International University, Miami, FL 33199, United States. E-mail address: charmans@fiu.edu (S.D. Charman). 2211-3681/$ – see front matter http://dx.doi.org/10.1016/j.jarmac.2012.07.008
Unfortunately, psychological research on the effects of age-progressed images on recognition is extremely sparse: Currently only two published papers have reported studies that have examined this process empirically (Lampinen, Arnal, Adams, Courtney, & Hicks, 2012; Lampinen, Miller, & Dehon, 2012). The studies used one of two basic methodologies. In studies of prospective person memory, participants were presented with either an outdated image of a child, a current image of the child, or an age-progressed image of a child (which was either accompanied by the outdated image or not, depending on the specific study) and were exposed to a series of children and asked to indicate whether any of them were the target. In studies of retrospective person memory, participants viewed the series of children first, were then shown the outdated/age-progressed/current image, and finally were asked to recall whether they recognized the target from the earlier series of children. Results were similar across studies: an age-progressed image, either by itself or accompanied by an outdated image, failed to increase target recognition above and beyond that obtained from an outdated picture alone. Null results are, of course, difficult to interpret. It may be that age-progressed images truly have no effect on observers’ abilities to recognize a target, or it may be that the specific paradigm used was not sensitive enough to detect those effects.1 In fact, both
1 This is not a criticism of Lampinen and colleagues’ methodology; to the contrary, their studies were very well-designed to be reflective of real world situations in which people may have to recognize a missing child. Nonetheless, without any prior empirical guidance as to the conditions that promote or inhibit age-progression effects, it is possible that the methodology they chose happened to minimize these effects.
studies contain data suggesting that age-progressed images may, in fact, have an effect on recognition: They may actually harm it. Lampinen, Arnal, et al. (2012) showed that if they combined data across their two separate studies, there was a marginally significant trend for age-progressed images to harm recognition. Lampinen, Miller, et al. (2012) showed no significant effect of age-progressed images across three studies, but a significant decrease in target recognition in a fourth. It remains to be seen whether these hints at potentially deleterious effects of age progression are reliable. The goal of our first study is thus to provide a more sensitive test of the effects of age progression on target recognition. This is accomplished in three ways. First, instead of using forensic artists to produce age-progressed images, we used computerized software (APRILage) for three reasons: (a) unlike a given computerized software system, forensic artists are not standardized, variability in techniques across forensic artists is likely large, and thus the generalizability of such findings would be correspondingly limited; (b) people’s ability to predict outcomes is inferior, almost without exception, to computer models’ abilities to do the same (Dawes, Faust, & Meehl, 1989), suggesting that computerized age-progression software offers perhaps the best chance of accurately predicting a target’s face; (c) APRILage software is available for purchase by police departments (and is currently used by multiple police departments), its website specifically claims that it can be used to “assist in finding lost and missing children” (www.aprilage.com), and a copy of its software has recently been sent to the National Center for Missing and Exploited Children for evaluation and possible adoption (Hogan, personal communication, March 1, 2011). Second, it is possible that age-progressed images are beneficial, but only if directly in view at the time of the judgment. 
The methodology of Lampinen, Arnal, et al. (2012) and Lampinen, Miller, et al. (2012) prohibited participants from viewing the target images at the same time they looked at photographs of children, which may have decreased the usefulness of the age-progressed images.2 Consequently, we allowed participants to view the target images while making their judgments. Third, Lampinen, Arnal, et al. (2012) and Experiments 1 and 2 of Lampinen, Miller, et al. (2012) used children who had undergone appearance change over the course of five years (from age 7 to age 12) as targets.3 It is possible that this five-year difference did not produce sufficient appearance change to allow age-progressed images the chance to improve recognition. Thus, we increased the degree of appearance change (to a 15-year difference: from age 6 to age 21) in order to ensure that there was enough appearance change to allow age-progressed images a better opportunity to produce a beneficial effect. 1.1. The goal of the current studies Study 1 was designed to provide a sensitive test of APRIL age-progression software. However, it should be noted that this is not the primary goal of our manuscript as a whole. To anticipate Studies
2 Of course, in the real world, people usually are not looking at an age-progressed image at the same time as making judgments about whether they recognize someone as being a missing child (although it is certainly possible to envision scenarios in which that would happen). Nonetheless, our goal was to examine whether there are any conditions under which age-progressed images could help. If age-progressed images fail to improve recognition rates even under conditions that maximize their potential effectiveness, then arguments against the use of that technique are made even stronger. 3 Experiments 3 and 4 of Lampinen, Miller, et al. (2012) did use a target who had undergone appearance change of approximately 18 years. However, these studies used only the same single target, and thus it is unknown whether their findings generalize across other targets.
2 and 3, our primary research question addresses how the presence of an age-progressed image changes observers’ decision-making strategies when attempting to recognize a target. Study 1, in this light, is useful in exhibiting an age-progression effect; Studies 2 and 3 will examine the psychology that underlies this effect. It is this psychology in which we are most interested. 2. Study 1 For age-progressed images to have forensic value, they must increase target recognition rates above and beyond recognition rates obtained while viewing only an old, outdated target image. Thus, Study 1 examined whether the addition of an age-progressed image to an outdated image of a target affected correct recognition rates. 2.1. Method 2.1.1. Participants Participants were 135 college undergraduates (64% female4 ) from a large southern university. 2.1.2. Stimuli In order to increase our generalizability across specific facial stimuli (see Wells & Windschitl, 1999), five target individuals were used. A separate group of undergraduate psychology students provided pictures of themselves at or around age 6 (the outdated images) in exchange for research credit. A photograph was taken of each person who contributed an outdated image (around 21 years of age; the current images). Finally, all outdated images were processed using APRIL’s online age-progression system (www.ageme.com) and age-progressed to age 21 (the age-progressed images). According to APRIL’s official website, the photographs that are used in age-progression should meet certain criteria. 
Specifically: (1) the subject must be looking straight at the camera; (2) no hair, facial or otherwise, should obstruct any part of the face; (3) the subject should not smile or squint; (4) only color photographs should be used; (5) the subject should be in sharp focus; (6) lighting must be flat and even, with no shadows on the subject’s face; (7) the background must be plain (i.e., no patterns or designs) and it must contrast with the subject’s hair color; and (8) the photograph cannot be over- or under-exposed. APRIL also requires the user to input certain information about the subject to be age-progressed, including his/her age at the time the photograph was taken, gender, ethnicity (Caucasian, African, Asian, Latino-Hispanic, South-Asian, or Other), smoking habits (smoker or non-smoker), and weight (normal, heavy weight, overweight, or obese). Although APRIL is proprietary software (and therefore its algorithms are kept secret), the fact that this information is required leads us to assume that its age-progression algorithms take these properties into account. Other than the choice of photograph and the above-listed information, the age-progression process is fully automated and proceeds without additional user input. The five clearest outdated images that met all of the above-listed criteria (2 Hispanic females, 2 Hispanic males, 1 White female) were age-progressed using APRIL age-progression software. For each of the five targets an array of individuals was constructed consisting of the target’s current image and 3 photographs of people
4 Data concerning age and race were not collected in Study 1; however, the participant pool was the same as that for Studies 2 and 3, and the make-up of participants should be approximately equal to that of those studies (i.e., Mage is approximately 20 years; approximately 75% Hispanic).
Image condition            Outdated    Outdated + age-progressed    Age-progressed

Target recognitions
  Target 1                 91.1        75.6                         55.6
  Target 2                 73.3        44.4                         33.3
  Target 3                  2.2         8.9                          8.9
  Target 4                 88.9        62.2                         40.0
  Target 5                 24.4        40.0                         26.7
  Average                  56.0        46.2                         32.9

Non-target ‘recognitions’
  Target 1                  6.7        22.2                         22.2
  Target 2                 17.8        35.6                         46.7
  Target 3                 13.3        35.6                         42.2
  Target 4                  4.4        17.8                         15.6
  Target 5                 22.2        11.1                         20.0
  Average                  12.9        24.5                         29.3

‘Not there’ responses
  Target 1                  2.2         2.2                         22.2
  Target 2                  8.9        20.0                         20.0
  Target 3                 84.4        55.6                         48.9
  Target 4                  6.7        20.0                         44.4
  Target 5                 53.3        48.9                         53.3
  Average                  31.1        29.3                         37.8
that matched the target’s sex, race, approximate current age, and hair color. 2.1.3. Procedure All stimuli were presented on a computer (19 in. monitor; 1280 × 1024 resolution). Participants were assigned randomly to one of three image conditions in which they viewed either (1) the outdated target image, (2) the age-progressed target image, or (3) both the outdated and the age-progressed target images. Below their assigned image(s), participants also viewed the four array members’ faces simultaneously (the current target image was always included in the array) and were asked either to identify the target from the array or report that the target was not in the array. The position of the target within each simultaneous array was randomized for each target. This procedure was repeated for each of the five targets. Participants were in the same experimental condition for all targets. See Supplementary Materials for all target images (outdated and age-progressed) and their corresponding arrays. 2.2. Results Table 1 displays the percentages of participants who recognized the target, mistakenly ‘recognized’ a non-target, and said that the target was not in the array as a function of the individual targets. However, because our focus is not on any individual target, we did not perform statistical analyses at this level. Instead, we calculated three scores for each participant – a target recognition score, a false recognition score, and a ‘not there’ score – which represented the total number (out of five) of correct target recognitions, incorrect non-target ‘recognitions,’ and ‘not there’ responses, respectively. Analyses are performed on these scores. Results can be seen in Fig. 1. All tests are two-tailed. Because the critical question is whether the addition of an age-progressed image to an outdated image improves target recognition scores above and beyond the presence of an outdated image
Table 1 Percentage of participants in Study 1 who made correct target recognitions, mistaken non-target ‘recognitions,’ and ‘not there’ responses as a function of individual target and image condition.
Fig. 1. Correct target recognition score, mistaken non-target recognition score, and ‘not there’ score (each out of 5) as a function of image condition (outdated image; outdated + age-progressed images; age-progressed image) for Study 1. Error bars indicate standard errors.
alone, we first compared scores of participants in the outdated image condition to scores of participants in the outdated + age-progressed image condition. These analyses revealed that the addition of an age-progressed image significantly harmed target recognition scores (Ms = 2.8 and 2.3), t(88) = 2.36, p = .02, d = .50, and significantly inflated false recognition scores (Ms = .6 and 1.2), t(88) = 2.91, p = .005, d = .62. The addition of an age-progressed image had no significant effect on ‘not there’ scores (Ms = 1.6 and 1.5, for the outdated image and outdated + age-progressed image conditions, respectively), t(88) = .45, p = .65, d = .10. Participants who viewed only an age-progressed image performed even worse; they had a significantly lower target recognition score (M = 1.6), t(88) = 3.03, p = .003, d = .65, and a marginally higher ‘not there’ score (M = 1.9), t(88) = 1.83, p = .07, d = .39, than participants who viewed both images. They did not have a significantly different false recognition score (M = 1.5), t(88) = 1.12, p = .27, d = .24. 2.3. Discussion Overall, the addition of an age-progressed image to an outdated image harmed people’s ability to identify the target, and increased their likelihood of mistakenly ‘recognizing’ a non-target. Note that the age-progressed images were not simply decreasing the likelihood of recognizing anyone (since they did not increase ‘not there’ scores), but they seemed to be systematically leading people away from recognizing the target (and toward mistakenly ‘recognizing’ non-targets). This is the only study that we are aware of to show, using multiple targets, that age-progressed images may actually harm one’s ability to recognize a target. This result is intriguing and counterintuitive: If the age-progressed image was a poor representation of the target, participants who viewed both an outdated and an age-progressed image could have simply ignored it and relied solely upon the outdated image. 
But they clearly did not: In fact, they performed worse than participants who viewed only the outdated image. This leaves us with two main questions. First, why are the age-progressed images produced by APRIL age-progression software so (apparently) poor? Unfortunately, because the software is proprietary, we cannot investigate its algorithms to uncover why it produced apparently poor age-progressed images. However, this question is not particularly interesting from a psychological point of view anyway, and is better addressed by computer scientists or others who create age-progression software. The second question – and the much more interesting one from a psychological point of view – is the following: Given that age-progressed images are poor (regardless of why they are poor),
why would their presence lead people away from the actual target despite the presence of the outdated image? In other words, why do participants in the outdated + age-progressed image condition perform worse than participants in the outdated image condition? This puts the focus of Studies 2 and 3 not on the effectiveness of age-progression techniques per se (although our results speak to this issue), but squarely on the decision-making processes of observers, and how they change as a function of the addition of an age-progressed image. This focus on observers’ decision-making process is important, because our finding that the addition of an age-progressed image not only failed to improve target recognition, but actually harmed target recognition, is not easily dismissed as solely a product of the poor performance of APRIL software. Certainly it suggests that APRIL produces poor age-progressed images, but note that even the worst age-progression system one could imagine would be insufficient to fully explain our results. The reason is because even if the age-progressed image was useless, observers could have simply dismissed it from consideration; after all, they still had access to the outdated image. Our findings thus suggest that the detrimental effect of age-progressed images is at least partly a psychological effect: The addition of an age-progressed image somehow changes observers’ decision-making strategies, and does so specifically in a suboptimal way. How might these strategies change? We analyze this problem theoretically by borrowing from models of perceived facial similarity. Such models conceptualize the set of all faces as residing within a multi-dimensional “face space,” with each dimensional axis representing a different physiognomic facial feature used to encode faces (e.g., Tredoux, 2002; Valentine, 1991). The perceived similarity between any two faces is represented by the distance between them in this multi-dimensional space. 
Age progression can thus be thought of as a process in which an outdated image is altered to produce an age-progressed image that resides at a different point in face space. If age-progression were effective, this new, age-progressed face would be closer in face space to the target’s current face than the old, outdated face. Drawing from signal detection models, we postulate that an observer will decide that two facial images represent the same person if the perceived similarity of the faces surpasses a decision criterion. We can think of this decision criterion in relation to the aforementioned face space models: Because face space represents similarity in multiple dimensions, this criterion would effectively be a shell that bounds a region in face space (this bounded region is henceforth called the critical region). A face that falls within the critical region of another face (i.e., that is sufficiently similar to that face) is considered to be of the same person; otherwise, it is considered to be of a different person. The addition of an age-progressed image to an outdated image may harm target recognition for one of at least three reasons. A two-dimensional analog to each of these three possibilities is shown in Fig. 2. First, an age-progressed image may lead people to ignore the outdated image and rely entirely on the age-progressed image in making their recognition judgments. If this age-progressed image is a poorer likeness of the target than the outdated image (i.e., it resides at a point in face space further from the target’s current face than the outdated image), target recognition will decrease. We call this possibility the shifted critical region hypothesis. Second, people may reason that the target must be sufficiently similar to both the outdated and age-progressed images. 
The age-progressed image will thus narrow the critical region (since the target face will be perceived as necessarily existing in the overlap of the individual critical regions of the two images), reducing the likelihood of correctly recognizing the target. We call this the contracted critical region hypothesis. Third, people may reason that the target must match either the outdated image or the age-progressed image (or both). The age-progressed image will therefore increase the critical
region, increasing the number of competing non-target faces that fall within it, making it more difficult to identify the specific target. We call this the expanded critical region hypothesis. Results from Study 1 cast doubt on the shifted critical region hypothesis since, if true, participants in the outdated + age-progressed condition would have simply ignored the outdated image, meaning that their target recognition scores and false recognition scores should have been equivalent to those of participants in the age-progressed condition. Clearly, however, these scores differed. Furthermore, results also cast doubt on the contracted critical region hypothesis since this hypothesis cannot explain why the age-progressed image increased non-target identifications. The data are, however, consistent with the expanded critical region hypothesis. Specifically, the addition of an age-progressed image increased the critical region, which increased the number of non-target faces that fell within it. Because more faces were now competing with the target’s face for recognition, this resulted in a lower target recognition score and an inflated non-target recognition score. However, this reasoning is somewhat indirect because any given participant in Study 1 provided only a single response per target – they either identified an array member as the target or not. Thus we cannot definitively support the expanded critical region hypothesis’s more specific prediction that an age-progressed image shifts a specific observer’s confidence from the target to the non-targets. A more direct test of this specific hypothesis is to examine whether, for a given participant, the presence of an age-progressed image results in decreased confidence in the identity of the actual target, and whether that is accompanied by increased confidence in the identity of non-targets or increased confidence that the target is not in the array at all. This is the goal of Study 2. 3.
Study 2 Study 2 replicated Study 1 using a methodology whereby participants assigned points among the various array members (and a “not there” option) according to their confidence that each member was the target (or no one was the target). If the presence of an age-progressed image reduces target recognition by contracting observers’ critical region, its presence should lead to a lower probability of the target being captured by the critical region, and to confidence points being shifted from the target (in the outdated image condition) to ‘not there’ responses (in the outdated + age-progressed images condition). On the other hand, if the presence of an age-progressed image reduces target recognition by expanding the critical region, its presence should lead to more non-target faces being captured by the critical region, and to confidence points being spread from the target (in the outdated image condition) to the non-targets (in the outdated image + age-progressed image condition). 3.1. Method 3.1.1. Participants Participants were 231 college undergraduates (67% female, 77% Hispanic; Mage = 20) from a large southern university. 3.1.2. Stimuli Stimuli were the same as in Study 1. 3.1.3. Procedure The procedure for Study 2 was identical to that of Study 1, except for the dependent measure. For each target, instead of making an identification (or not) of the target from its corresponding array, participants assigned 100 total points to the four array members and a ‘not there’ option, according to their confidence that the given
array member was the target (or that the target was not in the array). This procedure was repeated for each of the five targets. 3.2. Results For each target array we calculated the mean number of confidence points assigned to (1) the target, (2) all of the non-targets, and (3) the ‘not there’ option. Results for each target are shown in Table 2; however, because we were not concerned with individual targets, average confidence points were collapsed across target for analysis purposes; these are displayed in Fig. 3.
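The scoring just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the allocations below are invented for a single hypothetical participant, and the function name `collapse_across_targets` is ours, not part of the original analysis.

```python
from statistics import mean

# Hypothetical allocations for ONE participant (not data from the study):
# for each of the five target arrays, 100 points are split among the target,
# the three non-targets, and the 'not there' option.
allocations = [
    {"target": 60, "non_targets": [10, 10, 5], "not_there": 15},
    {"target": 40, "non_targets": [20, 10, 10], "not_there": 20},
    {"target": 10, "non_targets": [30, 20, 10], "not_there": 30},
    {"target": 70, "non_targets": [10, 5, 5], "not_there": 10},
    {"target": 20, "non_targets": [25, 25, 10], "not_there": 20},
]


def collapse_across_targets(trials):
    """Average one participant's confidence points across target arrays."""
    for trial in trials:
        total = trial["target"] + sum(trial["non_targets"]) + trial["not_there"]
        if total != 100:
            raise ValueError("each array must receive exactly 100 points")
    return {
        # Points given to the actual target, averaged over arrays.
        "target": mean(t["target"] for t in trials),
        # Points given to ALL non-targets combined, averaged over arrays.
        "non_targets": mean(sum(t["non_targets"]) for t in trials),
        # Points given to the 'not there' option, averaged over arrays.
        "not_there": mean(t["not_there"] for t in trials),
    }


scores = collapse_across_targets(allocations)
print(scores)
```

Because every array receives exactly 100 points, the three collapsed scores for a participant also sum to 100, so a drop in target points must reappear as non-target points, ‘not there’ points, or both.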
Fig. 2. Two-dimensional analog of multidimensional face space. Dimensions represent physiognomic facial features. Circles represent a perceiver’s decision criterion for a target face (which would be represented as a point inside each circle). The area bounded by the circle represents a perceiver’s critical region – the region within face space in which a new face would be considered the same person as the target face. The top panel shows a hypothetical critical region surrounding an outdated image. The bottom panel represents three different possibilities for how the critical region might change with the addition of an age-progressed image (represented by the circle in the upper-right quadrant). The shaded region is the new critical region.
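The three possibilities in Fig. 2 can also be expressed as a toy calculation. The sketch below is a simplification under loud assumptions: the coordinates, the radius, and the idea of a circular critical region with a fixed Euclidean cutoff are all hypothetical choices made for illustration, not estimates from the studies.

```python
from math import dist

# All values below are hypothetical and chosen only to illustrate the logic.
RADIUS = 1.2                   # decision criterion (radius of a critical region)
OUTDATED = (0.0, 0.0)          # outdated image in a 2-D face space
AGE_PROGRESSED = (1.5, 1.0)    # age-progressed image (upper-right, as in Fig. 2)

FACES = {
    "target": (0.8, -0.3),     # target's current face
    "A": (-0.5, 0.6),          # non-target faces
    "B": (2.0, 1.5),
    "C": (1.8, 0.2),
    "D": (0.7, 0.6),
    "E": (-2.0, -2.0),
}


def near(face, image):
    """True if the face lies inside the critical region around the image."""
    return dist(face, image) <= RADIUS


RULES = {
    "outdated only": lambda f: near(f, OUTDATED),
    # Shifted: observers rely on the age-progressed image alone.
    "shifted": lambda f: near(f, AGE_PROGRESSED),
    # Contracted: a face must match BOTH images (intersection).
    "contracted": lambda f: near(f, OUTDATED) and near(f, AGE_PROGRESSED),
    # Expanded: a face may match EITHER image (union).
    "expanded": lambda f: near(f, OUTDATED) or near(f, AGE_PROGRESSED),
}


def captured(rule):
    """Names of the faces that fall inside the critical region for a rule."""
    return sorted(name for name, xy in FACES.items() if RULES[rule](xy))


for rule in RULES:
    print(f"{rule:14s} {captured(rule)}")
```

With these made-up coordinates the target stays inside the expanded region but must compete with four non-targets instead of two, whereas the shifted and contracted regions exclude it altogether. Different coordinates would give different counts; only the set logic (single region, intersection, union) is meant to carry over.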
Table 2 Mean confidence points (SDs in parentheses) assigned to target, non-targets, and ‘not there’ responses in Study 2 as a function of individual target and image condition.

Image condition                Outdated       Outdated + age-progressed    Age-progressed

Target confidence points
  Target 1                     64.0 (27.4)    60.4 (24.5)                  53.7 (26.6)
  Target 2                     52.9 (28.7)    46.2 (29.5)                  25.9 (26.2)
  Target 3                     13.6 (18.9)    13.6 (17.0)                  17.0 (20.1)
  Target 4                     57.9 (29.9)    48.0 (31.8)                  26.7 (28.1)
  Target 5                     36.4 (27.3)    30.3 (27.8)                  21.6 (24.3)
  Average                      44.9 (13.8)    39.7 (15.2)                  29.0 (13.0)

Non-target confidence points
  Target 1                     32.4 (25.7)    36.1 (22.9)                  40.8 (40.8)
  Target 2                     35.5 (27.0)    40.1 (26.5)                  64.2 (28.5)
  Target 3                     38.7 (30.7)    47.4 (29.5)                  56.1 (28.2)
  Target 4                     22.5 (24.1)    32.0 (24.8)                  50.8 (30.5)
  Target 5                     37.4 (25.6)    40.3 (26.5)                  42.7 (28.5)
  Average                      33.1 (15.0)    39.2 (14.4)                  50.9 (14.0)

‘Not there’ confidence points
  Target 1                      3.7 (4.9)      3.5 (6.6)                    5.5 (6.8)
  Target 2                     11.6 (17.7)    13.7 (23.9)                   9.9 (16.6)
  Target 3                     47.7 (35.4)    39.0 (33.6)                  27.0 (29.2)
  Target 4                     19.6 (23.1)    20.1 (24.5)                  22.5 (23.2)
  Target 5                     26.2 (23.6)    29.4 (28.0)                  35.7 (29.1)
  Average                      22.0 (12.3)    21.1 (11.5)                  20.1 (12.2)
Fig. 3. Confidence points assigned to target, non-targets, and ‘not there’ option as a function of image condition (outdated image; outdated + age-progressed images; age-progressed image) for Study 2. Error bars indicate standard errors.
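The effect sizes reported alongside the t statistics in this article can be roughly cross-checked with the standard conversion for independent groups, d = t × sqrt(1/n1 + 1/n2). The sketch below assumes group sizes that the article does not report: 45 per condition in Study 1 (135 participants over three conditions, consistent with df = 88) and 74 versus 75 in Study 2 (consistent with df = 147). The function name `d_from_t` is ours, and small discrepancies with the published values are expected because the published t values are rounded.

```python
from math import sqrt


def d_from_t(t, n1, n2):
    """Approximate Cohen's d from an independent-samples t statistic."""
    return t * sqrt(1 / n1 + 1 / n2)


# Study 1, target recognition: t(88) = 2.36, reported d = .50.
# Group sizes of 45 and 45 are an assumption (not reported in the article).
print(round(d_from_t(2.36, 45, 45), 2))

# Study 2, target confidence points: t(147) = 2.18, reported d = .36.
# Group sizes of 74 and 75 are an assumption (not reported in the article).
print(round(d_from_t(2.18, 74, 75), 2))
```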
Planned comparisons indicated that, compared to participants who viewed only the outdated image, participants who viewed both the outdated and the age-progressed image assigned fewer confidence points to the target (Ms = 44.9 and 39.7), t(147) = 2.18, p = .03, d = .36, and more confidence points to the non-targets (Ms = 33.1 and 39.2), t(147) = 2.49, p = .01, d = .41. ‘Not there’ confidence points did not differ significantly across these conditions (Ms = 22.0 and 21.1, for the outdated and outdated + age-progressed image conditions, respectively), t(147) = .44, p = .66, d = .07. Participants who viewed only the age-progressed image performed worst of all. Compared to participants who viewed both the age-progressed and outdated images, they assigned fewer confidence points to the target (M = 29.0), t(146) = 4.62, p < .001, d = .76, and more confidence points to the non-targets (M = 50.9), t(146) = 5.03, p < .001, d = .83. They did not assign a significantly different number of confidence points to the “not there” option (M = 20.1), t(146) = .53, p = .60, d = .09. 3.3. Discussion Results replicate Study 1 using a different dependent measure: far from being beneficial, age-progressed images were in fact detrimental, leading people to be less confident that the actual target was the person depicted. This decreased confidence in the
actual target was accompanied by increased confidence in the non-targets. These results are consistent with the expanded critical region hypothesis (which predicted that the new critical region would be larger, thus capturing more fillers), and inconsistent with the contracted critical region hypothesis (which predicted that the new critical region would be smaller, thus capturing fewer fillers). In addition, the finding that participants who viewed only the age-progressed image performed even worse than participants who viewed both images is inconsistent with the shifted critical region hypothesis. 4. Study 3 Although results from Studies 1 and 2 are consistent with the expanded critical region hypothesis, they are still not quite direct tests of the proposition that the addition of an age-progressed image increases the critical region, as they were not designed to measure the size of the critical region per se, but only the results of a putative change in critical region size. Study 3 tests the expanded critical region hypothesis directly. If correct, then the critical region of people who view both the outdated and age-progressed target images should capture more faces than the critical region of people who view only the outdated image. 4.1. Method 4.1.1. Participants Participants were 88 undergraduate students (65% female; 74% Hispanic; Mage = 20) from a large southern university. 4.1.2. Stimuli Target photographs (outdated and age-progressed) were the same as in Studies 1 and 2. However, instead of being shown a simultaneous array of four faces, participants were shown a sequential series of twelve faces. These faces matched the target in terms of sex, race, hair color, and approximate current age. The actual targets were not among these photographs. 4.1.3. Procedure All stimuli were presented on a computer in a lab setting. 
Participants were assigned randomly to view either the outdated target image or both the outdated and age-progressed target images for each of five targets. While viewing their respective target image(s), participants were shown twelve non-target faces sequentially for each of the five targets, and indicated whether each one could “plausibly be the target.”

4.2. Results and discussion

Results are collapsed across target; our main dependent measure is the total number of faces out of 60 (5 targets × 12 faces/target) that participants believed could plausibly be the target. Consistent with the expanded critical region hypothesis, participants who viewed both the outdated and age-progressed images thought that more faces could plausibly be the target (M = 19.8) than participants who viewed only the outdated images (M = 16.5), t(86) = 2.00, p = .05, d = .43. That is, the addition of an age-progressed image to an outdated image appears to make the critical region larger, leading observers to perceive more faces as being plausible matches to the target.

5. General discussion

Very little empirical data exist on the effectiveness of age-progressed images in helping recognize missing children, despite
their common use. Furthermore, the few studies that have been conducted have tended to show null results (with the exception of Study 4 from Lampinen, Miller, et al. (2012), which showed a detrimental effect of age-progression), raising the issue of whether age-progressed images truly have no effect on target recognition, or whether the specific paradigm that was used was simply not sensitive enough to detect the effect. Current results add to this small literature by examining (1) the effectiveness of a method of age-progression never before studied (computerized images created using APRIL age-progression software) using a methodology specifically designed to be more sensitive to the effects of age-progression; and (2) how the addition of an age-progressed image changes observers’ decision-making strategies in relation to target recognition.

5.1. Practical applications

Unfortunately, our data are alarming, as they suggest that the very techniques that law enforcement are currently using to help recover missing children may actually be making it harder to find them: The addition of an APRIL age-progressed image to an outdated image decreased the likelihood of correctly recognizing the target and increased the likelihood of mistakenly ‘recognizing’ non-targets (Study 1), and decreased confidence that the actual target was the person depicted in the original image and increased confidence that a non-target was the person depicted in the original image (Study 2). These findings are some of the first to show that age-progressed images can, at least in some instances, have harmful effects on target recognition, and highlight an obvious shortcoming with the use of APRIL age-progression software. We therefore encourage caution in its use in the recovery of missing children.
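For readers gauging the size of the effects reported in Studies 2 and 3, Cohen's d for an independent-samples t test can be recovered from the t statistic and the two group sizes as d = t × sqrt(1/n1 + 1/n2). The sketch below applies that standard conversion to the reported statistics. The per-condition group sizes are not stated here, so splitting df + 2 participants as evenly as possible is an assumption made only for illustration, and the function names are hypothetical.

```python
import math


def cohens_d_from_t(t, n1, n2):
    """Cohen's d for two independent groups, from t and the group sizes."""
    return t * math.sqrt(1 / n1 + 1 / n2)


def even_split(df):
    """ASSUMPTION: df + 2 participants split as evenly as possible."""
    total = df + 2
    return total // 2, total - total // 2


# (label, reported t, reported df, reported d)
reported = [
    ("Study 2, target points: outdated vs. both", 2.18, 147, 0.36),
    ("Study 2, non-target points: outdated vs. both", 2.49, 147, 0.41),
    ("Study 2, target points: both vs. age-progressed only", 4.62, 146, 0.76),
    ("Study 2, non-target points: both vs. age-progressed only", 5.03, 146, 0.83),
    ("Study 3, plausible faces: outdated vs. both", 2.00, 86, 0.43),
]

recomputed = []
for label, t, df, d_reported in reported:
    n1, n2 = even_split(df)
    d = cohens_d_from_t(t, n1, n2)
    recomputed.append((d, d_reported))
    print(f"{label}: d = {d:.2f} (reported {d_reported:.2f})")
```

Under the even-split assumption, the recomputed values should fall within rounding of the reported ones; a markedly unequal split would shift them slightly.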
There are two possible costs associated with a recognition error produced by age-progressed images in the real world: An observer may mistakenly ‘recognize’ a non-target (a false alarm) or may fail to recognize an actual target (a miss). But these errors are not equal: The failure to recognize a missing child is much more serious than mistakenly ‘recognizing’ someone.⁵ Consequently, an age-progression procedure that increased hits would be beneficial, even if it led to a commensurate increase in false alarms. In fact, even if the primary consequence of adding an age-progressed image was an increase in the number of false alarms, there would be relatively little cost, apart from some minimal wasted resources, to using them as long as they sometimes increased hits.

The problem is that our results suggest that age-progressed images seem to actually reduce the likelihood of correctly recognizing a missing child. In other words, APRIL age-progressed images were not simply useless; they were in fact worse than useless, leading people away from the actual target.

This effect seems to occur because adding an age-progressed image to an outdated image changes observers’ decision-making strategies suboptimally. Specifically, the addition of an APRIL age-progressed image to an outdated image increases the number of faces that are considered as plausible targets (Study 3), presumably making the target face less likely to stand out, decreasing target recognition and increasing mistaken non-target ‘recognition.’ In fact, this finding is consistent with anecdotal evidence from the National Center for Missing and Exploited Children, which claims: “In virtually every case the production and distribution of an updated [i.e., age-progressed] image stimulates new leads” (OJJDP,
5 Note that this is in contrast to errors often seen in other legal psychology areas, such as the eyewitness field, in which a false alarm (i.e., a mistaken identification) is usually perceived as being a more serious error than a miss (i.e., failing to identify the actual perpetrator; cf., Steblay, Dysart, & Wells, 2011).
p. 165). Unfortunately, current results suggest that any purported increase in leads may tend to be false recognitions of non-targets.

Note that our finding that the addition of an age-progressed image increases the number of faces that could plausibly be the target (i.e., the critical region) is incoherent under the assumption that people respond logically to age-progressed images. This is because an image of a target – whether outdated or age-progressed – imposes constraints on the identity of the target, and thus provides further limits on the size of the critical region. An outdated image of a White female with blue eyes, for instance, constrains the target to being a White female with blue eyes, and all other faces – males, Black females, White females with brown eyes, etc. – are now excluded from the observer’s critical region. Similarly, every subsequent image provided to the observer – such as an age-progressed image, for instance – can only further increase those constraints (or, if the new image provides absolutely no new details of the target, fail to affect the preexisting constraints). Importantly, an additional image of a target can never lessen the constraints, since the observer is still at least as constrained by the existing images. Given this, if people responded logically to age-progressed images, it would be impossible for those images to increase their critical region because that would be tantamount to lessening the existing constraints on the target’s identity. And yet that seems to be exactly what we have shown.⁶ The resolution of this apparent paradox is simple: People do not respond logically to age-progressed images.
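The constraint argument above can be made concrete with a small sketch. It is an illustration under stated assumptions, not the procedure used in these studies: faces are treated as random points in a hypothetical two-dimensional face space, the critical region is a fixed-radius neighborhood around an image, and every name and value (`RADIUS`, `plausible`, the coordinates) is invented for the example. An observer who treats a second image as an added constraint keeps only faces near both images (an intersection), which can never exceed the original set. A set that grows is what a looser rule, such as accepting faces near either image (a union), would produce.

```python
import math
import random

# Hypothetical 2-D "face space"; real face spaces have many more dimensions.
random.seed(0)
faces = [(random.uniform(-3, 3), random.uniform(-3, 3)) for _ in range(200)]

outdated = (0.0, 0.0)        # assumed position of the outdated image
age_progressed = (1.0, 0.5)  # assumed position of the age-progressed image
RADIUS = 1.5                 # assumed size of the critical region


def plausible(image, candidates, radius=RADIUS):
    """Return indices of candidate faces within `radius` of `image`."""
    return {i for i, face in enumerate(candidates)
            if math.dist(image, face) <= radius}


near_outdated = plausible(outdated, faces)
near_progressed = plausible(age_progressed, faces)

# Logical use of a second image: the target must match BOTH images.
both = near_outdated & near_progressed
# Looser rule: the target may match EITHER image.
either = near_outdated | near_progressed

print(len(both), len(near_outdated), len(either))
```

Whatever the random draw, the three counts are ordered: the intersection is no larger than the outdated-only set, which is no larger than the union.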
Our data suggest that instead of realizing that the target must be a plausible match to both the outdated image and the age-progressed image (or, if the age-progressed image is perceived to be completely worthless, to only the outdated image), people seem to respond to age-progressed images by reasoning that the target must match either the outdated image or the age-progressed image, but not necessarily both. Assuming this effect holds across future studies, it suggests that age-progression techniques may be problematic not only because the algorithms of those techniques are flawed, but also because observers are using information derived from age-progressed images incorrectly. Researchers should consider this in their future studies when interpreting the effectiveness of various age-progression techniques.

If the mere presence of an age-progressed image changes observers’ inherent recognition abilities by increasing the size of their critical region, then this detrimental effect may not simply be a function of APRIL age-progression software per se; rather, it may generalize across an array of age-progression techniques. Nonetheless, the generalizability of these findings across other techniques used to generate age-progressed faces is an open empirical question. Interestingly, however, results from Lampinen, Arnal, et al. (2012) and Lampinen, Miller, et al. (2012) also suggested (but could not definitively demonstrate) that age-progressed images generated by forensic artists may harm target recognition, suggesting that the observed results may not simply be a function of one particular age-progression technique or another.

For exploratory purposes we repeated Study 3 using age-progressed images of actual missing children produced by forensic artists. Results indicated that participants who viewed the
6 This finding – that an additional constraint increased the number of plausible faces – bears some similarity to the conjunction fallacy, in which a specific condition – one with greater constraints – can be perceived as being more probable than a more general condition, which is mathematically impossible (Tversky & Kahneman, 1983). However, the reasons for these effects seem to be different. The conjunction fallacy is usually explained in terms of the representativeness heuristic, in which the likelihood of an event is judged based on its resemblance to a typical member of a category (which can be subjectively greater when there are greater constraints). We have explained the current observed effect, on the other hand, by reference to a critical region expansion.
outdated + age-progressed images thought there were more faces that could plausibly be the target (M = 24.7) than participants who viewed either only the age-progressed image (M = 20.6) or only the outdated image (M = 22.6). These results are consistent with the expanded critical region hypothesis, and thus suggest that these age-progressed images may also be detrimental for similar reasons.

We realize this paints a pessimistic picture of the effectiveness of age-progressed images, but we do not wish to overstate this pessimism for two reasons. First, both the correct identification score (in Study 1) and the assignment of confidence points to the actual target (in Study 2) were well above chance for most of our targets among participants who viewed only an outdated image of the target, despite the approximately 15-year difference between the images. That people can recognize a target at levels above chance despite a fairly large difference in age between study and test images has been noted by both Lampinen, Arnal, et al. (2012) and Seamon (1982). The good news, then, is that at least in some cases, an age-progressed image may be an unnecessary embellishment.

Second, at least part of the observed detrimental effect of age-progressed images is likely to be a function of the low quality of the aging algorithms that underlie the specific age-progression software used in the current studies, or the low quality of forensic artists’ age-progressed images. At least in theory, age-progression systems that use better, empirically supported algorithms (or forensic artists whose age-progression abilities are guided by validated principles) may be able to produce better current likenesses of a missing child.
Whether it is possible to create age-progression systems that improve human observers’ abilities to recognize a target over substantial time periods is unknown, but work is currently underway to generate such empirically validated age-progression systems (e.g., Gibson et al., 2009). Nonetheless, current results speak to the importance of extensive empirical psychological testing of any such technique that purports to aid in the recognition of missing children. Until that testing is done, it should not be assumed that these techniques are necessarily helpful.
Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.jarmac.2012.07.008.
References

Allen, E. (1990). Computerized photo aging and the search for missing children. International Criminal Police Review, 4–10.
Dawes, R. M., Faust, D., & Meehl, P. E. (1989). Clinical versus actuarial judgment. Science, 243, 1668–1674.
Gibson, S. J., Scandrett, C. M., Solomon, C. J., Maylin, M. I. S., & Wilkinson, C. M. (2009). Computer assisted age progression. Forensic Science, Medicine, and Pathology, 5, 171–184.
Lampinen, J., Arnal, J. D., Adams, J., Courtney, K., & Hicks, J. L. (2012). Forensic age progression and the search for missing children. Psychology, Crime, & Law, 18, 405–415. http://dx.doi.org/10.1080/1068316X.2010.499873
Lampinen, J. M., Miller, J. T., & Dehon, H. (2012). Depicting the missing: Prospective and retrospective person memory for age progressed images. Applied Cognitive Psychology, 26, 167–173. http://dx.doi.org/10.1002/acp.1819
McQueen, I. (1989). Computer age: Computer enhanced aging. Police, 33–34(June), 42–43.
Office of Juvenile Justice and Delinquency Prevention (2006). In S. Seidel (Ed.), Missing and abducted children: A law-enforcement guide to case investigation and program management. Alexandria, VA.
Ramanathan, N., & Chellappa, R. (2006). Face verification across age progression. IEEE Transactions on Image Processing, 15, 3349–3361.
Scandrett, C. M., Solomon, C. J., & Gibson, S. J. (2007). A person-specific rigorous aging model of the human face. Pattern Recognition Letters, 27, 1776–1787.
Seamon, J. G. (1982). Dynamic facial recognition: Examination of a natural phenomenon. American Journal of Psychology, 95, 363–381.
Steblay, N. K., Dysart, J. E., & Wells, G. L. (2011). Seventy-two tests of the sequential lineup superiority effect: A meta-analysis and policy discussion. Psychology, Public Policy, and Law, 17, 99–139. http://dx.doi.org/10.1037/a0021650
Thompson, T., & Black, S. (Eds.). (2006). Forensic human identification: An introduction. CRC.
Tredoux, C. (2002). A direct measure of facial similarity and its relation to human similarity perceptions. Journal of Experimental Psychology: Applied, 8, 180–193.
Tversky, A., & Kahneman, D. (1983). Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review, 90, 293–315. http://dx.doi.org/10.1037/0033-295X.90.4.293
Valentine, T. (1991). A unified account of the effects of distinctiveness, inversion, and race in face recognition. The Quarterly Journal of Experimental Psychology A: Human Experimental Psychology, 43A, 161–204.
Wells, G. L., & Windschitl, P. D. (1999). Stimulus sampling and social psychological experimentation. Personality and Social Psychology Bulletin, 25, 1115–1125.