Changes in the own group bias across immediate and delayed recognition tasks

On the explanation of the own group bias (OGB) in face recognition, there is little consensus, apart from the acknowledgement that the bias must reflect perceptual learning history. One matter that is not currently clear is whether the bias occurs at encoding or at retrieval from memory. We report an experiment designed to tease out bias at encoding versus bias at retrieval. Black and white South African participants encoded 16 target faces of both the same and other race and gender, and attempted immediately afterward to match the target faces to members of photograph arrays that either contained or did not contain the targets. After a further delay, they attempted to identify the faces they had encoded from memory. Results showed a strong crossover OGB in the delayed matching task, but an asymmetrical OGB at retrieval (only white participants showed the OGB). Further investigation of recognition performance, considering only images correctly matched in the delayed matching task, showed a narrowly non-significant OGB at retrieval, but the investigation was likely not sufficiently powered to discover the effect, if it exists.


Introduction
The own group bias (OGB) 1 refers to a recognition advantage for faces of members of one's own group. The most common formulation posits an equal advantage: groups that co-occupy a common environment will each show an advantage for members of their own group over members of another group. However, the recognition advantage is typically asymmetrical in countries or regions where demographic or economic representation is unequal (e.g. the United Kingdom 2). Members of the dominant group will usually show an OGB, but members of the subordinate groups often do not show it. 2,3 The practical implications of the OGB are profound. The Innocence Project in the USA 4 has shown, for instance, that in over 40% of exonerations of people falsely imprisoned on the basis of eyewitness identifications, the conviction rested in part on a cross-race eyewitness identification, a figure far out of proportion to the demography of the USA.
Explanations of the OGB in the literature are plentiful, but it is probably fair to say that no single account has won out empirically or theoretically. All theories appear to accept the notion that the OGB must be a consequence to some degree of differential exposure to one's own and other group faces. 5 Our interest is in what might be called the micro-chronology of the OGB. Whereas the OGB is often referred to in the literature as a memory bias -for instance Hugenberg and colleagues declaring their intent to "…explain the proliferation of own group biases in face memory" 6(p.1392) , and Yaros and co-authors 7 arguing that the OGB may emerge due to "tuned" memory mechanisms -researchers have pointed out for a number of years that the bias may stem from the differential ability to encode faces of other groups, rather than a reduced ability to recognise or retrieve them from memory 8 . Recent claims are that the OGB is likely due to the less efficient encoding of out-group faces in visual working memory 9,10 , or to preferential distribution of attention during encoding 11 .
Megreya and co-authors 12 have published research that questions the basis of the OGB in memory processes, arguing instead that the OGB may be entirely due to encoding difficulties. Participants in Megreya et al.'s study had difficulty in matching the face of a 'target' in a film still to digital photographs of the same person. This difficulty was more pronounced when participants attempted to match faces from a group different from their own. This outcome has important implications, especially for theoretical explanations of the OGB: many theoretical accounts have focused on memory retrieval operations or are at least unclear on the relative role of encoding and retrieval in the provenance of the OGB, and this should be addressed if the phenomenon to be explained is entirely about encoding rather than retrieval. But this may be somewhat premature: at this stage in the development of thinking about the OGB, it is not clear what the relative contribution of each process is. Indeed, several authors have offered evidence that either questions whether the deficit is in any way encoding based, or that posits that it is in part retrieval based. Papesh and Goldinger 13 were able to show that disruptions during retention intervals modulated the OGB for out-group faces, but not for in-group faces. Stelter et al. 14 showed that eye movement activity differed across in-group and out-group targets during a recognition phase, but not during an encoding phase. They also showed that the OGB was a function of poor performance in recognition of new faces specifically.
Our aim was to explore the relative contributions of encoding and retrieval in the manifestation of the OGB in face recognition. It is important at the outset to acknowledge that it is difficult to disentangle encoding from recognition, and in this respect the difficulty appears to lie primarily in establishing that a stimulus has been encoded. We offer one approach, which is to test recognition with a delayed matching task, and then after a longer period with a recognition memory task. In a delayed matching task, participants view a stimulus, and then immediately attempt to match the stimulus in a test array once it has been removed from view. If participants are accurate in the delayed matching task, this suggests successful encoding. Note that Megreya et al. 12 did not use a delayed matching task, but an in-view matching task. In other words, they attempted to match a target image to one embedded in an array of images while both the target and the array were in full view. However, we believe that in-view matching tasks do not demonstrate encoding as clearly as delayed matching tasks because matching targets in-view might be accomplished without encoding faces at all. Participants may be able to use non-face properties of stimuli, such as colour temperature, or other image artefacts to effect the match, or might be able to do so with minimal encoding of the target image, e.g. by relying on specific, transient and localised characteristics of faces, such as skin blemishes, and hairstyle oddities. This distinction between face identity encoding and face image encoding was powerfully made over 40 years ago. 15 For this reason we departed from the procedure used by Megreya et al. 12 There were some further departures, too, that we thought important. In the first instance, we wished to see whether the results they obtained replicate across different viewpoints and with different stimulus materials. 
They used only frontal views of participants, in both matching and target views, and they also appeared to have made their matching arrays very difficult (the faces in the array were the most similar in a moderately large set). Repeating their experiment with different viewpoints of the faces (e.g. in profile vs. frontal views) would test whether the memory trace of the face is well enough encoded to match and recognise it over different transformations (i.e. different face poses). Furthermore, repeating their experiment with line-up arrays that are constructed around faces that are not specifically selected for high similarity seems important to us, especially as one of the most significant potential applications of knowledge about the OGB is in the eyewitness domain, where line-ups are unlikely to be constructed in this way (see Wells et al. 16 for a discussion of why this is not a good idea).
However, the most important goal in our experiment was to extend Megreya et al.'s 12 original experiment and to attempt to separate out encoding and retrieval processes, and to estimate the relative importance of each in the manifestation of the OGB. Thus, in our experiment, participants were tested for delayed face matching ability, and, after a suitable delay, for their ability to recognise the faces to which they were exposed in the delayed matching task. We used a line-up task, designed to permit the calculation of the signal detection theory (SDT) measures d' (discrimination) and c (criterion). Through this, and by calculating successful recognition of faces that were correctly matched in the first part of the procedure, we attempted to assess the relative contributions of encoding and retrieval deficits in the OGB effect.

Target stimuli
Stimuli were digital images taken from a large database of black South African and white South African faces (containing multiple views of over 1000 individuals) curated by the first author. Sixteen target faces were used in this experiment: four male and four female black South African faces, and the same again for white South African faces. The target faces were randomly sampled by the authors from the total collection, and target ratings were obtained to ascertain whether any of the target images contained idiosyncrasies that might have made them more memorable than the other target faces. A total of 22 participants (15 female; mean age = 20.52 years, SD = 1.21 years) rated the target faces on a number of dimensions that are known to affect face recognition: typicality, distinctiveness, attractiveness, perceived criminality, age, wealth, memorability, and familiarity. Each dimension was rated on a scale from 0 (not at all) to 8 (extremely), except age, which was estimated numerically. Two images of the same target were presented side-by-side during the rating task: one photograph was of the target in a neutral three-quarter side pose, the other image was of the target in a frontal casual/smiling pose. Image pairs were presented one at a time, with the rating dimensions in a randomised order. The size of each face image was approximately 7.94 cm in width and 10.35 cm in height, with a resolution of 300 x 391 pixels (8-bit). The image background and the target clothing were edited to be standard and consistent across all the target images.

Line-up construction
Target present (TP) and target absent (TA) photographic line-ups were constructed for the target images. TP line-ups contained the target and five foils. In the TA line-ups, the target was replaced with a foil that was randomly selected by the authors, resulting in a six-foil line-up (i.e. there was no designated suspect). No foils were repeated or appeared in any of the other line-ups. The line-up members were selected to be subjectively moderately similar in appearance to the target (i.e. following the principle set down in Wells et al. 16 that line-up members should be matched to the target on some but not all characteristics). Three of the present authors independently selected ten possible foils for each target face from a database of hundreds of black South African and white South African faces. The most frequently chosen foils were used as the final images in the line-ups.
Corresponding to the 16 target faces, each participant saw 16 line-ups. The line-ups appeared in one of two orders: eight of the line-ups were TP line-ups, and the other eight were TA line-ups. Two line-ups were created for each target face: a frontal pose line-up, and a three-quarter pose line-up. The frontal neutral pose and the three-quarter pose were used to control for picture recognition and to ensure that identification was based on memory for the target and not the target photograph. The line-up photographs were in colour and standardised. The backgrounds were edited to remain consistent across all the images. All clothing, jewellery, and distinctive markings were digitally removed from the images. The size of each face image was approximately 6 cm x 7.8 cm, with a resolution of 227 x 296 pixels (8-bit).

Participants
A total of 64 (53 female) participants from the University of Cape Town participated in the study in exchange for course credit; 32 (50%) of the participants identified themselves as white South Africans, and 32 (50%) of the participants identified themselves as black South Africans. All participants reported normal or corrected to normal vision.

Encoding and delayed matching phase
The study was conducted at the University of Cape Town, in a quiet computer laboratory. The experiment was presented on computer using E-Prime 2.0, at a resolution of 1024 x 768 pixels.
After providing demographic information, participants were informed that they would be presented with a series of target faces, one at a time, for five seconds. One of the target faces appeared on the screen, in a frontal casual/smiling pose. The size of the target image was 12 cm x 15.7 cm, at a resolution of 456 x 594 pixels.
After five seconds, the face disappeared, and a six-member simultaneous TP or TA line-up was immediately displayed, in a delayed matching task. Participants saw either a three-quarter line-up or a frontal line-up. Participants were informed that the target they had just seen may or may not be present in the line-up. They were cautioned that the clothing and background of the target image may be different from the image they had studied. If participants recognised the target, they had to indicate the corresponding number above the line-up member on the keypad; if they thought the target was not present, they had to indicate '0' on the keypad. This delayed matching procedure was repeated for the remaining 15 target faces. When the delayed matching phase was finished, participants completed a filler task for 5 minutes (a Sudoku game).
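As an illustration only (this is not the software used in the study), the response scheme described above can be sketched in Python. The function name, the outcome labels, and the use of None to mark a target absent line-up are assumptions introduced for this sketch.

```python
from typing import Optional


def score_lineup_response(response: int, target_position: Optional[int]) -> str:
    """Classify a single line-up response.

    response: key pressed by the participant; 0 means 'not present',
        1-6 is the number above a line-up member.
    target_position: position (1-6) of the target in a target present (TP)
        line-up, or None for a target absent (TA) line-up.
    The outcome labels are hypothetical names chosen for this sketch.
    """
    if not 0 <= response <= 6:
        raise ValueError("response must be between 0 and 6")

    if target_position is None:
        # TA line-up: every member is a foil, so any selection is a false alarm.
        return "correct_rejection" if response == 0 else "false_alarm"

    # TP line-up.
    if response == target_position:
        return "hit"
    if response == 0:
        return "incorrect_rejection"
    return "foil_identification"
```

For example, selecting member 3 when the target sits in position 3 is scored as a hit, whereas selecting any member of a TA line-up is scored as a false alarm.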

Recognition phase
Following the distractor task, participants were told that they would view a series of six-member line-ups and that their task was the same as before -to select the target face that they had studied earlier. They were informed that the line-ups may be different from the line-ups they had seen in the delayed matching session. Participants were again cautioned that the target face may or may not be present in the line-up. It was emphasised that the line-ups would appear in a random order and would not necessarily correspond with the order in which the participants had studied the target faces.
Participants attempted to identify the targets from the same line-ups used in the delayed matching procedure. For example, if they saw a TA three-quarter pose line-up at delayed matching, they saw the same line-up at recognition. The position of the line-up members was rearranged between delayed matching and recognition to guard against commitment effects -if participants recognised the line-up from the delayed matching session, they may have been tempted to select the same line-up position as before.
If participants recognised the target, they indicated the corresponding number above the line-up member on the keypad; if they thought the target was not present, they selected the appropriate option on the keypad. This recognition procedure was repeated for the remaining 15 line-ups; the order of arrays was randomised across participants.
When all recognition trials were completed, participants who responded '0' (not present) to a TP line-up -i.e. who had incorrectly rejected the line-up -were given another chance to respond to the line-up. They were shown the same line-ups to which they had responded 'not present' and were then required to make a selection (i.e. a forced choice; however, data from the forced choices were not included in the present analyses). The order of these line-ups was randomised.
Once participants had completed their forced-choice line-up tasks, they were thanked, debriefed, and dismissed from the study. Please note that further details regarding preparation of stimuli, as well as descriptive statistics of task performance, appear in Figure 1.

Results
Hit and false alarm scores were computed for each participant and were used to derive discrimination (d') and criterion (c) signal detection measures. Because we did not use designated suspects in our TA line-ups, false alarms were computed as the total number of identifications of foils in TA line-ups. (Although eyewitness researchers often divide the total number of foil identifications by the number of line-up members to estimate the false alarm rate, this constrains the false alarm rate to a maximum of 1/(number of line-up members). This may be appropriate for a perfectly fair line-up, but very few line-ups are perfectly fair. 17 We have not divided by the number of foils. The consequence is that false alarms are more probable than hits, and this can be expected to decrease d', but as long as one recognises that it is comparisons across d' and c that are important, not absolute levels, it should make little difference. Judging absolute levels might lead one to believe that participants in this study had poor discrimination in some conditions (indeed, d'<0 in some conditions), but that is not a real reflection of ability.)
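A minimal sketch of how d' and c can be derived from such counts is given below, using the standard SDT definitions d' = z(H) - z(F) and c = -(z(H) + z(F))/2. It is an illustration rather than the code used for the analyses reported here. The function name, the log-linear correction for rates of 0 or 1, and the example counts are all assumptions; the correction actually applied, if any, is not stated in this report.

```python
from statistics import NormalDist


def sdt_measures(hits, n_tp, false_alarms, n_ta):
    """Return (d', c) from hit and false alarm counts.

    hits: correct target identifications in target present (TP) line-ups.
    n_tp: number of TP line-ups.
    false_alarms: any foil identification in target absent (TA) line-ups,
        NOT divided by the number of line-up members.
    n_ta: number of TA line-ups.
    """
    # Assumed log-linear correction: keeps both rates strictly between
    # 0 and 1 so that the z-transform is defined.
    hit_rate = (hits + 0.5) / (n_tp + 1)
    fa_rate = (false_alarms + 0.5) / (n_ta + 1)

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion


# Hypothetical counts: 3 hits in 4 TP line-ups, 1 false alarm in 4 TA line-ups.
d_prime, criterion = sdt_measures(hits=3, n_tp=4, false_alarms=1, n_ta=4)

# When false alarms outnumber hits, d' falls below zero.
d_prime_neg, _ = sdt_measures(hits=1, n_tp=4, false_alarms=3, n_ta=4)
```

Dividing the false alarm count by six line-up members before this step would cap the false alarm rate at 1/6, which is the constraint described above.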
Our analyses are generally based on these SDT measures, rather than on raw hit, false alarm, or other untransformed accuracy measures. Figure 2 shows raw measures for the information of interested readers.
We conducted a four-way linear mixed model analysis of the experimental data, that is, stimulus group (black vs. white) X line-up face view (frontal vs. three-quarter) X task condition (delayed matching vs. recognition) X participant group (black vs. white), using the package lme4 18 , within R 19 . 'Stimulus group' and 'task condition' were within-participant factors. The analysis of variance revealed several significant effects, as shown in Table 1.
At the level of main effects, we observed differences in d' across 'line-up face view', and for 'task condition' (p<0.05). We found two-way interactions for both 'task condition and stimulus group', and for 'stimulus group and participant group'. Finally, we found three-way interactions for 'participant group, stimulus group, and task condition', as well as for 'stimulus group, task condition and line-up face view'.
The most important findings from the analysis for our concerns here are the classic two-way interaction of 'stimulus group and participant group' (the OGB effect), and the three-way interaction of 'stimulus group, participant group, and task condition'.
A classic bias in face recognition is that people recognise faces belonging to their own group with greater facility than faces belonging to other groups. Megreya et al. 12 have argued though that this facility is also present when participants are asked to match faces to arrays: own-group faces are matched more accurately than out-group faces. This raises the question of whether there is a retrieval component at all in the own-group bias in face recognition: the effect could be entirely due to poorer encoding of out-group faces. In the present experiment, we observed the classic crossover interaction of stimulus and participant group, on d', as shown in Figure 3. Follow-up contrasts showed that white South Africans performed better with white South African faces than with black South African faces (… 20 ), and that black South Africans performed better with black South African faces than with white South African faces (t(60)=3.71, SE=0.14, p<0.001, Cohen's d=0.54). As we were interested in testing whether viewing faces in frontal vs three-quarter view would impact the OGB, it is worth noting that the 'view' factor was not implicated in any interaction involving either participant group or stimulus group. It was, however, involved in a significant interaction with task condition: participants found it difficult to match or recognise faces across frontal and three-quarter views, and this effect was greater for the delayed matching condition than for the recognition condition (∆d'=0.89 vs 0.34).
Most pertinent to our concerns was the three-way interaction of participant group, stimulus group, and task condition, or in other words, the two-way interaction of participant group and stimulus group (the OGB effect) considered as a function of which task was being completed. As Table 1 shows, this effect was significant, and Figure 4 shows the effect graphically.
Follow-up contrasts showed that, in the delayed matching task, both groups of participants were better at matching their own-group faces: black South Africans were better at matching black South African faces than white South African faces (… vs SDblack-white=0.56). For the criterion measure (c), the three-way interaction between participant group, stimulus group, and task condition was narrowly not significant (F(1, 60)=3.91, p<0.054). We show the marginally non-significant three-way interaction in Figure 4, where it seems that (1) a typical OGB is present for criterion, and that (2) the OGB may be apparent in the delayed matching task, but not in the recognition task.
Thus far the results have shown a clear crossover OGB in the delayed matching task, as well as an asymmetrical OGB in the recognition task. It is not clear that the OGB in the recognition task can be said to be independent of that in the delayed matching task -that is, it seems evident that faces that are not matched or recognised accurately in the delayed matching task will not be recognised accurately in the later task either. Failure in the matching task may also be an indication of encoding failure. In order to test the point that Megreya et al. make about the possible entire dependency of the OGB on encoding processes, we reduced our data set to only those faces that had been accurately matched in the delayed matching phase and re-analysed the recognition data for just those faces. The idea here was to choose faces that we were confident had been encoded successfully.
One implication of selecting only faces that were correctly matched in the delayed matching task is that the OGB shown in the delayed matching task was effectively controlled for: as all faces were correctly matched, there could be no bias. This reduction left us with 78% of our original participants (selecting those who made at least one correct decision at delayed matching), but as some participants performed better than others in the delayed matching task, this resulted in an uneven distribution of stimuli across conditions. Thus, to take the extreme conditions, whereas 86% of the original stimuli were taken into account for black participants viewing white South African faces in TP conditions, 46% of stimuli were taken into account for white participants viewing black South African faces in TA conditions. This 'attrition' undoubtedly affected the potential power of analyses of recognition of stimuli that had been successfully matched in the delayed matching task.
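The data reduction described above can be sketched as follows. This is a hypothetical illustration, not the analysis code used in the study: the trial records, the field names, and in particular the rule that a 'correct match' means a hit in a TP line-up or a correct rejection in a TA line-up are assumptions made for the sketch.

```python
# Assumed definition of a correct decision at delayed matching.
CORRECT_AT_MATCHING = {"hit", "correct_rejection"}


def keep_correctly_matched(matching_trials, recognition_trials):
    """Keep recognition trials only for faces correctly matched earlier.

    Each trial is a dict with 'participant', 'face' and 'outcome' keys
    (hypothetical field names). Returns the retained recognition trials
    and the proportion of recognition trials retained.
    """
    matched = {
        (trial["participant"], trial["face"])
        for trial in matching_trials
        if trial["outcome"] in CORRECT_AT_MATCHING
    }
    kept = [
        trial
        for trial in recognition_trials
        if (trial["participant"], trial["face"]) in matched
    ]
    retained = len(kept) / len(recognition_trials) if recognition_trials else 0.0
    return kept, retained


# Made-up example data for two participants and two faces.
matching = [
    {"participant": "p01", "face": "f01", "outcome": "hit"},
    {"participant": "p01", "face": "f02", "outcome": "foil_identification"},
    {"participant": "p02", "face": "f01", "outcome": "correct_rejection"},
    {"participant": "p02", "face": "f02", "outcome": "false_alarm"},
]
recognition = [
    {"participant": "p01", "face": "f01", "outcome": "hit"},
    {"participant": "p01", "face": "f02", "outcome": "incorrect_rejection"},
    {"participant": "p02", "face": "f01", "outcome": "false_alarm"},
    {"participant": "p02", "face": "f02", "outcome": "correct_rejection"},
]

kept, retained = keep_correctly_matched(matching, recognition)
print(len(kept), retained)  # prints: 2 0.5
```

Because retention depends on each participant's matching accuracy, the proportion retained will differ across cells of the design, which is the source of the uneven distribution of stimuli noted above.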
A linear mixed model testing discrimination (d') for faces correctly matched in the delayed matching task showed a non-significant effect (F(1, 49.35)=2.80, p<0.104) for the key two-way interaction of interest (between participant group and stimulus group), which was significant in the earlier model. Although the interaction effect was not significant, it seemed appropriate to us to follow up with focused contrasts directly exploring a potential OGB. These showed that, in the recognition task, testing recognition of only faces successfully matched in the delayed matching task, neither white nor black South Africans were better at their own-group faces, although this was narrowly not the case for white South Africans (for black …; for white p<0.059, Cohen's d=0.48). While the asymmetrical OGB seen in the recognition test using all stimuli was not technically significant in the recognition test using only faces successfully encoded (matched in the delayed matching task), it was very close to being so. Of course, this analysis is based of necessity on a smaller sample, and has less statistical power, so it is not strong evidence that controlling for the OGB in the delayed matching task (our operationalisation of encoding) eliminates the OGB at recognition. We also conducted a linear mixed model analysis on criterion scores, but did not find any significant results, or any suggestion of an effect (all p>0.48).

Discussion
The OGB in face recognition is a well-established phenomenon, with serious consequences in applied contexts, especially when recognition is treated as person identification, as one finds in law enforcement procedures. Despite often being the subject of empirical and theoretical investigation, not much is known about the cognitive processes underlying the phenomenon. One important question concerns whether the OGB is an encoding or a recognition phenomenon, and we have brought results from an empirical study to bear on this question, or at least on the micro-chronology of the OGB.
Black and white South African participants in our study were asked to match black or white target faces to corresponding images in arrays of same group faces immediately after viewing them. They were then asked, after an intervening period in which they completed a distractor task, to recognise the target faces from memory, in the same arrays. Our results show a strong OGB in discrimination accuracy for both black and white participants at the delayed matching phase of the study, but only white participants showed the bias at the recognition memory phase of the study. We found similar results for the measure of response criterion that we computed, although the result for the important interaction of participant group, stimulus group, and task condition was narrowly not significant. Our finding of an asymmetry in the delayed recognition task for discrimination, rather than in the delayed matching task, is partly novel, and partly in line with extant research. Most studies that have been conducted on the OGB (for race) in South Africa have reported an asymmetric OGB at recognition. Usually, white participants show a strong OGB, whereas black participants rarely show an OGB at all (but see Wittwer et al.'s Study 6 21 for an example of a crossover), and sometimes show an inverse OGB, recognising white faces with more accuracy than they do black faces 22 . Some authors, including ourselves, have ascribed this to the way in which the social and political context has structured intergroup contact in South Africa (and other countries with similar intergroup histories), and therefore perceptual contact. It is not easy, though, to explain the finding of an OGB among black participants at encoding (the delayed matching task) that fails to materialise in the recognition test.
One possibility is that performance on the recognition task bottomed out, approaching floor level, but this does not seem to have occurred differentially for black vs white participants, and is not a convincing explanation. It is possible that it could reflect a real difference in memory consolidation of out-group faces by black participants. What does seem clear to us is that the argument that the typical asymmetric OGB between majority and minority groups is due to socio-political factors (e.g. that social and political inequality might impact perceptual learning of out-group faces) is made more difficult to sustain by our results. If we had considered only data from the recognition task we might well have reported another instance of an asymmetric OGB, but in looking at the encoding/delayed matching task, we noted a crossover OGB. A source of evidence that is sometimes marshalled in an attempt to understand the OGB as it manifests in particular contexts is the perceptual contact history of participants. Methods of assessing past perceptual history generally rely on self-reporting, and in the meta-analysis reported in Meissner and Brigham 1 were only poorly correlated with the OGB (r=0.13). We did not collect self-reported data on perceptual contact history, but do not believe it would have shed any light on the nature of the OGB in our study, given its poor predictive record reported in Meissner and Brigham 1 . We thus do not have a good sense of why we observed an OGB at encoding but not at recognition for black participants.
An anonymous reviewer has suggested that it could be a task-dependent difference, possibly interacting with social-motivational factors, which we think is possible. On the more general point of why there is often an asymmetric pattern for the OGB, with disadvantaged groups showing a weaker or non-existent OGB when compared to advantaged groups, we think that Malpass' 23 notion of a 'social interaction utility' is useful: members of the disadvantaged group have a positive utility associated with interaction and person recognition (often being economically dependent on the advantaged group, it is important to interact with, and recognise members of that group), whereas members of the advantaged group have a negative utility (there is no clear advantage to interaction and recognition).
An important question in research on the OGB concerns the degree to which the OGB is a function of encoding, or of recognition. The use of delayed matching and recognition tasks allowed this to be addressed in our study. We conducted a second analysis on images correctly matched (therefore showing no OGB), and although we did not find an OGB for these images at recognition using a statistical significance test as the criterion, the size of the effect we observed in this task (d=0.48) was very similar to that observed in the recognition task (d=0.52). Our results likely have equivocal bearing in the end on whether the OGB is an encoding phenomenon, rather than a recognition memory phenomenon. What is most interesting about our findings, though, and which has nothing to do with the final task we used, is that we observed a strong crossover OGB at delayed matching, which is not at all usual in South African studies, and after a brief delay of 5 minutes, the OGB manifested in a recognition memory task as an asymmetrical effect, which is typically the form other studies in South Africa have previously reported.
The claim that the OGB is an encoding phenomenon makes sense from a perceptual learning viewpoint, especially in line with Valentine's multidimensional face space model 24 , and his explanation of the OGB in terms of the model 25 : the dimensions available to represent out-group faces are fewer, and less well developed than for in-group faces, and encoding of out-group faces will be less well differentiated from other exemplars of that group. Of course, our method for separating encoding and recognition processes is admittedly ad hoc: we tested recognition of faces that had been correctly matched, thus attempting to control encoding processes. We did not directly show that own-group faces were better encoded than out-group faces, although in demonstrating an OGB at encoding we think that differences were implicitly demonstrated. Face encodings are complex patterns of electrical activation across multiple brain regions, and not directly accessible, although there are some clues to the neural underpinnings of the OGB and its connections to socio-affective processes. 26 Some researchers have shown differential event-related potentials to own-and other-group faces 27 , with a potentially special role for the P200 visual component of the event-related potentials 28 . These investigations suggest that we are making progress toward a better understanding of the brain mechanisms underlying the OGB.
Apart from the obvious limitation of not being able to assess encoding proficiency directly, there are other limits to what we are able to conclude from our study. An important methodological limitation may be our method of checking that a stimulus had been encoded, and the knock-on effect of this for the accuracy of our recognition test. As a reminder, we assessed encoding with a delayed matching task: participants attempted a match immediately after viewing a target stimulus. After a further delay of 5 minutes, we assessed recognition memory by asking participants to choose the target image from a line-up, which either contained or did not contain the target. Our recognition test could thus be said to have exposed participants to the target stimulus twice, and this taints the comparison between encoding and recognition in which we are interested. In other words, although the delayed matching task tests memory for a stimulus seen once, the recognition task tests memory for a stimulus seen once on its own, and once within an array. It is important to note that participants were not told the position of the target in the array in the delayed matching task, and that the photographs of the target changed from the original presentation (frontal casual/smiling view) to that in the line-up arrays used in the delayed matching and recognition tasks (three-quarter profile, or neutral/passport style view). It is still possible that there was some strengthening of the memory assessed in the recognition memory task, but our results show that memory performance decreased considerably between the delayed matching task and the recognition memory task (see Figures 3 and 4), so it does not seem to have counteracted that decrease, if at all. 
It would also have been unlikely to affect our white and black participants differentially, and because our key effect of interest was the nature of the OGB at the two time points, it does not appear to us to be confounded with the two- or three-way interactions in which we were interested.
Whereas we studied matching and recognition of two-dimensional images of faces, face recognition in real contexts is of three-dimensional surfaces that change locally, and globally, over time, in interaction with perceivers. We also used a short delay period between encoding (matching) and recognition, unlike many face recognition tasks 'in the wild'. However, it is likely that the OGB would be higher in more naturalistic contexts, because there are known additional biases across groups for emotion 29 and age 30, among other biases, and we do not see a reason to believe that this would differentially affect encoding and recognition deficits already present. There is an extensive and convincing argument in support of the view that one expects real witnesses to perform far worse than participants in laboratories. 31
In conclusion, although we have not reported clear evidence in favour of the view that the OGB in face recognition is likely a consequence of poor encoding of other group faces, we have identified an interesting, rapid change in the manifestation of the OGB. Whilst it took a strong crossover form at encoding (delayed matching), it reverted to an asymmetric form after a brief delay. There are some applied implications of this result, although the evidence is at this stage too slight to base strong recommendations on. If the OGB is a failure of encoding, then there may be little justification for devising methods that focus on improving recall or recognition of out-group faces, and it may be better to develop training programmes that focus on encoding processes. There is some evidence that in-group members focus on different face regions when encoding out-group faces than out-group members do 32, although it is not yet clear that reshaping cross-group encoding will work 33.
On the other hand, if the OGB is a failure of retrieval, some methods such as the Cognitive Interview 34 and the Person Description Interview 35 may be useful interventions when recovering information from memory about face appearance and identity, although it should be borne in mind that these methods are typically good for improving recall memory, but not recognition memory 36. It could also be that the OGB is a function of both encoding and retrieval processes, and we are presently considering ways of adapting our procedure to accommodate this possibility.

Ethical considerations
Ethical clearance to conduct the study was granted by the Department of Psychology Ethics Committee of the University of Cape Town. All participation was voluntary and informed consent was obtained before study commencement.

Competing interests
We have no competing interests to declare.