An assessment of the likely impact of strain-related phenotypic plasticity on hominin fossil species identification

It has been proposed that strain-related phenotypic plasticity may be a major confounding factor in attributing hominin fossils to species. The study reported here tested this hypothesis with craniometric data from the great apes and Colobus guereza. We divided the measurements into three groups: measurements of features subject to high masticatory strain, measurements of features subject to low-to-moderate masticatory strain, and measurements of features that do not remodel and therefore are not prone to strain-related phenotypic plasticity. Next, we used the coefficient of variation and ANOVA to investigate whether masticatory strain is a cause of variability. These analyses partially supported the hypothesis. The predicted differences between the high-strain measurements and the other measurements were found in the majority of the species. However, the coefficient of variation values for the low-to-moderate strain and non-phenotypically plastic measurements were indistinguishable. Thereafter, we used discriminant function analysis to compare the ability of the three groups of measurements to assign specimens to species. This analysis did not support the hypothesis. The high-strain measurements were less effective than the other measurements, but the low-to-moderate strain measurements were more effective than the non-phenotypically plastic measurements. In addition, better discrimination was achieved when all the measurements were employed than when just the non-phenotypically plastic measurements were utilised. We conclude from this that strain-related phenotypic plasticity is unlikely to impede hominin alpha taxonomic research.


Introduction
Phenotypic plasticity is the expression by a genotype of different phenotypes in response to different environmental conditions. 1 Recently, Wood and Lieberman 2 hypothesised that this phenomenon may negatively impact efforts to delineate species in the hominin fossil record. Wood and Lieberman's 2 suggestion is based on work examining how mechanical loading affects bone. [4][5][6][7] For example, mechanical loading experienced during development has been found to affect both the growth of cortical bone in diaphyses and the growth of trabecular bone in epiphyses. 8 Likewise, studies of individuals experiencing lower than normal mechanical strains (e.g. those following denervation, bed-rest or exposure to gravity-free [...] high-strain measurements and the non-phenotypically plastic measurements. However, the coefficient of variation values for the low-to-moderate strain measurements and the non-phenotypically plastic measurements were statistically indistinguishable.
Collard and Lycett's 10 discriminant function analysis did not support Wood and Lieberman's 2 hypothesis. The high-strain measurements correctly classified 94.9% of specimens, the low-to-moderate strain measurements 97.5% and the non-phenotypically plastic measurements 97.4%. When all 60 craniodental measurements were included, 100% of specimens were correctly classified. Thus, the high-strain measurements were less effective than the other two sets of measurements, as Wood and Lieberman's 2 hypothesis predicts. However, not only were the low-to-moderate strain measurements more effective than the non-phenotypically plastic measurements, but also better discrimination was achieved when all measurements were employed than when the non-phenotypically plastic measurements were utilised. Based on these results, Collard and Lycett 10 concluded that, while phenotypic plasticity likely contributes to the variation observable in the hominin fossil record, controlling for phenotypic plasticity is probably an unnecessary course of action for researchers attempting to group fossil hominin specimens into species.
Here, we describe a second study that explicitly tests Wood and Lieberman's 2 hypothesis. The study employed measurement data from the five species that were used by Wood and Lieberman, 2 and was carried out to counter the main potential criticism of Collard and Lycett's 10 study, namely that Wood and Lieberman's 2 hypothesis was framed in relation to fossil hominins and the taxa used by Collard and Lycett, 10 the Old World monkeys, are too distantly related to hominins to represent a reasonable test case.

Materials and methods
The dataset comprised values for 36 measurements recorded on 37 Gorilla gorilla (20 males, 17 females), 75 Homo sapiens (40 males, 35 females), 35 Pan troglodytes (13 males, 22 females), 41 Pongo pygmaeus (20 males, 21 females) and 24 Colobus guereza (12 males, 12 females). The measurements were selected on the basis of current knowledge of their likely propensity to exhibit phenotypic plasticity as a result of mastication-related strain. Twelve were dental measurements. Since dental enamel does not remodel, these were designated non-phenotypically plastic characters. Labiolingual and buccolingual tooth crown dimensions were used in order to avoid the confounding effect of interstitial wear. 11 The other 24 measurements were cranial and mandibular measurements. Twelve of them were included because they relate to features that strain-gauge analyses indicate experience strain of at least 1 000 µε during mastication. [12][13][14][15] These measurements were designated high-strain measurements, because strain in the order of 1 000 µε is known to be capable of inducing bone growth. 9 The remaining 12 measurements were included because strain-gauge studies indicate they are associated with features of the primate skull that experience strain of less than 1 000 µε during mastication. These measurements were designated low-to-moderate strain measurements. Further details of the measurements are given in Table 1. The data were taken from Wood et al. 16 This was also the source for the data used by Wood and Lieberman. 2 Thus, our dataset overlaps with the one they used. The cranial and mandibular measurements were rounded up to the nearest 1 mm, and the dental measurements to the nearest 0.1 mm.
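The grouping rule described above can be written as a small decision function. The following is only an illustrative sketch: the function name, the enum and the idea of passing a single peak strain value per feature are assumptions made for the example, not part of the original protocol, in which assignments were based on published strain-gauge studies.

```python
from enum import Enum

# Threshold stated in the text: features experiencing at least 1 000
# microstrain during mastication are treated as high-strain.
HIGH_STRAIN_THRESHOLD_MICROSTRAIN = 1000.0


class MeasurementGroup(Enum):
    NON_PLASTIC = "non-phenotypically plastic"
    LOW_TO_MODERATE = "low-to-moderate strain"
    HIGH = "high strain"


def classify_measurement(is_dental, peak_strain_microstrain=None):
    """Assign a measurement to one of the three groups (hypothetical helper).

    is_dental: True for tooth-crown dimensions (enamel does not remodel).
    peak_strain_microstrain: assumed strain-gauge value for the feature
        during mastication; ignored for dental measurements.
    """
    if is_dental:
        return MeasurementGroup.NON_PLASTIC
    if peak_strain_microstrain is None:
        raise ValueError("a strain value is needed for non-dental measurements")
    if peak_strain_microstrain >= HIGH_STRAIN_THRESHOLD_MICROSTRAIN:
        return MeasurementGroup.HIGH
    return MeasurementGroup.LOW_TO_MODERATE
```

A cranial feature with an assumed strain of exactly 1 000 µε falls into the high-strain group under this rule, because the text specifies "at least 1 000 µε".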
We employed the dataset in two sets of analyses. These were similar to the ones carried out by Collard and Lycett. 10 Thus, the first set of analyses focused on Wood and Lieberman's 2 suggestion that strain increases phenotypic variability. We used the coefficient of variation to assess phenotypic variability. A coefficient of variation was determined for each measurement, and then the mean coefficient of variation for each group of measurements was computed. Thereafter, ANOVA with post-hoc least significant difference pairwise comparisons was employed to test for statistically significant differences among the mean coefficient of variation of the three groups of measurements (α ≤ 0.05). Since ANOVA assumes data are normally distributed, 17 the coefficient of variation was logarithmically transformed (log e) prior to analysis. We reasoned that, if Wood and Lieberman's 2 hypothesis is correct, the coefficient of variation for the high-strain measurements should be significantly higher than the coefficient of variation for the low-to-moderate strain measurements, and the coefficient of variation for the latter should be significantly higher than the coefficient of variation for the non-phenotypically plastic measurements. The second set of analyses focused on the second part of Wood and Lieberman's 2 hypothesis, that is, the idea that phenotypic plasticity impedes species identification. This was accomplished by separately subjecting the three groups of measurements to discriminant function analysis. The form of discriminant function analysis employed separates groups on the basis of canonical discriminant functions. 18 In addition, we carried out a discriminant function analysis in which all the measurements were included. We reasoned that, if Wood and Lieberman's 2 hypothesis is correct, the high-strain measurements should be worse at assigning specimens to species than the low-to-moderate strain measurements, and the latter should be worse at assigning specimens to species than the non-phenotypically plastic measurements. We also reasoned that, if Wood and Lieberman's 2 hypothesis is correct, the analysis in which all the measurements were used should return more misclassified specimens than the analysis in which only the non-phenotypically plastic measurements were employed. Because Wood and Lieberman's 2 hypothesis is focused on fossil hominins, and fossil hominin specimens can rarely be sexed with confidence, we employed mixed sex samples in both analyses. The analyses were carried out in SPSS 12.0.1; the program's stepwise insertion option was used to conduct the discriminant function analysis.
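The first set of analyses (coefficient of variation, log transformation, one-way ANOVA and least significant difference comparisons) can be sketched in a few lines. This is a minimal illustration rather than a reproduction of the SPSS procedure. It assumes the coefficient of variation is expressed as a percentage of the mean and uses the sample standard deviation; the function names and the toy values are hypothetical. The sketch stops at the F and t statistics, since converting them to p-values requires F and t distribution functions that the Python standard library does not provide.

```python
import math
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV as a percentage: 100 * sample standard deviation / mean (assumed form)."""
    return 100.0 * stdev(values) / mean(values)


def one_way_anova(groups):
    """Return (F, df_between, df_within, mse) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_between, df_within = k - 1, n_total - k
    mse = ss_within / df_within
    return (ss_between / df_between) / mse, df_between, df_within, mse


def lsd_t(group_a, group_b, mse):
    """Fisher's LSD t statistic for one pair, using the ANOVA error term."""
    se = math.sqrt(mse * (1.0 / len(group_a) + 1.0 / len(group_b)))
    return (mean(group_a) - mean(group_b)) / se


# Hypothetical CVs (percent) for a few measurements per group in one species.
high = [8.1, 9.4, 7.7]
low = [5.2, 6.0, 5.5]
non_plastic = [5.0, 5.8, 5.4]

# Natural-log transform before ANOVA, as described in the text.
log_groups = [[math.log(cv) for cv in g] for g in (high, low, non_plastic)]
f_stat, df_between, df_within, mse = one_way_anova(log_groups)
t_high_vs_low = lsd_t(log_groups[0], log_groups[1], mse)
```

In the study itself each group contained 12 measurements per species, so the within-group degrees of freedom for each species' ANOVA would be 33 rather than the 6 of this toy example.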

Results
Table 2 shows the mean coefficient of variation of each variable group for each taxon as well as the results of the least significant difference pairwise comparisons following a one-way ANOVA. The mean coefficient of variation for the high-strain measurements was consistently higher than the mean coefficient of variation for the low-to-moderate strain measurements and for the non-phenotypically plastic measurements. However, the differences were only significant in H. sapiens, P. troglodytes, P. pygmaeus and C. guereza. In G. gorilla the high-strain measurements were more variable than the low-to-moderate strain measurements and the non-phenotypically plastic measurements, but the differences were not statistically significant at the 0.05 level. The low-to-moderate strain measurements were not significantly more variable than the non-phenotypically plastic measurements in any of the species. They were more variable than the non-phenotypically plastic measurements in G. gorilla, P. troglodytes, P. pygmaeus and C. guereza, but none of the differences were statistically significant. In the case of H. sapiens the non-phenotypically plastic measurements were more variable than the low-to-moderate strain measurements, although the difference was not significant. Thus, the results of the coefficient of variation analysis were only partially consistent with the prediction that the high-strain measurements should exhibit greater phenotypic plasticity than the low-to-moderate strain measurements, and that the latter should exhibit greater phenotypic plasticity than the non-phenotypically plastic measurements.
The first four discriminant functions were used in the discriminant function analyses. Plots of the first two discriminant functions from each analysis are presented in Figs 1-4. The high-strain measurements correctly classified 97.6% of specimens, the low-to-moderate strain measurements 100% and the non-phenotypically plastic measurements 98.6%. When all the measurements were analysed together, 100% of the specimens were correctly classified. Thus, the results of the discriminant function analysis did not support Wood and Lieberman's 2 hypothesis. The high-strain measurements were less effective at assigning specimens to species than the other two groups of measurements, as predicted. However, contrary to expectation, the low-to-moderate strain measurements were more effective at assigning specimens to species than the non-phenotypically plastic measurements. Also contrary to expectation, more specimens were correctly assigned to species when all the measurements were used than when the non-phenotypically plastic measurements were employed.
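The percentages reported above are classification rates: each specimen is assigned to a species by the discriminant model and the assignment is compared with its known species. The sketch below illustrates that idea with a deliberately simplified stand-in, not the stepwise canonical discriminant function analysis run in SPSS. It assumes only two measurements per specimen, equal prior probabilities and a pooled within-group covariance matrix, and it assigns each specimen to the group with the nearest centroid in Mahalanobis distance. All names and the toy data are hypothetical.

```python
from statistics import mean


def pooled_covariance_2d(samples_by_group):
    """Pooled within-group covariance for two variables: (sxx, sxy, syy)."""
    sxx = sxy = syy = 0.0
    n_total = 0
    for points in samples_by_group.values():
        mx = mean(p[0] for p in points)
        my = mean(p[1] for p in points)
        for x, y in points:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
        n_total += len(points)
    df = n_total - len(samples_by_group)
    return sxx / df, sxy / df, syy / df


def classify(point, centroids, cov):
    """Assign a point to the group with the smallest Mahalanobis distance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    best_group, best_d2 = None, float("inf")
    for group, (cx, cy) in centroids.items():
        dx, dy = point[0] - cx, point[1] - cy
        # Quadratic form d' S^-1 d using the closed-form 2x2 inverse.
        d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        if d2 < best_d2:
            best_group, best_d2 = group, d2
    return best_group


def percent_correct(samples_by_group):
    """Resubstitution rate: classify every specimen used to fit the model."""
    cov = pooled_covariance_2d(samples_by_group)
    centroids = {g: (mean(p[0] for p in pts), mean(p[1] for p in pts))
                 for g, pts in samples_by_group.items()}
    total = sum(len(pts) for pts in samples_by_group.values())
    hits = sum(classify(p, centroids, cov) == g
               for g, pts in samples_by_group.items() for p in pts)
    return 100.0 * hits / total


# Hypothetical, well-separated toy data: two measurements (mm) per specimen.
toy = {
    "taxon A": [(11.0, 20.0), (9.0, 20.0), (10.0, 21.0), (10.0, 19.0)],
    "taxon B": [(16.0, 30.0), (14.0, 30.0), (15.0, 31.0), (15.0, 29.0)],
    "taxon C": [(21.0, 22.0), (19.0, 22.0), (20.0, 23.0), (20.0, 21.0)],
}
rate = percent_correct(toy)  # 100.0 for this well-separated toy data
```

Because the same specimens are used to fit and to evaluate the rule, a rate computed this way is a resubstitution rate; the source does not state whether cross-validated rates were also obtained.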

Discussion and conclusions
The results of the coefficient of variation analysis reported here were comparable to the results of the coefficient of variation analysis carried out by Collard and Lycett. 10 As discussed earlier, the latter only partially supported Wood and Lieberman's 2 suggestion that strain increases variability. The predicted differences between the high-strain measurements and the low-to-moderate strain measurements were found to be both present and statistically significant in all seven species, as were the predicted differences between the high-strain measurements and the non-phenotypically plastic measurements. However, the coefficient of variation values for the low-to-moderate strain measurements and the non-phenotypically plastic measurements were statistically indistinguishable. The results of the coefficient of variation analysis reported here also only partially supported Wood and Lieberman's 2 hypothesis. The predicted differences between the high-strain measurements on the one hand, and the low-to-moderate strain and non-phenotypically plastic measurements on the other, were found to be both present and statistically significant in only four of the five species. In the fifth species, G. gorilla, the predicted differences were present but did not reach statistical significance. In addition, the coefficient of variation values for the low-to-moderate strain measurements and the non-phenotypically plastic measurements were statistically indistinguishable.
The results of the discriminant function analysis reported here were also comparable to the discriminant function analysis results obtained by Collard and Lycett. 10 To reiterate, the latter did not support Wood and Lieberman's 2 hypothesis because the low-to-moderate strain measurements were more effective than the non-phenotypically plastic measurements and the best discrimination was achieved when phenotypic plasticity was ignored and all measurements were employed. The discriminant function analysis reported here did not support Wood and Lieberman's 2 hypothesis for the same reasons. The high-strain measurements were less effective at assigning specimens to species than either the low-to-moderate strain or non-phenotypically plastic measurements, but the low-to-moderate strain measurements were more effective at allocating specimens to species than the non-phenotypically plastic measurements. In addition, better discrimination was achieved when all the measurements were employed than when the non-phenotypically plastic measurements were utilised.
Taken together, the results of this study and those obtained by Collard and Lycett 10 provide partial support for Wood and Lieberman's 2 suggestion that mastication-related strain is capable of significantly increasing intra-specific variability in the bones of the hominin skull. However, they do not support Wood and Lieberman's 2 other suggestion, namely that strain-related phenotypic plasticity negatively impacts attempts to delineate species in the hominin fossil record. The studies indicate that discriminating among features of the cranium, mandible and dentition on the basis of their likelihood of exhibiting phenotypic plasticity not only does not automatically lead to significantly more reliable taxonomic hypotheses, but may in fact lead to less reliable taxonomic hypotheses. In our view, this finding argues against palaeoanthropologists adopting the course of action proposed by Wood and Lieberman. 2 With regard to alternative strategies for improving understanding of the species-level diversity of the fossil hominins, it is noteworthy that in both this study and Collard and Lycett's 10 study complete discrimination was achieved in the discriminant function analysis when all measurements were included. This suggests that simply maximising the number of characters may be an effective strategy to employ when attempting to delineate species in the hominin fossil record. Evaluating this possibility would seem to be an obvious next step. Given the fragmentary state of most hominin fossil specimens, key issues to address are how many characters it is necessary to use in order to be confident that specimens are correctly allocated to species, and whether that number is dependent on the skeletal element employed.

Fig. 1. Plot of the first two discriminant functions from discriminant function analysis of high-strain measurements; 97.6% of specimens were correctly classified to species. 2.4% of Gorilla gorilla and 7.3% of Pan troglodytes were misclassified as Pongo pygmaeus.

Fig. 2. Plot of the first two discriminant functions from discriminant function analysis of low-to-moderate strain measurements; 100% of specimens were correctly classified to species.

Fig. 3. Plot of the first two discriminant functions from discriminant function analysis of non-phenotypically plastic measurements; 98.6% of specimens were correctly classified to species. 2.4% of Gorilla gorilla and 2.4% of Pan troglodytes were misclassified as Pongo pygmaeus.

Fig. 4. Plot of the first two discriminant functions from discriminant function analysis of all measurements; 100% of specimens were correctly classified to species.

Table 1. Measurements employed in this study.

Table 2. Mean coefficient of variation values and results of one-way ANOVA.