Psychoneuroendocrinology 24 (1999) 813 – 822 www.elsevier.com/locate/psyneuen
Testosterone levels and spatial ability in men
Irwin Silverman *, Don Kastuk, Jean Choi, Krista Phillips
Department of Psychology, York University, 4700 Keele Street, Toronto, Ont. M3J 1P3, Canada
Received 13 October 1998; accepted 8 May 1999
Abstract Testosterone (T) levels were measured by salivary assays in 59 males at times of the day when T was expected to be highest and lowest. Relationships were evaluated for mean hormone levels across the two sessions and hormone level changes between sessions with performance on three-dimensional mental rotations, a spatial test which customarily favours males. An anagrams task and the digit symbol test were used as controls. Mental rotations scores showed a significant positive relationship with mean T levels but not with changes in T. There were no significant relationships between control test scores and mean T levels. Findings are discussed in terms of their contributions to the resolution of ambiguities in prior reported data. © 1999 Elsevier Science Ltd. All rights reserved. Keywords: Testosterone; Spatial ability; Mental rotations; Sex differences
1. Introduction Sex differences in spatial test performance have commonly favoured males. Earlier attempts to account for this phenomenon focused mainly on socialization practices (Maccoby and Jacklin, 1974), but the generality of the differences across populations and situations, as well as its observation in infra-human species led to the inclusion of biological and, particularly, hormonal factors (see Linn and Petersen, 1985; Gaulin and Hoffman, 1988; Reinisch et al., 1991; Williams and Meck, 1991; Hampson and Kimura, 1992; Silverman and Phillips, 1993, for reviews). This investigation was supported in part by a grant to the first author by the Social Sciences and Humanities Research Council of Canada. * Corresponding author. Fax: + 1-416-736-5814. E-mail address:
[email protected] (I. Silverman)
0306-4530/99/$ - see front matter © 1999 Elsevier Science Ltd. All rights reserved. PII: S0306-4530(99)00031-1
Studies with females have generally found decreases in spatial performance with increased estrogen (E) levels. Investigations of the relationship of testosterone (T) with spatial abilities in males, however, have not demonstrated, unequivocally, the counterpart positive relationship. A review of studies using direct assessments of T levels by assays shows positive relationships (e.g. Christiansen, 1993; Janowsky et al., 1994), negative relationships (e.g. Gouchie and Kimura, 1991; Kimura and Hampson, 1994) and null relationships (e.g. McKeever et al., 1987; McKeever and Deyo, 1990).
The pattern of these discrepancies may suggest a curvilinear relationship, which is consistent with an early theory by Petersen (1976), based on the observation that more physically androgynous individuals of both sexes tend to do best on spatial tests. Petersen postulated an inverted U-shaped curvilinear function between spatial ability and gonadal hormones, with the latter conceptualized on a continuum ranging from extreme feminine to extreme masculine with androgyny at the peak. The premise of the theory is that, at optimal T levels, males' spatial abilities occupy the area of the peak. Consequently, both decreases and increases in T from optimal levels have the effect of moving these abilities away from the peak, albeit in opposite directions, resulting in lowered spatial performances in both cases.
Nyborg (1983) presented an alternative theory of hormonal effects on spatial performance, based on the conversion of some plasma testosterone to brain estrogen. Nyborg proposed an inverted U-shaped relationship between brain estrogen and spatial ability, with males tending more than females to occupy the peak of the curve. This theory can also account for the proposed inverted U-shaped relationship between T levels and spatial performance in males, based on the same logic as Petersen (1976).
Both decreases and increases in T may be presumed to move the individual from the peak of the curve defining optimal brain estrogen level in either direction, reducing spatial performance in both instances. It should be noted, however, that Nyborg’s theory assumes that T effects operate solely or mostly through aromatization, disregarding the presence of androgen receptors throughout the cortex and in regions without estrogen receptors.
The major weakness of the curvilinearity hypothesis, however, is that though independent reports of positive, negative or null relationships between T and spatial performance continue to appear, no study has actually obtained the proposed inverted U-function in males. Moffat and Hampson (1996) found an inverted U-shaped curvilinear relationship across sexes between T and spatial performance, but when male scores in this distribution were observed by themselves, they showed a distinct negative linear function.
It is possible that the scattering in the literature of positive and negative correlations between T and spatial ability represents Type I statistical errors and that no actual relationship exists. It is also feasible that the discrepancies are a function of diverse methodologies and measurement techniques. An examination of the extant literature by the present authors, however, failed to reveal systematic differences in subject samples or in measures of T or spatial abilities which related in any way to differences in results.
The present study comprised an additional attempt to establish the relationship of T to spatial performance in males with a research design in which both inter-individual and intra-individual differences in T levels were used in the analyses, and both the reliability and validity of T levels were assessed.
2. Method
2.1. Subjects Subjects were 64 male and 20 female students at York University in Toronto, Canada. Mean ages were 22.42 (SD = 3.02) and 23.61 (SD = 3.94), respectively, a difference which did not achieve significance at α = .05. Both males and females were paid volunteers solicited in the same manner, through announcements posted in the Behavioural Sciences Building on campus. Three male subjects were graduate students and the balance were undergraduates, mostly from introductory psychology courses. For the female sample, one subject was a graduate student and the rest undergraduates, most of whom also came from introductory psychology courses.
2.2. Spatial performance This was assessed by the Vandenberg and Kuse (1978) adaptation of the Shepard and Metzler (1971) three-dimensional mental rotations test. This test was considered to be the most appropriate for the present study for a variety of reasons. Vandenberg and Kuse maintain that three-dimensional mental rotations tasks are superior to other spatial tests inasmuch as they are the least accessible to solution by verbal means. Consistent with this assertion, Linn and Petersen (1985), using a meta-analysis technique based on the nature of the cognitive processes involved in solving individual spatial tests, concluded that mental rotations abilities occupy their own separate factor. These authors also noted that ‘large sex differences are found only for measures of mental rotation’ (p. 1479), with the Vandenberg and Kuse test showing the largest differences of all. Similarly, Silverman and Phillips (1993) and Phillips and Silverman (1997) indicated that the Vandenberg and Kuse test tended to show the largest and most reliable relationships with estrogen levels in female subjects.
Each of the 24 items of the Vandenberg and Kuse (1978) test consists of a target drawing and four test drawings, with subjects required to designate which two of the four test drawings depict the target drawing in rotated positions. For the present scoring system, a point was added for each correct answer and subtracted for each incorrect answer, with subjects apprised of this procedure beforehand. Two equivalent forms of 12 items were prepared for the two testing sessions; thus, the theoretical range of scores for each test was −24 to +24. Four minutes were allowed for the test.
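The scoring rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `score_form`, the representation of responses as sets of drawing indices, and the answer key are hypothetical, and the treatment of skipped items (no points either way) is an assumption that the text does not spell out. With two correct drawings per item and 12 items, the sketch reproduces the stated theoretical range of −24 to +24.

```python
def score_form(responses, answer_key):
    """Score one 12-item form: +1 per correct choice, -1 per incorrect choice.

    responses:  list of sets, the drawings (0-3) a subject marked on each item
    answer_key: list of sets, the two correct drawings for each item
    """
    total = 0
    for chosen, correct in zip(responses, answer_key):
        total += len(chosen & correct)   # each correct choice adds a point
        total -= len(chosen - correct)   # each incorrect choice removes a point
    return total


# Hypothetical key: drawings 0 and 1 are the correct pair on every item.
key = [{0, 1}] * 12

all_correct = [{0, 1}] * 12
all_wrong = [{2, 3}] * 12
mixed = [{0, 2}] * 12      # one right and one wrong choice per item
skipped = [set()] * 12     # assumed: unanswered items neither add nor subtract

print(score_form(all_correct, key))  # 24
print(score_form(all_wrong, key))    # -24
print(score_form(mixed, key))        # 0
print(score_form(skipped, key))      # 0
```

Subtracting a point for each wrong choice means that random marking has an expected score of zero, which is presumably why subjects were told about the rule beforehand.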
2.3. Control tests These comprised the digit symbol test, which measures visual motor coordination, and an anagrams test of verbal ability. Control tests were selected which were cognitive but not spatial in nature and which did not show a male advantage in prior research. The purpose of this was to ensure that hormonal effects, if obtained, were specific to the sex difference in spatial performance. For both control tests, previous studies have shown either no sex differences or a bias in favour of females (Maccoby and Jacklin, 1974). For the digit symbol test, subjects are required to follow a key at the top of the form containing the numbers 1 through 9, each associated with a different symbol. The task is to copy the correct symbols, as quickly as possible, under a series of single digit numbers. The standard form for this test was also divided into two for the two testing sessions, each containing 93 items. A total of 90 s were allowed and subjects were scored a point for each correct response. Anagrams requires subjects to unscramble a series of letters to form a word. The present tests consisted of 18, five-letter anagrams for each of the two forms, with no items repeated between forms. Four minutes were allowed and a point was scored for each valid answer.
2.4. Hormonal measures Male subjects were informed when recruited that hormone levels would be assayed from salivary samples as part of the study, and were instructed to abstain from smoking or imbibing in caffeine products such as coffee, tea, cola or chocolate 1 h prior to testing. They also answered a brief health questionnaire designed to ascertain whether drugs or medication of any kind were being used which could affect results. Female subjects were not tested for hormone levels, but were included to ensure that the anticipated sex differences occurred only in the spatial test, and not in the non-spatial control tests. Two 15 ml vials of salivary samples were collected from each subject throughout the session: at the beginning, during rest periods between tests and at the close, if necessary. Samples were kept frozen until analyzed at the Salivary Radioimmunoassay Laboratory of the University of Western Ontario. Estrogen (E) levels were also assessed from these samples and subjected to the same statistical analyses as T levels as a point of possible incidental interest. Following normal laboratory procedures, T and E assays were conducted in duplicate. For each subject, obtained values from saliva collected in the first and second vials were averaged in order to provide reliable estimates. Hormonal levels are reported in pg/ml. During the interval between transport of the first and second batch of salivary samples, however, the laboratory had changed to a different antiserum for E assays, resulting in slightly higher absolute values. This was corrected by converting E values into standardized (z) scores, separately for each batch.
The inter-assay coefficient of variation across the first and second batches, calculated from low, medium and high pools, ranged from 9 to 18% for T and from 30 to 56% for E.
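Two of the calculations mentioned in this section, standardizing values within each batch and expressing assay variability as a coefficient of variation, can be illustrated as follows. This is a minimal sketch, not the laboratory's procedure: the function names and all data values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def zscores_by_batch(values, batches):
    """Standardize each value against the mean and SD of its own batch."""
    groups = {}
    for value, batch in zip(values, batches):
        groups.setdefault(batch, []).append(value)
    # Sample SD (n - 1 denominator) is an assumption; the text does not say.
    stats = {b: (mean(vs), stdev(vs)) for b, vs in groups.items()}
    return [(v - stats[b][0]) / stats[b][1] for v, b in zip(values, batches)]


def coefficient_of_variation(values):
    """Coefficient of variation as a percentage: 100 * SD / mean."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical E values (pg/ml); batch 2 reads higher in absolute terms.
e_values = [2.0, 3.0, 4.0, 4.0, 5.0, 6.0]
batch_id = [1, 1, 1, 2, 2, 2]
z = zscores_by_batch(e_values, batch_id)
print([round(v, 2) for v in z])  # [-1.0, 0.0, 1.0, -1.0, 0.0, 1.0]

# Hypothetical repeated measurements of one control pool across assay runs.
pool = [9.0, 10.0, 11.0]
print(round(coefficient_of_variation(pool), 1))  # 10.0
```

After standardization, the two batches share a mean of 0 and an SD of 1, so the shift in absolute values introduced by the new antiserum no longer separates them. The cost is that any genuine difference in mean E level between the batches is removed as well.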
2.5. Procedure Sessions were conducted in groups of two to four subjects, with each subject occupying a separate cubicle. All male subjects were tested by the same male researcher and all female subjects by the same female researcher. Two test batteries were constructed, using the dual forms for each test, with the sequence of tests being mental rotations, digit symbol and anagrams for both batteries. For each sex, the order of sessions for half the subjects was 0800 h followed by 2000 h on the same day and, for the other half, 2000 h followed by 0800 h of the next day. Test batteries were also counterbalanced by order of sessions; thus possible practice effects and test battery differences were controlled. Time limits were imposed on all tests which allowed few subjects to complete all items; thereby, ceiling effects were not considered to be a factor.
2.6. Statistical analyses Subjects' presumed normal hormonal levels were represented by mean scores across the 0800 h and 2000 h sessions, and intra-individual changes in hormonal levels were measured in terms of difference (D) scores between sessions. The criterion for establishing test-retest reliability for T and E resided in two-tailed Pearson correlation coefficients (rs) between sessions. Sex differences for cognitive tests were analyzed by Multivariate Analysis of Variance (MANOVA), followed by Univariate Analyses of Variance (ANOVAs). Differences between 0800 h and 2000 h sessions, for hormone levels and cognitive tests, were analyzed by repeated measures ANOVAs. Validity of T measures was determined by whether the expected significant decrease between 0800 h and 2000 h samples was obtained. The relationships between hormonal levels and cognitive tests were analyzed by two-tailed Pearson rs, for both mean scores across sessions and D scores between sessions.
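The derived scores and correlations described in this section can be sketched as below. All data values are invented for illustration and are not the study's data. The direction of the D score (0800 h minus 2000 h) and the pairing of T difference scores with difference scores on the cognitive test are assumptions, since the text does not state either explicitly.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical salivary T (pg/ml) and mental rotations scores for five subjects.
t_0800 = [90.0, 70.0, 110.0, 60.0, 85.0]
t_2000 = [60.0, 50.0, 70.0, 45.0, 55.0]
mr_0800 = [14, 9, 18, 7, 12]
mr_2000 = [13, 10, 16, 8, 11]

# Mean across sessions stands in for the subject's presumed normal level.
mean_t = [(a + b) / 2 for a, b in zip(t_0800, t_2000)]
mean_mr = [(a + b) / 2 for a, b in zip(mr_0800, mr_2000)]

# Difference (D) scores; the 0800 h minus 2000 h direction is an assumption.
d_t = [a - b for a, b in zip(t_0800, t_2000)]
d_mr = [a - b for a, b in zip(mr_0800, mr_2000)]

print("test-retest r for T:", round(pearson_r(t_0800, t_2000), 2))
print("r, mean T with mean rotations:", round(pearson_r(mean_t, mean_mr), 2))
print("r, D(T) with D(rotations):", round(pearson_r(d_t, d_mr), 2))
```

The mean score and the D score capture different things: the mean reflects stable differences between subjects, while the D score reflects change within a subject across the diurnal cycle. A relationship can therefore appear in one and not the other.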
3. Results Cognitive or hormonal tests for five male subjects were invalid for various reasons; hence the resultant male sample size was 59.
3.1. Sex differences Table 1 shows mean scores across sessions for the three cognitive tests, separately by sex. The MANOVA was significant (F[3,75] = 4.15, P < .01), and the subsequent univariate F values are shown in the table.
Table 1
Mean scores (SD) on spatial and control tests in males and females

| Test | Males | Females | F |
|---|---|---|---|
| Mental rotations | 11.86 (5.98) | 8.35 (4.62) | 5.68* |
| Anagrams | 6.64 (2.99) | 7.30 (2.68) | 0.77 |
| Digit symbol | 66.78 (8.88) | 69.28 (10.15) | 1.11 |

* P < .05.
As expected, males' mental rotations scores were significantly higher than females'. There were no significant sex differences for the two control tests, with trends of the differences in the opposite direction from that of the spatial test.
3.2. Validity and reliability measures Table 2 shows differences and correlations between 0800 and 2000 h scores for both hormonal assays and the three cognitive tests for male subjects. T levels showed the anticipated significant decrease between 0800 and 2000 h sessions and a significant positive relationship between sessions. E levels, however, showed an unanticipated zero-order correlation between sessions. Given this failure to establish test – retest reliability and the fact that coefficients of variations, noted above, exceeded the acceptable range, it was considered that E measures were compromised in some manner. Hence, E levels were not considered further in the results. Spatial and control tests all showed significant, positive correlations between sessions. The sole significant difference between sessions among the cognitive tests was digit symbol, showing an increase between 0800 and 2000 h.
Table 2
Mean differences and correlations as a function of time of testing in males

| Variable | 0800 h | 2000 h | F | r |
|---|---|---|---|---|
| T level | 84.20 (29.72) | 55.48 (18.95) | 75.78** | 0.53** |
| E level | 0.23 (0.94) | −0.23 (1.00) | 7.02* | 0.04 |
| Mental rotations | 12.17 (6.32) | 11.56 (7.16) | 0.58 | 0.59** |
| Anagrams | 6.27 (3.23) | 7.00 (3.62) | 2.84 | 0.53** |
| Digit symbol | 64.97 (9.96) | 68.59 (10.33) | 7.59** | 0.50** |

* P < .05. ** P < .01.

3.3. Correlations between T levels and cognitive test scores Table 3 presents the correlation coefficients between T levels and cognitive test performances, both in terms of mean scores across sessions and D scores between sessions. For mean scores across sessions, T related positively and significantly with mental rotations and did not show significant relationships of any kind with either of the control tests. For difference scores between sessions, there were no significant correlations for T with spatial or control tests. Separate regression analyses for both mean T levels and D scores with mental rotations failed to show negative quadratic (inverted U-shaped) functions in either case.

Table 3
Pearson correlation coefficients (r) between mean (M) and difference (D) testosterone measures and cognitive test scores

| T level | Mental rotations | Anagrams | Digit symbol |
|---|---|---|---|
| M | .28* | −.04 | .19 |
| D | −.08 | −.10 | .04 |

* P < .05.
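As a rough check on which of the reported mean-level correlations reach two-tailed significance, a Pearson r can be converted to a t statistic with n − 2 degrees of freedom. The sketch below assumes that all 59 valid male subjects entered each correlation, which the text implies but does not state. The critical value is an approximate figure from standard t tables, and the function name is hypothetical.

```python
from math import sqrt


def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * sqrt(n - 2) / sqrt(1.0 - r * r)


N = 59              # valid male subjects; assumed to apply to every correlation
T_CRIT_05 = 2.002   # approximate two-tailed .05 critical t for 57 df (standard tables)

# Mean-level (M) correlations as reported in Table 3.
for label, r in [("mental rotations", 0.28), ("anagrams", -0.04), ("digit symbol", 0.19)]:
    t = t_from_r(r, N)
    print(label, round(t, 2), abs(t) > T_CRIT_05)
```

For r = .28 this gives a t of roughly 2.20, just above the critical value, while r = .19 for digit symbol falls short. That agrees with the single asterisk in Table 3.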
4. Discussion There was a significant and marked difference between morning and evening T levels in the anticipated direction, supporting the validity of these measures. T measures also appeared to be reliable; inter-assay coefficients of variation were in the acceptable range and the Pearson r between the 0800 and 2000 h sessions was positive and significant. It should be noted that to the extent that T level changes based on the diurnal cycle were variable between subjects, the correlation coefficient between sessions would have been reduced. Hence, though the obtained reliability was reasonably robust, it probably would have been even stronger if subjects had been tested twice at the same time of day.
From the standpoint of the goals of this study, the most critical finding was the significant positive relationship of T with mental rotations coupled with the absence of any such relationship for control tests. These results were not replicated for D scores, but this does not necessarily compromise the findings for mean scores. Change scores based on hormonal cycles or other transient events have been used to represent activational effects (e.g. Janowsky et al., 1994; Moffat and Hampson, 1996), and these may or may not accompany effects based on normal circulating hormone levels, depending on the trait in question (Beatty, 1984). It is noteworthy, however, that Mackenberg et al. (1974) did find a decrease in spatial test scores in males between 0930 and 1530 h, using a within-subjects design as in the present study. They did not, however, take direct measures of T and hence could not use intra-individual correlations between T and spatial scores to assess the effects of hormone level changes. Thus, time-of-day effects may have been a confounding factor in the Mackenberg et al. results.
It should also be noted that the reliability index for mental rotations in the present data was somewhat lower than reported in Qubeck (1997), although different versions of the mental rotations test and different scoring methods were employed between studies. Further, Qubeck used an odd–even, split-half method whereas the present study used a more conservative test–retest method with alternative forms.
The lack of consensus among published studies of the relationship between T and spatial ability in males remains, but one recent report (Moffat and Hampson, 1996), when considered with the present study, may provide a novel explanation. As in the present study, Moffat and Hampson took salivary samples from paid volunteer male university students at different times in the diurnal cycle for T, although they employed a between-subjects rather than within-subjects design. They also used the Vandenberg and Kuse (1978) mental rotations test, albeit an earlier version, and the services of the University of Western Ontario Salivary Radioimmunoassay Laboratory. The mean T score for their 0815 h group was 85.79 pg/ml (SD = 22.76), which was roughly equivalent to the 84.20 pg/ml (SD = 29.72) for our 0800 h group. They used late morning rather than evening samples to represent T decreases, eliminating the possibility of further direct comparisons of T distributions; however, their mid-morning scores did fall, as would be expected, between our 0800 and 2000 h scores. Despite these similarities between the studies, Moffat and Hampson (1996) found a significant inverse relationship between overall T levels and mental rotations scores (N = 40, r = −.44). This discrepancy, in view of the similar T levels between studies, belies the notion of an inverted U-shaped function, which presumes that studies reporting positive relationships are based on lower T values than those reporting negative relationships.
Scrutinizing both studies, it appears that the sole salient difference resided in mental rotations performance, with Moffat and Hampson’s (1996) University of Western Ontario (UWO) students showing seemingly higher scores for both males and females than our York University students. Groups could not be compared directly inasmuch as different versions of the Vandenberg and Kuse test were used; however, two prior studies at York (Silverman and Phillips, 1993) which employed the same version of the mental rotations test as did Moffat and Hampson, also showed lower scores for both sexes. Mean mental rotations scores for the two York samples were 15.57 (SD=9.42) and 14.50 (SD= 7.56) for males, and 8.17 (SD= 8.38) and 5.58 (SD=7.38) for females, respectively, compared to UWO scores of 23.13 (SD =10.26) for males and 15.65 (SD = 8.33) for females. Again, significance tests could not be applied because Moffat and Hampson allowed 8 min for the test while Silverman and Phillips allowed seven. Despite this time difference, the divergence between scores appears striking. This may suggest that, for some reason, T enhances performance when the task is difficult for the subject but has the converse effect when the task is simple. For example, T may serve to keep the subject focused when confronting a difficult test but interfere with concentration for simpler tasks. These may be speculative notions, but given the lack of explicit evidence for the curvilinearity hypothesis,
they do provide a viable alternative explanation of long-standing ambiguities in the relationship of T and spatial ability in males. They also may explain the positive relationship of T levels and spatial performance in females, in that female spatial scores tend to be lower than males and females report a higher amount of stress experienced in response to these tasks than do males (Lawton, 1994).
References
Beatty, W.W., 1984. Hormonal organization of sex differences in play fighting and spatial behavior. Prog. Brain Res. 61, 315–330.
Christiansen, K., 1993. Sex hormone-related variations of cognitive performance in Kung San hunter-gatherers of Namibia. Bio. Psychol. 27, 97–107.
Gaulin, S.J.C., Hoffman, H.A., 1988. Evolution and development of sex differences in spatial ability. In: Betzig, L., Mulder, M.B., Turke, P. (Eds.), Human Reproductive Behavior: A Darwinian Perspective. Cambridge University Press, Cambridge, pp. 129–152.
Gouchie, C., Kimura, D., 1991. The relationship between testosterone levels and cognitive ability patterns. Psychoneuroendocrinology 16, 323–324.
Hampson, E., Kimura, D., 1992. Sex differences and hormonal influences on cognitive function in humans. In: Becker, J.B., Breedlove, S.M., Crews, D. (Eds.), Behavioral Endocrinology. MIT, Cambridge, pp. 357–398.
Janowsky, J.S., Oviatt, S.K., Orwoll, E.S., 1994. Testosterone influences spatial cognition in older men. Behav. Neurosci. 108, 325–332.
Kimura, D., Hampson, E., 1994. Cognitive pattern in men and women is influenced by fluctuations in sex hormones. Curr. Dir. Psychol. Sci. 3, 57–61.
Lawton, C.A., 1994. Gender differences in way-finding strategies: relationship to spatial ability and spatial anxiety. Sex Roles 30, 765–779.
Linn, M.C., Petersen, A.C., 1985. Emergence and characterization of sex differences in spatial ability: a meta-analysis. Child Dev. 56, 1479–1498.
Maccoby, E.E., Jacklin, C.N., 1974. Psychology of Sex Differences. Stanford University Press, Stanford.
Mackenberg, E.J., Broverman, D.M., Vogel, W., Klaiber, E.L., 1974. Morning to afternoon changes in cognitive performances and in the electroencephalogram. J. Educ. Psychol. 66, 238–246.
McKeever, W.F., Deyo, R.A., 1990. Testosterone, dihydrotestosterone, and spatial task performance of males. Bull. Psychonom. Soc. 28, 305–308.
McKeever, W.F., Rich, D.A., Deyo, R.A., Conner, R.L., 1987. Androgens and spatial ability: failure to find a relationship between testosterone and ability measures. Bull. Psychonom. Soc. 25, 438–440.
Moffat, S., Hampson, E., 1996. A curvilinear relationship between testosterone and spatial cognition in humans: possible influence of hand preference. Psychoneuroendocrinology 21, 323–337.
Nyborg, H., 1983. Spatial ability in men and women: review and new theory. Adv. Behav. Res. Ther. 5, 89–140.
Qubeck, W.J., 1997. Mean differences among subcomponents of Vandenberg’s mental rotation test. Percept. Mot. Skills 85, 323–332.
Reinisch, J.M., Ziemba-Davis, M., Saunders, S.A., 1991. Hormonal contributions to sexually dimorphic behavioral development in humans. Psychoneuroendocrinology 16, 213–278.
Petersen, A.C., 1976. Physical androgyny and cognitive functioning in adolescence. Dev. Psychol. 12, 524–533.
Phillips, K., Silverman, I., 1997. Differences in the relationship of menstrual cycle phase to spatial performance on two- and three-dimensional tasks. Horm. Behav. 32, 167–175.
Shepard, R.N., Metzler, J., 1971. Mental rotation of three dimensional objects. Science 171, 701–703.
Silverman, I., Phillips, K., 1993. Effects of estrogen changes during the menstrual cycle on spatial performance. Ethol. Sociobiol. 14, 250–270.
Vandenberg, S.G., Kuse, A.R., 1978. Mental rotations: a group test of three-dimensional spatial visualization. Percept. Mot. Skills 47, 599–604.
Williams, C.L., Meck, W.H., 1991. The organizational effects of gonadal steroids on sexually dimorphic spatial ability. Psychoneuroendocrinology 16, 155–176.