Psychoneuroendocrinology, Vol. 18, No. 7, pp. 535-539, 1993
0306-4530/93 $6.00 + .00 © 1993 Pergamon Press Ltd.
Printed in the U.S.A.
Reply to Letter by Graham

One might get the impression from Graham's reply to my review (Wilson, 1992) of menstrual synchrony research that I say menstrual synchrony had not been demonstrated and that this conclusion is based on four studies failing to find statistically significant levels of menstrual synchrony. This impression would be incorrect. It is clear in the abstract and the text that my review is limited to three studies and two experiments based on the research design and methods of McClintock (1971). Also, I did not suggest by some errant logic that four studies failing to find synchrony are sufficient basis for concluding five studies reporting synchrony are in error. The four studies not reporting menstrual synchrony are introduced to indicate contradictory results were obtained by researchers using McClintock's methods, thus my reason for undertaking the reviews and examining seven of the more obvious variables that might explain the contradictory results among the studies and experiments. My conclusion "menstrual synchrony has not been demonstrated in any of the experiments or studies" is reached after an analysis of the studies and experiments.

Dr. Graham's comments are directed to the first three pages of my review; however, the remaining 22 pages contain the material I thought would generate constructive criticism and discussion. In the reviews, I use my and other investigators' research to establish parameters for the behavior of menstrual onset absolute differences over consecutive onsets for a sample of randomly matched pairs of subjects. This material is presented in the form of three "errors" that can occur in menstrual synchrony research. I use these three errors to review research based on McClintock's methods for testing synchrony. I point out where errors are indicated by my analyses and that significant levels of menstrual synchrony do not occur in the samples when the errors are corrected. I thought Dr.
Graham might take issue with some of the assumptions, reasoning, and calculations on which the three errors are based, or find fault with my analyses of the studies and experiments, especially since her study is one of those reviewed.

Dr. Graham does raise an issue that merits discussion. She points out there are six studies that do not depend on McClintock's method of selecting a sample of women who had no previous social contact and tracing their onset differences over time to see if they synchronize; rather, the samples are pairs or groups of women who had social contact before the beginning of the study period and should have already synchronized. All these studies report significant levels of menstrual synchrony. From among these six studies, Graham notes Matteo's (1987) study for using a different technique and Weller and Weller's (1992) study for using a novel method. An examination of these two studies seems appropriate, because both were published in this journal and are readily available to its readership.

Matteo's (1987) study tests the effect of job stress and job interdependency on menstrual cycle length, regularity, and menstrual synchrony. My interest is limited to the question of synchrony. Her sample is five groups (n = 10, 10, 8, 7, and 6) of women who interact in job-related environments. Group synchrony is determined by the mean absolute deviation (mean deviation, hereafter) of individual onset dates from the group's mean onset date. The mean onset date she uses for calculating the onset mean deviation differs from that used by McClintock (1971) and Graham and McGrew (1980); both of
LETTERS TO THE EDITOR
those studies use the onset of each subject of a group occurring in a given month. Matteo determines each group's mean onset date using the first or second onset dates for each subject in the group. Sets of these onset dates are constructed so the "total number of sets represented all possible [unique] pairings of combinations of the first and second onset dates for each woman [in the group]" (p. 469). She gives as an example a group of four subjects and says the total number of sets is 12 based on the formula $(2^n - n)$, where $n$ is the number of subjects in the group. Actually, the number of possible sets is 16, or $2^n$, since each subject's onset date can take on two values, so 2 × 2 × 2 × 2 = 16. She next calculates the onset mean deviation for each of the sets, and "[t]he set that minimized the spread of onset dates around the group mean was selected as containing the true onset dates to be used for that group" (p. 470). Two of her groups have 10 subjects, so for these groups Matteo had 1,024 (or 1,014 using her calculation) onset mean deviations from which to choose the lowest score. Her method of searching for the least mean deviation is questionable, since "repeated random sampling experiments from a single population indicate that the successive values of mean deviation fluctuate too wildly ... for ... common ... usage" (Thomas, 1986: p. 75).

The important fact, however, is that Matteo does not conduct a test for synchrony in her groups, since the groups are not tested against a random sample of subjects nor any objective measure that could determine a significant level of synchrony. She tests for synchrony using a one-way ANOVA for the mean deviation scores of the five groups, finds one group of eight subjects has a significantly greater mean deviation score, and concludes the other four groups are synchronized.
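The set counts discussed above can be checked directly. The following is a minimal Python sketch, not Matteo's actual procedure: the function names and the day-number onset dates are hypothetical and serve only as illustration. It enumerates every set formed by taking either the first or the second onset of each subject, and shows that choosing the set with the least mean deviation amounts to a search over $2^n$ candidates.

```python
from itertools import product

def possible_sets(n_subjects: int) -> int:
    """Number of sets when each subject contributes her first or second onset."""
    return 2 ** n_subjects

def formula_as_printed(n_subjects: int) -> int:
    """The formula quoted from Matteo (1987): 2^n - n."""
    return 2 ** n_subjects - n_subjects

def mean_deviation(dates):
    """Mean absolute deviation of onset dates from their own mean."""
    mean = sum(dates) / len(dates)
    return sum(abs(d - mean) for d in dates) / len(dates)

def least_deviation_set(onsets):
    """Search every first/second-onset combination for the smallest mean deviation."""
    return min(product(*onsets), key=mean_deviation)

# Hypothetical (first, second) onset day numbers for four subjects; not real data.
onsets = [(2, 31), (9, 40), (17, 46), (25, 56)]

all_sets = list(product(*onsets))
best = least_deviation_set(onsets)

print(possible_sets(4), formula_as_printed(4))    # 16 12
print(possible_sets(10), formula_as_printed(10))  # 1024 1014
print(len(all_sets), best, round(mean_deviation(best), 2))
```

Because the minimum is taken over all 16 candidate sets, the selected set can never have a larger mean deviation than the set of first onsets alone, which is why a low value obtained this way is not by itself evidence of synchrony.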
Her claim for an objective measure of synchrony is to note the onset mean deviations for four of the groups are "well below the typically reported mean of 6.0 for synchrony" (p. 471), and she cites Graham and McGrew (1980) as an example. I know of no typically reported figure for group synchrony, and Graham and McGrew's study cannot be an example because they failed to find synchrony in the 15 living groups in their study, and their method for calculating the onset mean deviation is different from Matteo's method.

Matteo could have tested for synchrony in her five work groups, however, using the groups' mean onset absolute differences. First, the correct closest onset absolute difference (onset difference, hereafter) is determined for all possible pairs of subjects, $(n^2 - n)/2$, where $n$ is the number of subjects in a group. The subjects' onset dates for April and May can be used for these calculations by following the method of determining the closest onset difference described in my review article (pp. 571-573). The group mean onset difference is derived from the sum of all the pairs' onset differences. The number of possible pairs in each group, according to the order of group size given above, is 45, 45, 28, 21, and 15, for a total of 154 pairs. A test random sample of subjects drawn from a population like that from which the five job-related groups were selected can be constructed using the 41 subjects in the five groups. A group mean onset difference for the random sample can be calculated from the onset differences of all possible pairs (820) using the same method as used for the five groups. However, this sample is biased because it includes the intra-group pairs of the five group samples. These pairs are removed from the sample (820 - 154 = 666), leaving a sample of 666 pairs of subjects who are randomly matched with regard to their possibility of having synchronous onset dates.
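The pair counts in the preceding paragraph can be reproduced with a short Python sketch. The subject labels and the helper `mean_onset_difference` are hypothetical; only the group sizes come from the text. The sketch counts pairs within each group with $(n^2 - n)/2$ and then enumerates the cross-group pairs that would form the randomly matched sample.

```python
from itertools import combinations
from math import comb

group_sizes = [10, 10, 8, 7, 6]  # sizes of Matteo's five work groups

# (n^2 - n) / 2 pairs within each group
pairs_per_group = [comb(n, 2) for n in group_sizes]
intra_group_pairs = sum(pairs_per_group)

total_subjects = sum(group_sizes)
all_pairs = comb(total_subjects, 2)

# Label each subject as (group index, member index); labels are hypothetical.
subjects = [(g, i) for g, n in enumerate(group_sizes) for i in range(n)]

# Randomly matched sample: every pair whose members belong to different groups.
cross_group_pairs = [
    (a, b) for a, b in combinations(subjects, 2) if a[0] != b[0]
]

def mean_onset_difference(differences):
    """Group mean onset difference: sum of pair differences over number of pairs."""
    return sum(differences) / len(differences)

print(pairs_per_group)         # [45, 45, 28, 21, 15]
print(intra_group_pairs)       # 154
print(total_subjects)          # 41
print(all_pairs)               # 820
print(len(cross_group_pairs))  # 666
```

With real data, each group's mean onset difference would be computed over its own intra-group pairs and compared against the mean over the 666 cross-group pairs.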
The group mean of each of the five job-related groups can be tested against the group mean of the randomly matched sample. Since understanding of a procedure seems to increase exponentially with its application, I compared Matteo's method for calculating the group onset mean deviation with
the method of calculating the group mean onset difference of all possible pairs. My sample is the four subjects (01, 09, 16, and 33) and their first and second onset dates in the random sample shown on Table V (p. 586) in my review article. The mean cycle length of the four subjects is 30 days, based on this one cycle. The expected range of onset differences is 0-15 days, the expected group mean onset difference is 7.5 days, and one pair (22%) is expected to have an incorrect onset difference if only the first onset dates of each subject are compared. The actual range of the six corrected closest differences is 5-14 days, the group mean onset difference is 6.7 days, and two pairs had to have their onset differences corrected. This small group sample has the characteristics of a random sample.

Using Matteo's method, I calculated the group mean deviation for all possible 16 sets of the eight onset dates. The range is from 5.0 to 24.0 days, and choosing the lowest value makes this group synchronous by "the typically reported mean of 6.0 days." The value of 5.0 days is based on the set of second onset dates of subjects 01, 09, and 33, and the first onset date of subject 16. I determined the onset difference for all possible pairs in this set of four onsets; the range of onset differences is 1-19 days and the mean onset difference is 9.7 days. Both the range and the mean values indicate this set of onset dates is not the closest onset difference between the subjects. I conclude Matteo does not demonstrate menstrual synchrony in her four groups, not on the basis of my example, which I used for purposes of illustration, but because her technique for determining mean onset deviations is flawed, and because she does not test for synchrony.

Weller and Weller's (1992) sample is 20 lesbian partners who had been living together for periods from 4 mo to 6 yr; 15 of the couples had been together 1 yr or longer.
Their sample eliminates many of the confounding influences found in samples of college women: None of the women used contraceptives of any kind, the mean age was over 25 yr, and none of the women (probably) were emotionally involved with a male. Weller and Weller's novel method is to test observed against expected frequencies of onset differences. They note onset differences have an expected range of 0-14 days for a 28-day cycle length. Onset differences greater than 14 days are considered to be incorrect and are corrected by subtracting the greater of the two onsets from day 29, i.e., the assumed date of the other subject's second onset based on a 28-day cycle. They determine the expected frequency of onset differences of 0 and 14 days to be 1/28 and to be 1/14 for onset differences of 1-13 days.

This could have been a definitive study of menstrual synchrony, but the basic data for a test were not obtained. The date of only one menstrual onset was collected and that date was retrospective. (Contrast this method of collecting onset dates with that used by Matteo.) Recall data, especially where an error of 2 or 3 days can make a significant difference in the results for a small sample, are not reliable. Bernard et al. (1984: p. 503) surveyed over 100 studies on the validity of retrospective data covering periods of recall ranging from 24 h to several months, and they state: "The results of these studies leads to one overwhelming conclusion: on average, about half of what informants report is probably incorrect in some way." Further, the closest onset date cannot be determined unless two consecutive onset dates are recorded for each subject. If only one onset for each subject is used to determine the onset difference, about 22% of the onset differences are incorrect. Thus, probably 4-5 onset differences in Weller and Weller's sample had to be calculated by assuming the length of the menstrual cycle of one of the subjects in each pair was 28 days.
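The expected frequencies quoted above (1/28 for differences of 0 and 14 days, 1/14 for 1-13 days) can be derived by counting. The Python sketch below is an illustration under a stated assumption, namely that one subject's onset is equally likely to fall on any day of the other subject's cycle; the function name `expected_frequencies` is hypothetical, and applying it to cycle lengths other than 28 days is an extrapolation, not something taken from Weller and Weller (1992).

```python
from collections import Counter
from fractions import Fraction

def expected_frequencies(cycle_length: int):
    """Expected probability of each closest onset difference, assuming one
    onset falls uniformly at random on any day of the other's cycle."""
    # An offset of k days is equivalent to a closest difference of
    # min(k, cycle_length - k) days.
    counts = Counter(min(k, cycle_length - k) for k in range(cycle_length))
    return {d: Fraction(c, cycle_length) for d, c in sorted(counts.items())}

freq28 = expected_frequencies(28)
print(freq28[0], freq28[14], freq28[1], freq28[13])  # 1/28 1/28 1/14 1/14
print(sum(freq28.values()))                           # 1

# Same counting for an assumed 31-day cycle: the largest closest difference is 15.
print(max(expected_frequencies(31)))                  # 15
```

Differences of 0 and 14 days each arise from a single offset, while every difference from 1 to 13 days arises from two offsets, which is why the latter are twice as likely.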
However, 19 of the 40 subjects had irregular cycles. An actual cycle length of 31 days, for example, produces a 3-day error
in the calculation of the closest onset difference, if the assumption is the cycle was 28 days. As I show in my review (pp. 580-581), correcting two incorrect onset differences in Graham and McGrew's (1980) sample of 18 pairs of close friends changes the outcome from a significant to a nonsignificant level of menstrual synchrony. I conclude Weller and Weller failed to collect the data necessary for a test of synchrony, although their novel testing method is a valuable addition to menstrual synchrony research.

Both the method of using the onset difference between all possible pairs described above and the method of comparing the frequencies of observed and expected onset differences have potential for expanding the scope of menstrual synchrony research, since many types of ongoing social groups of women can be tested, and the two onset dates can be collected in only a few weeks. The onsets have to cover the same time period for the all-possible-pairs test in order to constitute the random sample. Weller and Weller's expected frequencies test has no such restriction; however, the expected frequencies of onset differences should be calculated for the sample mean cycle length based on the one recorded cycle of the subjects and not for an assumed cycle length of 28 days. It would be interesting if Dr. Matteo tried both these tests with her data. Her results would be of considerable interest to those of us interested in menstrual synchrony research.

To return to Graham's comments on my review, she concludes 10 studies and two experiments have replicated McClintock's study and support her findings, while two studies failed to replicate McClintock's results. I will not quibble with her inclusions and exclusions in this list, except to defend citing Pfaff's (1980) Master's thesis that is a public document in the library of a state university. Scientific research is not a democratic endeavor; 12 for and 2 against does not prove the case for menstrual synchrony.
What counts is the quality of the research. After my two reviews above, I still conclude it is an open question whether or not menstrual synchrony occurs in humans. However, my reviews are not intended to demonstrate menstrual synchrony does not occur, a logically impossible task, but to contribute to improving the methods by which menstrual synchrony research is conducted.

Finally, I wish to correct a mistake in my review article; the name in footnote 5 (p. 581) should be Dr. Patsy Sampson.

H. Clyde Wilson
Department of Anthropology
University of Missouri
Columbia, Missouri
U.S.A.
REFERENCES

Bernard HR, Killworth P, Kronenfeld D, Sailer L (1984) The problem of informant accuracy: the validity of retrospective data. Annu Rev Anthropology 13:495-517.
Graham CA, McGrew WC (1980) Menstrual synchrony in female undergraduates living on a coeducational campus. Psychoneuroendocrinology 5:245-252.
Matteo S (1987) The effect of job stress and job interdependency on menstrual cycle length, regularity and synchrony. Psychoneuroendocrinology 12:467-476.
McClintock MK (1971) Menstrual synchrony and suppression. Nature 229:244-245.
Pfaff MJU (1980) Menstrual synchrony: Fact or fiction? MA Thesis, Univ of Idaho, Moscow, ID.
Thomas DH (1986) Refiguring Anthropology: First Principles of Probability & Statistics. Waveland Press, Prospect Heights, IL.
Weller A, Weller L (1992) Menstrual synchrony in female couples. Psychoneuroendocrinology 17:171-177.
Wilson HC (1992) A critical review of menstrual synchrony research. Psychoneuroendocrinology 17:565-591.