14 Japan
Kazuhiko Ueno Tokyo Gakugei University Tokyo, Japan
Ichiro Nakatani Japan Institute of Psychological Aptitude Nihon Bunka Kagakusha Tokyo, Japan
All the Wechsler intelligence tests have been standardized and used in Japan. The Japanese edition of the original Wechsler Intelligence Scale for Children (WISC) was first published in 1953, the WISC-R in 1978, and the WISC-III in 1998 (Wechsler, 1998). As for the other Wechsler intelligence tests, the Wechsler Preschool and Primary Scale of Intelligence (WPPSI) was published in 1969, the Wechsler Adult Intelligence Scale (WAIS) in 1958, and the WAIS-R in 1990; the WAIS-III is in the process of standardization. The WISC-R has been used extensively in Japan, mainly at schools and hospitals for screening children with mental disabilities. The Conference for the Learning Disabilities began in 1990 as an investigation within the Ministry of Education. As a result, demand for the WISC-III as a diagnostic test in the educational sector continues to grow. The Japan Institute of Psychological Aptitude, which has played a major role in the dissemination of the WISC-R, established a special team in 1991 to translate and adapt the WISC-III (Wechsler, 1991) for Japan. Our team completed the translation and standardization of the Japanese version in 1998 (Wechsler, 1998).
Culture and Children’s Intelligence: Cross-Cultural Analysis of the WISC-III
Copyright 2003, Elsevier Science (USA). All rights reserved.
ADAPTATION PROCESS
The procedures followed in translating and adapting the WISC-III in Japan drew on extensive experience from earlier adaptations and standardizations of the Wechsler tests. The items were translated into Japanese, back-translated into English, and checked for equivalence of meaning. Before the standardization, we conducted two preliminary studies, a pilot test and a tryout test, to select the items for each subtest. The pilot test involved n = 330 subjects, including those tested in groups, and the tryout test involved n = 160. When selecting items, we calculated the passing rate and the point-biserial correlation coefficient in each age group. Then, after eliminating items of the same level of difficulty, we selected items that were highly correlated with the score of each subtest. Consequently, 50% of the items in Information, 40% in Picture Completion, and 25% in Comprehension were replaced with new items.
What was always challenging in developing the Japanese versions of the WISC-R and WAIS-R was the great difference in language, culture, and history between the U.S. and Japan. A number of U.S. items were culturally biased, and Japanese items were substituted. In Picture Completion, for example, out of consideration for people with disabilities, items in which a part of a human or animal body was missing were replaced with items showing objects: “animal ear,” “animal foot,” and parts of the human body were replaced with “vehicle tire” and “stockings.” On the Information subtest, questions about historical persons, scientists, authors, and geography were replaced with questions on topics familiar to Japanese people. As a result, many of the items were replaced with new items that have no relation to the U.S. items. The Japanese Vocabulary items were selected anew and did not depend on translation of the existing U.S. items.
In developing the items for Vocabulary, we divided development into four phases: the infant phase (4–6 years old), the elementary school phase (7–12 years old), the junior high phase (13–15 years old), and the high school phase (16 years old and above). In selecting Vocabulary items for each phase, we used the results of previous studies. The references for the infant phase are Okubo and Kawamata (1982), Vocabulary for Under School Age, Report of the National Institute for Japanese Language 71, and the National Institute for Japanese Language (1980), Vocabulary of Infants, Tokyo-shoseki. The references for the elementary school phase are the Journal of Japan Educational Research Institute 35 (1985), Studies on the Basic Vocabulary in School Learning I, Kyoiku-shuppan, and Sakamoto (1959), Shin Kyoiku Kihon Goi (New Basic Vocabulary in Education), Gakugei-tosho. The reference for the junior high and high school phases is Hayashi (1971), Vocabulary Research and Basic Vocabulary, Report of the National Institute for Japanese Language 39. The number of vocabulary items selected for the pilot test was 6 for the infant phase, 20 for the elementary school phase, 12 for the junior high phase, and 6 for the high school phase, for a total of 44 potential items. For the data from the pilot and preliminary tests,
we calculated the degree of difficulty and the point-biserial correlation coefficient in each age group and finalized the item selection. As a result, only two U.S. items were retained.
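The item statistics described above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function names and the six-child response matrix are hypothetical, and the item is correlated with the uncorrected subtest total (the total includes the item itself), which is one common convention.

```python
from statistics import fmean, pstdev


def point_biserial(item, totals):
    """Point-biserial correlation between a 0/1 item and the subtest total."""
    p = fmean(item)                      # passing rate of the item
    sd = pstdev(totals)                  # population SD of the total scores
    if p in (0.0, 1.0) or sd == 0.0:
        return float("nan")              # undefined when there is no variance
    passed = [t for x, t in zip(item, totals) if x == 1]
    failed = [t for x, t in zip(item, totals) if x == 0]
    return (fmean(passed) - fmean(failed)) / sd * (p * (1.0 - p)) ** 0.5


def item_analysis(responses):
    """responses: one row per child, each row a list of 0/1 item scores.

    Returns a list of (passing_rate, point_biserial) pairs, one per item.
    """
    totals = [sum(row) for row in responses]
    results = []
    for j in range(len(responses[0])):
        item = [row[j] for row in responses]
        results.append((fmean(item), point_biserial(item, totals)))
    return results


# Hypothetical responses of six children to three items (not real data).
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
]
for j, (rate, r_pb) in enumerate(item_analysis(responses), start=1):
    print(f"item {j}: passing rate = {rate:.2f}, point-biserial = {r_pb:.2f}")
```

In a selection step of the kind described, items with similar passing rates would be compared and the one with the higher point-biserial value kept. Any cut-off used for that decision is not stated in the text.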
SAMPLE CHARACTERISTICS
The Japanese WISC-III norms are based on a standardization sample representative of the Japanese population of children and adolescents aged 5 to 16. An analysis of data gathered in 1995 by the Basic Survey of Schools of the Ministry of Education provided the basis for stratification on the following variables: age, gender, and geographic region. The standardization sample was divided into 12 age groups of 12 months each, from 5 years 0 months 0 days to 16 years 11 months 30 days. We set a target of 1200 children in total, 100 for each age group. For gender, the target was 50 boys and 50 girls in each age group so that both genders were sampled evenly. The final standardization sample comprised n = 1125 children, which was 94% of the target and sufficient for statistical analysis.
It is generally acknowledged that there are no cultural differences among the regions of Japan. Nevertheless, we were careful to avoid sampling bias with respect to geographic location. We first divided Japan into eight geographic areas and allocated a target number of children to each area. We then divided Japan into two major areas, east and west, and collected data in accordance with the ratios in the Census of Japan. The census indicates that the population ratio between east and west Japan is 3 to 2 (Statistics Bureau of the Management and Coordination Agency, 1992). As Table 14.1 shows, the proportions of children sampled in these geographic areas were similar to those in the population according to census statistics. Testers who had completed training at the WISC-R Skill Seminar held by the Japan Institute of Psychological Aptitude were asked to administer the WISC-III for the standardization phase. In selecting subjects, we picked children randomly from classrooms to avoid selection bias.
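The sample figures quoted above can be checked against the per-age values reported in Table 14.1. The sketch below is only an arithmetic check under one assumption: that the overall East Japan percentage is the n-weighted mean of the per-age percentages. The variable names are illustrative.

```python
# Per-age sample sizes and East Japan percentages as reported in Table 14.1
# (ages 5 through 16, in order).
n_per_age = [96, 93, 94, 88, 102, 82, 114, 87, 82, 81, 106, 100]
east_pct = [58.3, 61.3, 69.1, 59.1, 67.6, 67.1, 62.3, 59.8, 63.4, 58.0, 69.8, 53.0]

TARGET_TOTAL = 1200          # 12 age groups x 100 children
CENSUS_EAST_PCT = 60.0       # a 3-to-2 east/west ratio corresponds to 60% east

total_n = sum(n_per_age)
completion_pct = 100 * total_n / TARGET_TOTAL

# Weight each age group's percentage by its sample size.
east_overall = sum(n * p for n, p in zip(n_per_age, east_pct)) / total_n

print(f"total n = {total_n}")
print(f"share of target reached = {completion_pct:.1f}%")
print(f"East Japan share of sample = {east_overall:.1f}%")
print(f"difference from 3:2 census ratio = {east_overall - CENSUS_EAST_PCT:+.1f} points")
```

The weighted East Japan share agrees with the total row of Table 14.1 to one decimal place, which suggests the per-age rows and the total row are mutually consistent.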
RELIABILITY
Table 14.2 presents the split-half reliability coefficients and standard errors of measurement for the subtests, IQs, and index scores by age and overall. As shown in the table, the average reliability for each of the three IQ scores was 0.90 or above. The average reliabilities for the index scores ranged from r = 0.81 for the Processing Speed Index (PSI) to r = 0.89 for the Perceptual Organization Index (POI). The average subtest reliabilities ranged from r = 0.64 for Object Assembly to r = 0.87 for Digit Span. The standard errors of measurement (SEms) were inversely related to the reliabilities: the subtests with the highest reliabilities had the smallest standard errors.
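The inverse relation between reliability and SEm follows from the standard formula SEm = SD × sqrt(1 − r). The sketch below assumes the usual Wechsler metrics (SD = 3 for subtest scaled scores, SD = 15 for IQ and index scores); the chapter does not restate these values, so treat them as assumptions.

```python
import math

SCALED_SCORE_SD = 3.0   # assumed SD of subtest scaled scores
IQ_SD = 15.0            # assumed SD of IQ and index scores


def standard_error_of_measurement(reliability, sd):
    """SEm = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


# Information at age 5 is reported with r = 0.56 in Table 14.2.
print(f"{standard_error_of_measurement(0.56, SCALED_SCORE_SD):.2f}")

# Higher reliability gives a smaller SEm on the same metric.
for r in (0.80, 0.90, 0.95):
    print(r, f"{standard_error_of_measurement(r, IQ_SD):.2f}")
```

For Information at age 5 the formula gives 1.99, the value printed in Table 14.2. Other cells may differ in the second decimal because the published reliabilities are rounded.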
TABLE 14.1 Demographic Characteristics of the Standardization Sample: Percentages by Age
                                 Geographic region
Age                      n     East Japan   West Japan
5                       96        58.3         41.7
6                       93        61.3         38.7
7                       94        69.1         30.9
8                       88        59.1         40.9
9                      102        67.6         32.4
10                      82        67.1         32.9
11                     114        62.3         37.7
12                      87        59.8         40.2
13                      82        63.4         36.6
14                      81        58.0         42.0
15                     106        69.8         30.2
16                     100        53.0         47.0
Total                 1125        62.5         37.5
Japan population                  60.9         39.1
Table 14.3 shows the average corrected retest correlations for the subtests, IQs, and index scores. The retest intervals ranged from 2 weeks to 6 months (average 76.2 days), and the children's ages ranged from 6 years 2 months to 15 years 10 months (average 10 years 7 months). These data were based on 84 children. The retest correlations for the IQ scores ranged from r = 0.83 for Performance IQ (PIQ) to r = 0.93 for Verbal IQ (VIQ) and Full Scale IQ (FSIQ). The retest correlations for the index scores ranged from r = 0.78 for PSI to r = 0.91 for the Verbal Comprehension Index (VCI). The retest correlations for the subtests ranged from r = 0.54 for Picture Arrangement to r = 0.89 for Information.
EVIDENCE OF VALIDITY
The WISC-III was compared with other major intelligence tests, including the Tanaka-Binet Intelligence Scale, the Kaufman Assessment Battery for Children (K-ABC), and the Illinois Test of Psycholinguistic Abilities (ITPA). Because of space limitations, we describe only the correlation between the WISC-III FSIQ score and the Tanaka-Binet IQ score. The WISC-III and the Tanaka-Binet were administered to a sample of 38 children. The correlation of FSIQ with Tanaka-Binet IQ was r = 0.74, which suggests that the WISC-III has a strong relationship with the Tanaka-Binet.
The WISC-III intercorrelations averaged across all age groups are presented in Table 14.4. The Verbal subtests had higher intercorrelations with the other Verbal subtests than with the Performance subtests. The Performance subtests had higher intercorrelations with the other Performance subtests than with the Verbal subtests.
TABLE 14.2 Reliability Coefficients and Standard Errors of Measurement of the Subtest Scaled Scores, IQ Scores, and Index Scores by Age (a)
                                                            Age in years
Subtest/scale                   5     6     7     8     9    10    11    12    13    14    15    16  Avg(b)
Information             r    0.56  0.67  0.78  0.80  0.76  0.81  0.85  0.78  0.88  0.85  0.90  0.90    0.81
                        SEm  1.99  1.73  1.39  1.35  1.48  1.30  1.16  1.40  1.03  1.15  0.95  0.97    1.30
Similarities            r    0.76  0.79  0.55  0.70  0.70  0.68  0.68  0.64  0.61  0.72  0.79  0.57    0.69
                        SEm  1.46  1.36  2.01  1.63  1.63  1.69  1.71  1.81  1.88  1.59  1.36  1.96    1.67
Arithmetic              r    0.87  0.73  0.81  0.80  0.85  0.77  0.77  0.62  0.81  0.84  0.85  0.82    0.80
                        SEm  1.08  1.55  1.32  1.34  1.17  1.45  1.43  1.85  1.29  1.21  1.14  1.26    1.33
Vocabulary              r    0.84  0.72  0.69  0.77  0.74  0.80  0.87  0.73  0.88  0.92  0.91  0.90    0.83
                        SEm  1.21  1.60  1.68  1.45  1.52  1.36  1.07  1.55  1.05  0.87  0.92  0.96    1.25
Comprehension           r    0.73  0.65  0.62  0.67  0.77  0.86  0.84  0.82  0.82  0.74  0.88  0.80    0.78
                        SEm  1.55  1.77  1.85  1.73  1.43  1.10  1.22  1.27  1.26  1.53  1.04  1.34    1.41
Digit span              r    0.82  0.86  0.82  0.81  0.89  0.88  0.85  0.90  0.88  0.88  0.90  0.90    0.87
                        SEm  1.28  1.13  1.27  1.31  0.99  1.06  1.16  0.97  1.04  1.03  0.95  0.93    1.09
Picture completion      r    0.87  0.73  0.84  0.89  0.80  0.80  0.77  0.72  0.69  0.83  0.74  0.77    0.80
                        SEm  1.07  1.55  1.20  1.01  1.36  1.34  1.43  1.57  1.66  1.25  1.52  1.44    1.36
Coding                  r    0.75  0.75  0.75  0.88  0.88  0.88  0.88  0.88  0.67  0.67  0.67  0.67    0.78
                        SEm  1.51  1.51  1.51  1.03  1.03  1.03  1.03  1.03  1.73  1.73  1.73  1.73    1.40
Picture arrangement     r    0.68  0.63  0.85  0.74  0.72  0.76  0.72  0.58  0.54  0.79  0.73  0.68    0.71
                        SEm  1.70  1.84  1.14  1.53  1.58  1.45  1.58  1.94  2.03  1.38  1.56  1.69    1.61
Block design            r    0.76  0.84  0.81  0.79  0.89  0.83  0.88  0.86  0.86  0.88  0.89  0.90    0.85
                        SEm  1.46  1.21  1.32  1.38  1.01  1.24  1.05  1.10  1.11  1.04  1.02  0.95    1.15
Object assembly         r    0.58  0.58  0.58  0.67  0.54  0.50  0.49  0.43  0.70  0.75  0.84  0.78    0.64
                        SEm  1.94  1.94  1.94  1.71  2.04  2.13  2.15  2.27  1.64  1.49  1.19  1.39    1.80
Symbol search           r    0.73  0.73  0.73  0.57  0.57  0.57  0.57  0.57  0.59  0.59  0.59  0.59    0.64
                        SEm  1.56  1.56  1.56  1.96  1.96  1.96  1.96  1.96  1.92  1.92  1.92  1.92    1.81
Mazes                   r    0.73  0.74  0.81  0.76  0.74  0.55  0.71  0.65  0.80  0.73  0.61  0.65    0.71
                        SEm  1.55  1.53  1.31  1.48  1.53  2.00  1.61  1.79  1.35  1.55  1.86  1.77    1.60
Verbal IQ               r    0.91  0.89  0.90  0.92  0.92  0.94  0.94  0.90  0.94  0.94  0.96  0.94    0.93
                        SEm  4.40  4.94  4.85  4.37  4.26  3.78  3.68  4.79  3.77  3.57  2.81  3.54    4.02
Performance IQ          r    0.89  0.87  0.91  0.91  0.91  0.90  0.88  0.87  0.88  0.92  0.91  0.90    0.90
                        SEm  4.99  5.34  4.61  4.43  4.46  4.63  5.09  5.38  5.11  4.20  4.42  4.68    4.77
Full scale IQ           r    0.94  0.93  0.94  0.95  0.95  0.95  0.95  0.92  0.95  0.96  0.97  0.96    0.95
                        SEm  3.67  4.11  3.71  3.43  3.50  3.25  3.44  4.18  3.45  3.03  2.69  3.13    3.44
Verbal comp. index      r    0.89  0.87  0.87  0.90  0.90  0.93  0.93  0.90  0.93  0.93  0.96  0.93    0.84
                        SEm  5.07  5.47  5.46  4.80  4.72  4.01  3.85  4.85  4.01  3.95  3.09  3.96    6.09
Percept. organ. index   r    0.87  0.85  0.90  0.90  0.89  0.88  0.86  0.83  0.88  0.93  0.92  0.90    0.89
                        SEm  5.36  5.87  4.73  4.74  4.99  5.27  5.70  6.09  5.30  3.99  4.27  4.78    4.99
Freed. distract. index  r    0.89  0.86  0.88  0.86  0.91  0.86  0.86  0.81  0.90  0.91  0.92  0.91    0.88
                        SEm  4.93  5.53  5.27  5.56  4.60  5.61  5.56  6.52  4.68  4.45  4.21  4.56    5.09
Process. speed index    r    0.83  0.83  0.84  0.83  0.83  0.83  0.82  0.82  0.77  0.75  0.77  0.76    0.81
                        SEm  6.24  6.27  6.05  6.19  6.18  6.23  6.34  6.36  7.20  7.44  7.27  7.37    6.58
(a) N = 81–114 per age group. The reliability coefficients for all subtests except Coding and Symbol Search are split-half correlations corrected by the Spearman-Brown formula. For Coding and Symbol Search, raw-score test-retest correlations are presented for three age bands. The coefficients for the IQ and factor-based scales were calculated with the formula for the reliability of a composite (Guilford, 1954); the values for the supplementary subtests (Digit Span, Mazes, and Symbol Search) were not included in these computations. The standard errors of measurement are reported in scaled-score units for the subtests, in IQ units for the Verbal, Performance, and Full Scale scores, and in index units for the Verbal Comprehension, Perceptual Organization, Freedom from Distractibility, and Processing Speed scores.
(b) The average r was computed with Fisher’s z transformation. The average SEms were calculated by averaging the sum of the squared SEms for each age group and obtaining the square root of the result.
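The computations named in the notes to Table 14.2 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the Fisher z average is taken as an unweighted mean across age groups (the notes do not say whether it was weighted by n), and the half-test correlation and the SEm pair in the usage lines are invented values.

```python
import math


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2.0 * r_half / (1.0 + r_half)


def fisher_z_average(correlations):
    """Average correlations via Fisher's z (atanh), then transform back."""
    mean_z = sum(math.atanh(r) for r in correlations) / len(correlations)
    return math.tanh(mean_z)


def root_mean_square(values):
    """Square root of the mean of the squared values (used for average SEm)."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Hypothetical half-test correlation of 0.50.
print(f"{spearman_brown(0.50):.3f}")

# Information reliabilities for ages 5 to 16 as reported in Table 14.2.
information_r = [0.56, 0.67, 0.78, 0.80, 0.76, 0.81,
                 0.85, 0.78, 0.88, 0.85, 0.90, 0.90]
print(f"{fisher_z_average(information_r):.2f}")

# Hypothetical pair of SEms, to show the averaging rule only.
print(f"{root_mean_square([3.0, 4.0]):.3f}")
```

Applied to the Information row, the unweighted Fisher z average rounds to the 0.81 printed in the Average column.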
TABLE 14.3 Stability Coefficients of the Subtests, IQ Scales, and Factor-Based Scales (N = 84)
Subtest/scale                          r12    Averages(a)
Information                            0.87      0.89
Similarities                           0.65      0.71
Arithmetic                             0.72      0.77
Vocabulary                             0.84      0.82
Comprehension                          0.65      0.78
Digit span                             0.85      0.82
Picture completion                     0.67      0.74
Coding                                 0.76      0.83
Picture arrangement                    0.56      0.54
Block design                           0.71      0.84
Object assembly                        0.55      0.69
Symbol search                          0.65      0.75
Mazes                                  0.56      0.58
Verbal IQ                              0.89      0.93
Performance IQ                         0.76      0.83
Full scale IQ                          0.88      0.93
Verbal comprehension index             0.86      0.91
Perceptual organization index          0.77      0.84
Freedom from distractibility index     0.84      0.81
Processing speed index                 0.68      0.78
(a) Correlations were corrected for the variability of WISC-III scores on the first testing (Guilford & Fruchter, 1978).
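The note to Table 14.3 cites a correction for the variability of first-testing scores but does not print the formula. A commonly used correction of this kind is the standard range-restriction formula sketched below. Whether this exact formula was applied to the Japanese data is an assumption, and the standard deviations in the example are hypothetical.

```python
import math


def correct_for_variability(r, sd_sample, sd_reference):
    """Adjust a correlation for the difference between the sample SD and a
    reference SD (standard correction for restriction of range).

    k > 1 (sample less variable than the reference) raises the estimate;
    k < 1 (sample more variable than the reference) lowers it.
    """
    k = sd_reference / sd_sample
    return r * k / math.sqrt(1.0 - r * r + r * r * k * k)


# Hypothetical retest sample SDs against an assumed scaled-score SD of 3.
print(f"{correct_for_variability(0.87, 2.8, 3.0):.2f}")   # sample less variable
print(f"{correct_for_variability(0.87, 3.0, 3.0):.2f}")   # equal variability
print(f"{correct_for_variability(0.87, 3.2, 3.0):.2f}")   # sample more variable
```

Under this formula a corrected value can fall below the uncorrected r12 when the retest sample was more variable than the reference group, which would be consistent with rows in Table 14.3 where the second column is the smaller of the two.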
TABLE 14.4 Intercorrelation of Subtest Scaled Scores Subtest/scale Information Similarities Arithmetic Vocabulary Comprehension Digit span Picture completion Coding Picture arrangement Block design Object assembly Symbol search Mazes
Inf
Sim
Ari
Voc Com
0.54 0.54 0.62 0.51 0.39
0.46 0.60 0.49 0.52 0.44 0.62 0.33 0.45 0.36 0.29
DS
PCom
0.39 0.40 0.33 0.35 0.38 0.23 0.30 0.31 0.34 0.29 0.29 0.27
0.22
0.38 0.40 0.34 0.34 0.20
0.39 0.43 0.43 0.31 0.21
0.32 0.39 0.29 0.31 0.19
0.36 0.49 0.34 0.39 0.25
0.33 0.36 0.32 0.31 0.17
0.33 0.35 0.31 0.31 0.17
0.20 0.37 0.24 0.32 0.18
Cd
0.31 0.38 0.27 0.55 0.22
PA
BD
OA
SS
0.40 0.34 0.52 0.36 0.44 0.30 0.23 0.37 0.22 0.22
Mz
TABLE 14.5 Maximum-Likelihood Factor Loadings (Varimax Rotation) for Four Factors
                       Factor 1:        Factor 2:        Factor 3:      Factor 4:
                       Verbal           Perceptual       Processing     Freedom from
Subtest                comprehension    organization     speed          distractibility
Information                0.64             0.25            0.15            0.31
Similarities               0.63             0.25            0.17            0.18
Vocabulary                 0.79             0.16            0.13            0.20
Comprehension              0.66             0.24            0.17            0.10
Picture completion         0.35             0.57            0.11            0.02
Picture arrangement        0.29             0.43            0.26            0.08
Block design               0.17             0.64            0.26            0.38
Object assembly            0.21             0.58            0.13            0.14
Arithmetic                 0.42             0.28            0.21            0.52
Digit span                 0.27             0.16            0.19            0.48
Coding                     0.18             0.18            0.70            0.14
Symbol search              0.18             0.28            0.63            0.20
Mazes                      0.06             0.32            0.16            0.21
This pattern of correlations provided evidence of convergent validity. The lower correlations between Verbal subtests and Performance subtests are evidence of discriminant validity.
For the factor analysis, the principal factor method was employed using data from all 1125 subjects of the standardization sample. In the one-factor solution, all subtests except Mazes had loadings of at least 0.51 on the single factor. A second factor analysis was conducted with four factors and a Varimax rotation. Table 14.5 presents the maximum-likelihood factor loadings with Varimax rotation for the four factors. Analyses for the total sample (N = 1125) justify a four-factor structure. Factor 1, composed of the Information, Similarities, Vocabulary, and Comprehension subtests, is the Verbal Comprehension factor. Factor 2, composed of the Picture Completion, Picture Arrangement, Block Design, and Object Assembly subtests, is the Perceptual Organization factor. Factor 3, composed of Coding and Symbol Search, is the Processing Speed factor. Factor 4, composed of Arithmetic and Digit Span, is the Freedom from Distractibility factor. Thus, the Japanese version yields the same four-factor structure as the U.S. test.
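The factor composition described above can be read off Table 14.5 by assigning each subtest to the factor on which it loads most highly. The sketch below does only that; it is not a factor analysis. The loadings are copied from Table 14.5, the factor names follow the description in the text (Factor 3 as Processing Speed, Factor 4 as Freedom from Distractibility), and the highest-loading rule is an illustrative simplification.

```python
FACTOR_NAMES = [
    "Verbal Comprehension",           # Factor 1
    "Perceptual Organization",        # Factor 2
    "Processing Speed",               # Factor 3
    "Freedom from Distractibility",   # Factor 4
]

# Loadings on Factors 1 to 4, as reported in Table 14.5.
LOADINGS = {
    "Information":         [0.64, 0.25, 0.15, 0.31],
    "Similarities":        [0.63, 0.25, 0.17, 0.18],
    "Vocabulary":          [0.79, 0.16, 0.13, 0.20],
    "Comprehension":       [0.66, 0.24, 0.17, 0.10],
    "Picture completion":  [0.35, 0.57, 0.11, 0.02],
    "Picture arrangement": [0.29, 0.43, 0.26, 0.08],
    "Block design":        [0.17, 0.64, 0.26, 0.38],
    "Object assembly":     [0.21, 0.58, 0.13, 0.14],
    "Arithmetic":          [0.42, 0.28, 0.21, 0.52],
    "Digit span":          [0.27, 0.16, 0.19, 0.48],
    "Coding":              [0.18, 0.18, 0.70, 0.14],
    "Symbol search":       [0.18, 0.28, 0.63, 0.20],
    "Mazes":               [0.06, 0.32, 0.16, 0.21],
}


def primary_factor(loadings):
    """Return the name of the factor with the largest loading."""
    best_index = max(range(len(loadings)), key=lambda i: loadings[i])
    return FACTOR_NAMES[best_index]


assignment = {subtest: primary_factor(values) for subtest, values in LOADINGS.items()}
for subtest, factor in assignment.items():
    print(f"{subtest:<20} {factor}")
```

Under this rule the twelve subtests named in the text fall on the four factors as described, and Mazes, which the text does not assign, loads most highly on Perceptual Organization, although that loading is modest.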
CULTURAL ISSUES INFLUENCING THE WISC-III IN JAPAN
Japan is a highly homogeneous culture. A standardized Japanese language, the spread of mass media, and a unified educational system (including the national curriculum, commonly used textbooks, and nationwide standardization of teacher training) have produced nationwide cultural homogeneity, and there are few ethnic groups within the country. Items in Vocabulary, Information, Comprehension, and Picture Completion had to be substituted or added so that their content would be valid for Japanese education, history, and culture. Therefore, they are not necessarily the same items as those used in the U.S. test. As the factor analysis shows, however, the index scores of the Japanese WISC-III have the same structure as those of the U.S. test, which supports their construct validity.
Although this chapter is an important place in which to examine cultural differences on the WISC-III, unfortunately few cross-cultural studies of these differences exist at present. It is known, however, that Japanese children perform well on the speeded tests, including Coding, Symbol Search, Picture Arrangement, Block Design, and Mazes. As a result, there appeared to be a ceiling effect in item responses, with too few items of high difficulty. The time limits for some subtests and items were therefore shortened (e.g., Coding and Symbol Search, originally 120 seconds, were shortened to 90 seconds), and more difficult items were added to some subtests (e.g., Block Design).
In developing the Japanese version of the WISC-III, we found a very interesting cultural difference. Japanese children score much higher on Coding, a performance subtest that measures the amount of work completed within a fixed period of time. To avoid a ceiling effect, the time limit had already been shortened from 120 seconds to 90 seconds when the WISC-R was standardized. In the standardization of the WISC-III, the same method was adopted for Coding and Symbol Search. Therefore, results on these subtests should not be compared directly across countries. Because Japanese children are characterized by better performance on these speeded performance subtests, any comparative analysis should take into account that the time limits were revised and shortened.
PROFESSIONAL ISSUES IN THE USE OF INTELLIGENCE TESTS IN JAPAN
The educational use of psychological tests in Japan, especially intelligence tests, has met with some difficulty. In 1979, the Education for All Handicapped Children Act came into effect. In the U.S., educational measures for children with first-degree disabilities, especially learning disabilities, were rapidly provided and disseminated, whereas in Japan attention was focused mostly on support for the severely disabled. Special education in the required curriculum was limited to children with second-degree and heavier disabilities, which limited the rate of school attendance of these children to 1.3% of the total number of students. Under such conditions, psychological tests were used mainly in medical clinics, and schools took a negative attitude toward the tests.
However, the decision to take measures regarding learning and other disabilities in the public educational system was made in 1990. In addition, the visiting consultation project in 1995 and the learning disabilities educational measures project in 2000 were initiated as nationwide model projects. These projects are still at a trial stage in each prefecture, and their eventual adoption is still several years away. For these model projects, each prefecture’s educational committee has a team of experts that includes not only members from the education sector but also psychologists and physicians. The nationwide application of the projects has demonstrated the need for intelligence tests in making decisions about educational measures for children with first-degree intellectual disabilities and learning disabilities. Currently, school psychologists have no official or legally recognized status or position; only a “collaboratively authorized qualification” certified by academic societies exists. The qualification for school psychologists was established in 1997 in collaboration with five academic societies: the Japanese Association of Educational Psychology, the Japanese Association of Special Education, the Japanese Academy of Learning Disabilities, the Japan Society of Developmental Psychology, and the Japanese Association for the Study of Developmental Disabilities. As discussed above, with the rapid increase of concern for the training of children with first-degree developmental disabilities, including learning disabilities, the conditions for full-scale use of psychological tests in the educational field have improved.
This trend can be seen as a resurgence of the value of intelligence tests, and it indicates the progressive use of psychological tests “from the clinic to the classroom.” We predict that a totally new era is beginning in the use of the WISC-III and related tests in education in Japan. In the future, as educational measures for children with mild disabilities expand, we can expect the legal provision for and maintenance of those qualifications to improve.
REFERENCES
Guilford, J. P. (1954). Psychometric methods (2nd ed.). New York: McGraw-Hill.
Guilford, J. P., & Fruchter, B. (1978). Fundamental statistics in psychology and education (6th ed.). New York: McGraw-Hill.
Hayashi, S. (1971). Vocabulary research and basic vocabulary. Report of the National Institute for Japanese Language 39.
Japan Educational Research Institute (1985). Studies on the basic vocabulary in school learning I. Tokyo, Japan: Kyoiku-shuppan.
Ministry of Education (1995). School basic survey. Tokyo, Japan: Printing Bureau, Ministry of Finance.
National Institute for Japanese Language (1980). Vocabulary of infants. Tokyo, Japan: Tokyo-shoseki.
Okubo, A., & Kawamata, R. (1982). Vocabulary for under school age. Report of the National Institute for Japanese Language 71.
Sakamoto, I. (1959). Shin Kyoiku Kihon Goi (New basic vocabulary in education). Tokyo, Japan: Gakugei-tosho.
Statistics Bureau of the Management and Coordination Agency (1992). Population census of Japan. Tokyo, Japan: Japan Statistical Association.
Wechsler, D. (1991). Wechsler Intelligence Scale for Children-Third Edition. San Antonio, TX: The Psychological Corporation.
Wechsler, D. (1998). Wechsler Intelligence Scale for Children-Third Edition; Japanese Edition (H. Azuma, K. Ueno, K. Fujita et al., Trans.). Tokyo, Japan: Nihon Bunka Kagakusha. (Original work published in 1991.)