INDUSTRIAL VISION TECHNIQUES*
HENRY A. IMUS, LT. COMDR. H(S), USNR (INACTIVE)
Alexandria, Virginia

* From the Research Division, Bureau of Medicine and Surgery, Navy Department. Opinions or conclusions contained in this report are those of the author. They are not to be construed as necessarily reflecting the views or the endorsement of the Navy Department or the Naval Service at large. Reference may be made to this report in the same way as to published articles, noting author, title, source, date, project number, and report number.

INTRODUCTION
The emphasis upon vision testing in industry has shifted from primary concern for first aid following accidents to selection and classification of personnel for job placement. The value of this new approach has been demonstrated to labor and management. Better placement in industrial jobs has resulted in greater job satisfaction, increased earning power, as well as promotion and advancement in his craft for the individual worker. These results have been reflected in the reduction of overhead, wastage, and inefficient operation, which are of vital importance to management. The greater productivity of the worker, also, tends toward greater profits for industry. Finally, the detection of abnormal ocular or visual conditions is important for the health and safety of the worker.

Vision testing, which is aimed primarily at the selection and classification of personnel for special visual tasks in industry, must be administered quickly by trained technicians rather than by ophthalmologists. The development of machine vision-testing equipment has made possible the routine administration of visual tests by technicians, who do not need to know anything about the usual clinical tests of vision, and are not required to make any interpretation of the results. The machine tests are contained in a single instrument, are administered according to a standardized procedure, and are scored according to arbitrary scales.
For classification and assignment, the comparison of scores on the visual tests with criteria of performance on the job, whether these be foreman's ratings, units produced, breakage, or earnings, determines the profile of "cut-off" scores for a particular job. Usually, the personnel department is responsible for the testing of the applicants, the validation against criteria of performance, and the establishment of the profile scores. The industrial medical department, by reference to a table of clinical equivalents, may establish certain minimum scores for referral of patients to the clinic or to outside practitioners for evaluation and treatment of the ocular condition suggested by the tests.

For job placement, it is not necessary that visual screening tests predict the results of the corresponding clinical tests. If the test predicts performance on the job, that is sufficient. However, from the clinical and safety points of view, as well as for selecting procedures for rapid military mobilization, it is desirable that the machine visual tests predict with fair accuracy the clinical measures of vision.

In order to determine the reliability and validity of the current industrial vision screening devices, a number of research projects were established in the military services.[14] The first test of the Ortho-Rater was conducted at Fort Eustis, Virginia, an antiaircraft artillery replacement training center, in connection with the selection of stereoscopic rangefinder operators, during the spring of 1943. Although the results of this study have not been published, the prediction of visual acuity at the level of 20/20 vision was found to be very good.

The next study was conducted at the U. S. Naval Training Center, Sampson, New York, in connection with the selection of stereoscopic rangefinder operators for the Radar and Rangefinder School, Fort Lauderdale, Florida. A test-retest analysis[15] was made of the results of 234 administrations of the Ortho-Rater by 26 newly trained examiners. These examiners were optometrists enlisted in the Hospital Corps of the U. S. Navy. The results of this study, which was concerned only with reliability (the coefficients of reliability varied from 0.49 to 0.83), emphasized the importance of careful training of the technicians who are to operate the machine tests. The standard instructions[16] to the examinee must be memorized perfectly, and each examinee must be encouraged to do his best after the examiner is convinced that he understands the directions given on each subtest. It is important that each examinee be treated in the same way. These two preliminary tests showed that the Ortho-Rater could be used satisfactorily for mass testing in the military situation. The validation of the Ortho-Rater test of depth perception for the selection of stereoscopic rangefinder operators has been demonstrated adequately.[17]

At the U. S. Naval School of Aviation Medicine and Research, Pensacola, Florida, a test-retest study[12,13,18] of the Bausch & Lomb Ortho-Rater and of the corresponding clinical tests was made in 1945. This study shows, in general, that the visual tests incorporated in the Ortho-Rater are as consistent in the measures obtained as the clinical tests. The measures of far visual acuity agree satisfactorily with those obtained clinically. Measures of heterophoria do not agree closely, but, in view of the variability of heterophoria itself, the relationship may be as close as can be expected for any two measures of this anomaly.

The Medical Field Research Laboratory, Camp Lejeune, North Carolina, also reported on a comparative study[5,6] of visual screening devices. The most extensive studies[7-11,19,20] of the three available commercial screening devices were conducted at the U. S. Naval Medical Research Laboratory, Submarine Base, New London, Connecticut. Some of the statistical analyses of these data were undertaken by The Adjutant General's Office in connection with their intensive investigation of tests of visual acuity at Fort Dix, New Jersey, in cooperation with the Army-Navy-NRC Vision Committee.[1,2] These statistical analyses are incorporated in a report[3] issued by The Adjutant General in 1947.

DISCUSSION OF RESULTS
The results of the evaluation of the machine vision-testing devices discussed in this paper have been taken from the aforementioned reports; the sources are acknowledged in the tables and in the list of references.

The test-retest reliability of the various instruments is indicated by the coefficients of correlation presented in Table 1. In general, the coefficients of this table show that the Ortho-Rater provides the most consistent measures and the Telebinocular the least consistent. For far and near lateral phoria, the coefficients range from 0.75 to 0.92, and the instruments are more nearly equivalent. It should be noted that the reliabilities of the clinical measures are of the same order of magnitude. For monocular far acuity, the Ortho-Rater and Sight-Screener are practically equal in consistency, the coefficients ranging from 0.81 to 0.90. The Telebinocular is only slightly less consistent in this respect, with coefficients ranging from 0.78 to 0.86. Only in the Pensacola study did the clinical measure of far monocular acuity appear to be more reliable than a machine test. For the near tests of acuity, both binocular and monocular, the consistency of the Ortho-Rater is noticeably greater than that of the other two instruments. The reliabilities of measures of far vertical phoria by the Sight-Screener, Telebinocular, and clinical tests are equivalent, 0.61 to 0.64, while that of the Ortho-Rater is much higher, 0.79. For near vertical phoria, the reliabilities of the Ortho-Rater and clinical measures are equivalent, 0.73 and 0.74, while that for the Sight-Screener is much lower, 0.55. In the measures of depth perception, the Ortho-Rater and Telebinocular are nearly equivalent as to reliability, 0.83 and 0.79, whereas the Sight-Screener and Howard-Dolman test are somewhat less consistent, 0.57 and 0.62 to 0.72.

TABLE 1 COEFFICIENTS OF RELIABILITY OF TESTS*

Test                   Ortho-Rater   Sight-Screener   Telebinocular   Clinical
Far vertical phoria    0.79          0.61             0.63            0.64
Far lateral phoria     0.87          0.80             0.75            0.81
Binocular far          0.88-0.93     0.70             --              0.81-0.97
Monocular far          0.81-0.90     0.84             0.78-0.86       0.80-0.97
Depth                  0.83          0.57             0.79            0.62-0.72†
Binocular near         0.84-0.87     0.70             0.72            0.67
Monocular near         0.80-0.90     0.77             0.71            0.75-0.78
Near vertical phoria   0.73          0.55             --              0.74
Near lateral phoria    0.81-0.92     0.83             0.85            0.90

* The data for this Table were selected from the Pensacola (N = 100), New London (N = 128) and the Adjutant General's Office Report No. PRS No. 742. See references 3, 7-13.
† The Howard-Dolman test.

The results of all tests on the instruments were compared with the clinical ophthalmic tests. The coefficients of correlation are presented in Table 2. For far and near monocular acuity, the
coefficients for the Ortho-Rater and Sight-Screener are approximately 0.75, while those for the Telebinocular are approximately 0.55. The Ortho-Rater appears to provide a much better measure of far lateral phoria, r = 0.57 to 0.70, as compared with the other two instruments, r = 0.37 for each. For near lateral phoria, however, the three instruments are practically equivalent: Ortho-Rater, 0.67 to 0.77; Sight-Screener, 0.54; and Telebinocular, 0.68. For all instruments the coefficients of correlation for vertical phoria are relatively low, ranging from 0.29 to 0.50. The only comparison in measures of depth perception is between the Ortho-Rater and the Howard-Dolman test. These coefficients (0.59 to 0.62) are no greater than those
TABLE 2 RELATIONSHIPS* BETWEEN INSTRUMENTS AND CLINICAL OPHTHALMIC EXAMINATION

Test                   Ortho-Rater   Sight-Screener   Telebinocular
Far vertical phoria    0.29-0.49     0.28             0.43
Far lateral phoria     0.57-0.70     0.37             0.37
Binocular far          0.79-0.90     0.71             --
Monocular far          0.72-0.84     0.74             0.55
Depth                  0.59-0.62†    --               --
Binocular near         0.70          0.64             0.55
Monocular near         0.75          0.71             0.58
Near vertical phoria   0.34-0.50     0.34             --
Near lateral phoria    0.67-0.77     0.54             0.68

* The data for this Table were selected from the Pensacola report (N = 100), New London report (N = 128) and The Adjutant General's Office Report No. PRS No. 742. See references 3, 7-13.
† Howard-Dolman test.
TABLE 3 FACTOR LOADINGS OF FAR VISUAL ACUITY TESTS*

Instrument       Resolution     Accommodation    Form           Interference
Snellen Chart    0.70 to 0.86   0.00 to 0.12     0.24 to 0.50   0.01 to 0.11
Ortho-Rater      0.77 to 0.86   -0.07 to 0.07    0.00 to 0.11   0.23 to 0.52
Sight-Screener   0.75 to 0.90   0.06 to 0.20     0.02 to 0.26   0.06 to 0.18
Telebinocular    0.62 to 0.72   0.15 to 0.25     0.02 to 0.07   0.01 to 0.23

* The Adjutant General's Office PRS Report No. 742, August, 1947, p. 54.
generally found between any two tests of depth perception.*

* Unpublished report, "Comparison Between Seven Tests of Depth Perception," Fort Eustis, Virginia.

A statistical study of the New London data[8] on the comparison of three commercial visual screening devices was conducted by The Adjutant General's Office, using the method of factor analysis. The data of Tables 3 to 7 have been extracted from this report.[3] For the purpose of this paper a factor loading can be described as the coefficient of correlation between the test and a specific factor. The square of the factor loading represents the proportion of the total variability of the specific test scores which is explained by the factor. This factor is named after careful consideration of the character of the test and the distribution of all factor loadings.

In Table 3 it is shown that the factor loadings for resolution in the far visual
acuity test for the Snellen Chart, the Ortho-Rater, and Sight-Screener are of the same order of magnitude, while those for the Telebinocular are slightly lower. It is shown, further, that a part of the variability of the test scores on the Snellen Chart and Sight-Screener can be attributed to a form factor which is not effective with the other two instruments. The variability of scores on the Ortho-Rater can be attributed in part to a specific interference or machine factor. This might be eliminated by an improvement of the instrument or test targets. The Telebinocular shows both accommodation and machine factors, and represents the least pure test of retinal resolution.

For the near-vision tests the factor loadings are shown in Table 4. Here again the Snellen, Ortho-Rater, and Sight-Screener are practically equivalent as to factor loadings for resolution. The loadings on the Telebinocular are much lower for both letter and circle targets. As would be expected, an accommodation factor appears to be
TABLE 4 FACTOR LOADINGS OF NEAR VISION TESTS*

Instrument                Resolution     Accommodation   Form           Interference
Reduced Snellen           0.53 to 0.60   0.42 to 0.66    0.19 to 0.40   0.00 to 0.15
Ortho-Rater               0.54 to 0.70   0.57 to 0.70    0.02 to 0.14   0.16 to 0.37
Sight-Screener            0.59 to 0.69   0.39 to 0.53    0.01 to 0.17   0.00 to 0.23
Telebinocular (letters)   0.41 to 0.43   0.37 to 0.47    0.19 to 0.26   0.02 to 0.07
Telebinocular (circles)   0.38 to 0.56   0.39 to 0.50    0.03 to 0.18   0.02 to 0.26

* The Adjutant General's Office PRS Report No. 742, August, 1947, p. 54.
TABLE 5 FACTOR LOADINGS OF VERTICAL PHORIA TESTS*
Instrument
Fusion
Vertical Phoria 0.43 0.20
Specific 0.13 0.09
-0.27 -0.33
Maddox rod
Far Near
Ortho-Rater
Far Near
0.50 0.49
0.73 0.73
-0.05 -0.04
-0.12 0.05
Sight-Screener
Far Near
0.40 0.47
0.57 0.54
0.48 0.57
0.78 0.62
Telebinocular
Far Near
0.41
0.49
0.10
0.17
0.22 0.71 0.76
—
—
* The Adjutant General's Office PRS Report No. 742, p. 60.
present in all tests. The Ortho-Rater scores are affected slightly more by this factor as compared with the other devices. A form factor is present in both the Snellen and Telebinocular letters, which would be expected, but these two instruments present the lowest (practically zero) interference factors.

The factor loadings on the vertical phoria tests for distant and near vision on all devices are summarized in Table 5. Two factors, vertical phoria and so-called "fusion," are revealed in this analysis for all tests, while a specific instrument factor appears for the Ortho-Rater only, especially in the test at the near distance. For the tests of far vertical phoria with the Maddox rod, Sight-Screener, and Telebinocular, the factor loadings for vertical phoria are relatively low and practically equivalent. The Ortho-Rater test shows the highest factor loadings for vertical phoria. On the other hand, the Ortho-Rater shows the lowest factor loadings for "fusion" with the Telebinocular a close second. The variability of the Sight-Screener scores is greatly affected by this factor.

For the tests of vertical phoria at near vision, a large part of the variability of the Ortho-Rater scores is accounted for by a specific machine factor, peculiar to this instrument. This finding points to a definite area in which the Ortho-Rater might be improved. With the high loading already present on vertical phoria, elimination of the machine factor might greatly increase the efficiency of this instrument. The Telebinocular does not have a test of near vertical phoria. As in the far vertical phoria test the Sight-Screener scores for
TABLE 6 FACTOR LOADINGS OF LATERAL PHORIA TESTS*

Instrument               Lateral Phoria   Fusion           Specific
Maddox rod       Far     0.60             0.39             --
                 Near    0.58             0.41             --
Ortho-Rater      Far     0.81 to 0.92     0.03 to -0.10    0.09 to 0.17
                 Near    0.65 to 0.68     0.06 to 0.09     0.05 to 0.11
Sight-Screener   Far     0.71 to 0.75     -0.04 to -0.08   --
                 Near    0.47 to 0.49     0.00 to 0.01     --
Telebinocular    Far     0.71 to 0.77     0.03 to -0.04    --
                 Near    0.57 to 0.70     0.02 to -0.05    --

* The Adjutant General's Office PRS Report No. 742, August, 1947, p. 60.
near vertical phoria are greatly influenced by the "fusion" factor. The near Maddox-rod scores are affected, also, by this factor, and present the lowest factor loadings for near vertical phoria.

For the lateral phoria tests on all instruments, the factor loadings for both far and near vision are presented in Table 6. For far lateral phoria loadings, the devices are ranked as follows: Ortho-Rater highest, 0.81 to 0.92; Sight-Screener and Telebinocular close second, 0.71 to 0.77; and Maddox rod third with 0.60. The Maddox rod is the only test affected significantly by the "fusion" factor, while the Ortho-Rater presents a slight loading on a specific instrument factor. For near lateral phoria loadings, the Ortho-Rater again ranks the highest, 0.65 to 0.68, with the Telebinocular second, 0.57 to 0.70, Maddox rod third, 0.58, and the Sight-Screener the lowest, 0.47 to 0.49. Again the Maddox-rod scores are the only ones affected, significantly, by the "fusion" factor. There is a very slight machine factor affecting the Ortho-Rater scores.

The factor loadings of the tests of depth perception are presented in Table 7. The Ortho-Rater has the highest factor loading for depth, 0.61 to 0.69, while the Sight-Screener and Telebinocular have relatively low loadings on this factor, 0.43 to 0.44 and 0.32 to 0.35, respectively. The Sight-Screener scores are affected by a specific machine factor, while those of the Telebinocular are affected somewhat by both form and interference factors.

TABLE 7 FACTOR LOADINGS OF DEPTH PERCEPTION TESTS*
Instrument
Depth
Form
Interference
Ortho-Rater
0.61
0.69
0.04
0.11
Sight-Screener
0.43 0.44
0.03
0.11
Telebinocular
0.32
-0.13
-0.17
:
0.35
-0.02
--
0.03
0.03 0.06 0.21
Specific
0.30
0.31
0.38
--
* The Adjutant General's Office PRS Report No. 742, August, 1947, p. 54.

SUMMARY
1. In general, the machine tests are as reliable as, or more reliable than, the corresponding clinical tests.
2. For monocular far acuity, the coefficients of reliability are practically the same for all tests. The coefficients range from 0.78 to 0.97.
3. For monocular near acuity, the consistency of the Ortho-Rater measures is slightly greater (r = 0.80 to 0.90) than for the other devices (r = 0.71 to 0.78).
4. For binocular acuity at both far and
near distances, the reliability of the Ortho-Rater is the highest (r = 0.80 to 0.93) of the three machine tests.
5. The reliability of the clinical test of binocular acuity is equivalent to that of the Ortho-Rater for far vision (r = 0.81 to 0.97), but is lowest of all for near vision (r = 0.67).
6. For far vertical phoria, the Ortho-Rater is most consistent (r = 0.79), while the other three devices are equally less consistent (r = 0.61 to 0.64).
7. For near vertical phoria, the Ortho-Rater and clinical measures are equivalent in reliability (r = 0.73 and 0.74), while the Sight-Screener is much lower (r = 0.55).
8. Except for a slight advantage of the Ortho-Rater on far lateral phoria, all tests are equally consistent for both far and near lateral phoria.
9. The Ortho-Rater test of depth perception is the most reliable (r = 0.83), followed by the Telebinocular (r = 0.79), Howard-Dolman (r = 0.72), and Sight-Screener (r = 0.57).
10. The validity coefficients of the tests of monocular acuity for both far and near vision are practically equivalent for the Ortho-Rater and Sight-Screener (r = 0.71 to 0.84), as compared with the Telebinocular (r = 0.55 to 0.58).
11. For binocular acuity, both far and near, the validity coefficients for the Ortho-Rater are the highest (r = 0.90 to 0.70), for the Sight-Screener next (r = 0.71 to 0.64), and for the Telebinocular lowest (near only) (r = 0.55).
12. For vertical phoria, both far and near, the validity coefficients are quite low for all devices (r = 0.29 to 0.50).
13. For far lateral phoria, the validity coefficients for the Ortho-Rater are much higher (r = 0.57 to 0.70) than for the other two devices (r = 0.37 each).
14. For near lateral phoria the validity coefficients of the Ortho-Rater and Telebinocular are practically equivalent (r = 0.67 to 0.68), while that for the Sight-Screener is somewhat lower (r = 0.54).
15. In the factor analysis of far visual acuity, the Sight-Screener and Ortho-Rater are equivalent to the Snellen chart on the factor of resolution, whereas the Telebinocular loadings are lower on this factor. Significant form factors are revealed in the Snellen and Sight-Screener tests, while accommodation and interference factors affect the Telebinocular test and a specific machine factor affects the Ortho-Rater test.
16. For near vision, the factor loadings for resolution are lowest for the Telebinocular, while for the other three devices the loadings are equivalent. A factor of accommodation affects all tests of near vision, but affects the Snellen and Ortho-Rater tests more than the others. A form factor is present in the Snellen and Telebinocular tests of near vision, while a specific machine factor is present in the Ortho-Rater test.
17. Although the Ortho-Rater shows the highest loading for vertical phoria for both far and near, it suffers from a specific instrument factor, especially at near. On the other hand, it reveals the lowest factor loading for "fusion."
18. The Ortho-Rater leads all instruments in factor loadings for lateral phoria, both far and near. All machine tests show higher factor loadings for lateral phoria than the Maddox Rod test. The latter is affected, also, by the "fusion" factor.
19. The Ortho-Rater presents the highest factor loadings for depth perception. The other two devices are affected by form, interference, or specific machine factors.

CONCLUSIONS
1. It is evident that a machine test of visual factors can be used to predict clinical factors with a fair degree of accuracy and consistency.
2. For visual acuity measures, the Ortho-Rater is slightly more reliable and valid than the other devices.
3. For vertical phoria, the Ortho-Rater is most consistent, but the validity coefficients for all devices are relatively low.
4. For far lateral phoria, the Ortho-Rater is slightly more consistent and is definitely more valid than the other devices.
5. In the measurement of far visual acuity, the Ortho-Rater and Sight-Screener are equivalent as to the factor of resolution, but all three instruments are adversely affected by one or more other factors.
6. All tests for near vision are affected by factors other than resolution.
7. The Ortho-Rater presents the highest factor loading for vertical phoria, but suffers from a specific machine factor.
8. The Ortho-Rater presents the highest factor loading for lateral phoria.
9. The Ortho-Rater presents the highest factor loading for depth perception and is least affected by other factors.

907 Crescent Drive (25).
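As a worked illustration of the rule stated in the Discussion, that the square of a factor loading gives the proportion of a test's score variability explained by the factor, the following minimal Python sketch squares the resolution loadings for far visual acuity given in Table 3. The function and variable names are hypothetical and do not come from the cited reports. Reading each pair of loadings as the lower and upper end of a range is an assumption based on the way the text quotes such pairs (for example, "0.81 to 0.92").

```python
# Illustrative sketch only: names are hypothetical, not from the cited reports.
# Loadings are the far-acuity "Resolution" values of Table 3, read here
# (by assumption) as the low and high ends of a range for each instrument.

def variance_explained(loading: float) -> float:
    """Proportion of a test's score variance attributed to one factor.

    This is simply the square of the factor loading, as stated in the text.
    """
    return loading ** 2


# (low, high) resolution loadings per instrument.
RESOLUTION_LOADINGS = {
    "Snellen Chart": (0.70, 0.86),
    "Ortho-Rater": (0.77, 0.86),
    "Sight-Screener": (0.75, 0.90),
    "Telebinocular": (0.62, 0.72),
}


def variance_range(bounds):
    """Square both ends of a (low, high) loading range."""
    low, high = bounds
    return (variance_explained(low), variance_explained(high))


if __name__ == "__main__":
    for name, bounds in RESOLUTION_LOADINGS.items():
        low_share, high_share = variance_range(bounds)
        print(f"{name:15s} {low_share:.0%} to {high_share:.0%} of score variance")
```

On these figures the resolution factor would account for roughly 0.49 to 0.74 of the variability of Snellen chart scores, against roughly 0.38 to 0.52 for the Telebinocular, which is one way of reading the statement that the Telebinocular is the least pure test of retinal resolution.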
REFERENCES

U. S. ARMY
1. The Adjutant General's Office, Personnel Research Section. Vision examination. Project PR-4075 Interim Progress Report of 1 July, 1946.
2. The Adjutant General's Office, Personnel Research Section. Technical conference on vision examination. 15 November, 1946.
3. The Adjutant General's Office, Personnel Research Section. Studies in visual acuity. PRS Report No. 742, August, 1947.
4. Army Air Force, School of Aviation Medicine, Randolph Field, Texas. Comparison of results of sight screener and clinical tests. Project 480, 4 September, 1946.

U. S. NAVY
5. Medical Field Research Laboratory, Camp Lejeune, N.C. Comparative study of screening devices for visual selection of naval personnel. Bureau of Medicine & Surgery Research Project No. X-471 (Av-247-p), Mueller and Richmond, 22 May, 1946.
6. Medical Field Research Laboratory, Camp Lejeune, N.C. Study of visual acuity targets. Bureau of Medicine & Surgery Research Project No. X-671 (Av-353-p), Mueller and Richmond, 28 May, 1946.
7. Medical Research Laboratory, Submarine Base, New London, Conn. Comparison of various screening devices with standard medical visual procedure. Progress Report No. 1 of Bureau of Medicine & Surgery Research Project No. X-493 (Av-263-p), Sulzman, Farnsworth, Cook et al., November, 1945.
8. Medical Research Laboratory, Submarine Base, New London, Conn. Visual acuity measurements with three commercial screening devices. Progress Report No. 2 of Bureau of Medicine & Surgery Research Project No. X-493 (Av-263-p), Sulzman, Cook, and Bartlett, February, 1946.
9. Medical Research Laboratory, Submarine Base, New London, Conn. Comparative measures of heterophoria. Progress Report No. 3 of Bureau of Medicine & Surgery Research Project No. X-493 (Av-263-p), Sulzman, Cook, and Bartlett, February, 1946.
10. Medical Research Laboratory, Submarine Base, New London, Conn. Visual acuity measurements with three commercial screening devices. Revised edition of Progress Report No. 2 of Bureau of Medicine & Surgery Research Project No. X-493 (Av-263-p), Cook, May, 1948.
11. Medical Research Laboratory, Submarine Base, New London, Conn. A factor analysis study of visual acuity and phoria data collected by the medical research laboratory. Progress Report No. 4 of Bureau of Medicine & Surgery Research Project No. X-493 (Av-263-p), Cook, May, 1948.
12. Naval Air Training Bases, U. S. Naval Air Station, Pensacola, Florida. Comparison of Ortho-Rater with clinical ophthalmic examinations. Report No. 1 of Bureau of Medicine & Surgery Research Project No. X-499 (Av-268-p), Wolpaw and Imus, 29 September, 1945.
13. Naval Air Training Bases, U. S. Naval Air Station, Pensacola, Florida. Comparison of Ortho-Rater with clinical ophthalmic examinations. Report No. 2 of Bureau of Medicine & Surgery Research Project No. X-499 (Av-268-p), Imus, 1 March, 1946.
14. Naval Air Training Bases, U. S. Naval Air Station, Pensacola, Florida. A comparison of the reliability and validity of visual acuity test targets. Bureau of Medicine & Surgery Research Project No. X-676 (Av-367-p), Clark, 3 April, 1946.

OFFICE OF SCIENTIFIC RESEARCH AND DEVELOPMENT
15. Applied Psychology Panel: A test-retest reliability study of the Bausch and Lomb Ortho-Rater with naval personnel. OSRD Report No. 3969, August, 1944.
16. Imus, H. A.: Manual for use in the selection of fire controlmen (O). Office of Scientific Research and Development, Report No. 4050, 1944.
17. Beier, D. C., et al.: The selection of fire controlmen (O), rangefinder and radar operators. Office of Scientific Research and Development, 1945. Publ. Bd. No. 18327, Washington, D.C., U. S. Department of Commerce, 1946.

SCIENTIFIC JOURNAL ARTICLES
18. Imus, H. A.: Comparison of Ortho-Rater with clinical ophthalmic examinations. American Psychol., 30:283-284, 1946.
19. Sulzman, J. H., Cook, E. B., and Bartlett, N. R.: The reliability of visual acuity scores yielded by three commercial devices. J. Applied Psychol., 31:236-240, 1947.
20. Sulzman, J. H., Cook, E. B., and Bartlett, N. R.: The validity and reliability of heterophoria scores yielded by three commercial optical devices. J. Applied Psychol., 32:56-62, 1948.