Work Group III: Methodologic Issues in Research on the Food and Physical Activity Environments: Addressing Data Complexity

J. Michael Oakes, PhD, Louise C. Mâsse, PhD, Lynne C. Messer, PhD

Abstract:
Progress in transdisciplinary research addressing the health effects of the food and physical activity environments appears hampered by several methodologic obstacles, including: (1) the absence of clear, testable conceptual models; (2) slow adoption of practicable, rigorous research designs; (3) improper use of analytic techniques; and (4) concerns about ubiquitous measurement error. The consequence of such obstacles is that data collected as part of the typical study are more complex than they need to be. We offer diagnoses and recommendations from an NIH-sponsored meeting that addressed core issues in food- and physical activity–environment research. Recommendations include improved conceptual models and more elaborate theories, experimental thinking and increased attention to causal effect estimation, adoption of cross-validation techniques, use of existing measurement-error models, and increased support for methodologic research. (Am J Prev Med 2009;36(4S):S177–S181; doi:10.1016/j.amepre.2009.01.015) © 2009 American Journal of Preventive Medicine
Introduction
The relationship between the food and physical activity environments and health behavior is an important area of research. Of note is the growing transdisciplinary nature of this work, which draws on fruitful insights from epidemiology, nutrition, public policy, transportation, urban planning and design, as well as the behavioral and social sciences. The 2007 Measures of the Food and Built Environments workshop, sponsored by the NIH and the Robert Wood Johnson Foundation (RWJF), included four work groups that deliberated on various aspects of these issues.1 Although a growing body of literature suggests that the food and physical activity environments may influence nutrition behaviors and physical activity, this research appears hampered by several methodologic obstacles, including a lack of clear conceptual models, the slow adoption of rigorous research designs, limited use of proper analytic techniques, and ubiquitous measurement error. Not surprisingly, the impact of such obstacles is that data collected as part of a typical study investigating the food and physical activity environments are unnecessarily complex, a term we use to describe data sets with a very large number of highly correlated manifest variables, collected at different geosocial levels, measured over various time periods, and bearing complicated if not unknowable (causal) relationships. This paper, which builds on the discussions of Work Group III, addresses the issues of data complexity in food- and physical activity–environment research and offers recommendations for overcoming these obstacles, principally by avoiding them altogether.

From the Division of Epidemiology, University of Minnesota (Oakes), Minneapolis, Minnesota; the Department of Pediatrics, University of British Columbia (Mâsse), Vancouver BC, Canada; and the Center for Health Policy, Duke University (Messer), Durham, North Carolina.

Address correspondence and reprint requests to: J. Michael Oakes, PhD, Division of Epidemiology, University of Minnesota, 1300 South 2nd Street, Minneapolis MN 55454. E-mail: [email protected].
Challenges

The fundamental objective of food- and physical activity–environment research is simple: identify characteristics of the food and physical activity environments that affect health outcomes, and determine which components may be altered so as to induce behavior change and ultimately improve health. The problem lies in putting this objective into practice, which is far from simple due to high-dimension multilevel structures and endogenous effects.2–6 Many questions naturally follow from any attempt to measure these environments: What should I measure? How should I measure it? How should I analyze the data? And what can I credibly infer?

Take, for example, the hypothesized effect of poor sidewalks on a person's propensity to walk or exercise. Although seemingly simple, it is not clear whether the width, surface type, overgrowth, or upkeep of the sidewalk, or its transitions to or connectedness with other walking surfaces, is most important. It also is not clear whether a potential user's perception of the sidewalk matters more than verifiable objective measures of it. And all of this says nothing about whether interrelationships (i.e., moderators and mediators) with other factors such as weather, season, criminal activity, other
walkers, or related phenomena affect walking behavior. Nor does it consider that these relationships may vary across levels of urbanization or other attributes of the environment. It follows that the seemingly simple question of how to evaluate a sidewalk's impact on walking behavior is complex. Consequently, any data collected to answer the question tend to become complex, too.

It is critically important to mitigate, if not overcome, data complexity and related methodologic issues. Not doing so risks wasting substantial resources on inferior research and perhaps even diluting the public trust in food- and physical activity–environment research. Additionally, policy change based on faulty results may actually produce negative consequences that hamper, rather than help, the public's health. It is for these and related reasons that the data-complexity issue merits serious attention.7

Many researchers may see the data-complexity problem as one that could be addressed through various statistical methods, for example, factor analysis, cluster analysis, index development, neural networks, or data mining. We do not, at least to begin with. The reason rests with the size of many data sets and the inherent correlations between measures that result from social stratification; inferential procedures that rely on such sample sizes and correlations are bound to mislead. By contrast, we view the problem of data complexity as a consequence of conceptualization and related "upstream" research phenomena, as described below.
Challenge #1: Conceptual Models and Theories

The first barrier is surely the lack of precise conceptual models and elaborate theories of exactly which factors are presumed to affect which behaviors, under which circumstances, and by how much. This is especially important in observational designs, wherein one theorizes from effect to potential cause. Without better theory, researchers are prone to mismeasure (or rely on mismeasured) constructs and to search for any relationship across an infinite array of variables or constructs. This practice only increases data complexity.

Why is elaborate theory needed? Because, as noted by the statistician R.A. Fisher, in the absence of randomization one should entertain as many explanations for an effect as possible and conduct analyses to uphold or dismiss each one.8,9 Finding a significant effect is merely the first step. Demonstrating that no other explanation can account for the effect is where the real work and benefit lie. The fact that it is typically impossible to rule out all competing explanations in an observational design must not inhibit efforts toward this end. Understanding will advance by clear articulation and empirical demonstration of competing explanations. Such results are the grist for tighter theories and future research.
Perhaps the lack of theoretical model testing should not be surprising given the reward structures for (academic) research: no peer-review demand exists for such testing, and in food- and physical activity–environment research, the risk of being discounted is great due to its transdisciplinary nature. A conceptual model acceptable to an economist differs from one acceptable to a nutritionist. This makes conceptual advancement very difficult.
Challenge #2: Study Designs

Different study designs require different data-collection efforts. For example, a simple RCT may, in the main, require only the recording of assigned treatment condition and measurement of the outcome variable. Such data sets are simple in structure, perhaps even containing just two variables, but extremely useful. In contrast, a cross-sectional design requires a complex data set so that analysts can consider confounding, endogeneity, selection, and other threats to validity across different levels (e.g., person and place). Research designs are thus inextricably related to data complexity.

Cross-sectional designs are used predominantly to examine the influence of the food and physical activity environments. Although the limitations of such designs are well known, it is often assumed that prospective, naturalistic, or longitudinal designs will yield less-complicated data and thus stronger inferences. Although longitudinal designs are helpful, it is important to recognize that they too fall within the class of observational (as opposed to experimental) designs, and so data resources will remain complex and inferences suspect. This is because the food and physical activity environments (e.g., sidewalks or fast-food restaurants) are partially endogenous (i.e., built by people), which means the commonly understood advantage of longitudinal designs in infectious disease research (i.e., random infections) is reduced. The upshot is that it remains difficult for many food- and physical activity–focused longitudinal designs to disentangle temporality, which is the principal advantage of the design.
Overall, complicating factors such as endogeneity, selection bias, and time-dependent confounding will remain threats to longitudinal food- and physical activity–environment studies.3,10

From a data-complexity perspective, qualitative and/or case-study investigations yield more-complicated and nuanced data than do quantitative studies. But when properly employed, such designs can improve conceptual understanding and thus lead to a better understanding of the data employed in typical quantitative investigations.11,12 The apparent reluctance to conduct and appreciate qualitative and case-study investigations is troubling. Few seem interested in conducting or supporting "shoe-leather" research wherein researchers actually go into the field, observe phenomena, and
collect rich data relevant for explanation and perhaps intervention. In this way, and somewhat ironically, the de facto rejection of more-complex case-study designs results in more data complexity for conventional studies.
Challenge #3: Analytic Techniques

It is true that high-dimension data sets can be reduced using statistical techniques, such as factor and cluster analysis, but without a strong conceptual rationale for measuring and including constituent variables and observations, analysts may not gain much clarity at all. Currently, it appears that analysts are overly confident in the ability of statistical procedures to solve complex data problems.3,6,9,11,13–16 Additionally, there is a troubling level of overconfidence in regression-adjustment procedures for mitigating the ubiquitous effects of confounding in observational designs. Few seem to appreciate that results derived from regression adjustment rest on often untestable assumptions, principally exchangeability, linearity, and other ceteris paribus assumptions.11,17 The problem lies not with making assumptions (we must do so) but with failing to recognize and convey the implications of doing so to research consumers. Overall, the distinction between statistical inference, which seeks to tie sample statistics to unknown population parameters, and scientific explanation, whose main goal is to offer a defensible account of the phenomena under investigation, remains misunderstood by too many.18
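The fragility of the linearity assumption can be illustrated with a small simulation (entirely hypothetical; all variable names and values are ours, not drawn from any study): when a confounder influences both exposure and outcome through its square, adjusting for it only linearly leaves a substantial spurious "effect" of the exposure even though the true causal effect is exactly zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical data-generating process: confounder c drives both the
# exposure x and the outcome y through c**2; x has NO causal effect on y.
c = rng.normal(size=n)
x = c**2 + rng.normal(size=n)
y = c**2 + rng.normal(size=n)

def exposure_coef(design, y):
    """OLS coefficients via least squares; column 1 of `design` is x."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]

ones = np.ones(n)
# Adjusting for c linearly (misspecified) yields a large spurious coefficient.
b_linear = exposure_coef(np.column_stack([ones, x, c]), y)
# Adjusting for c and c**2 (correctly specified) recovers the null effect.
b_correct = exposure_coef(np.column_stack([ones, x, c, c**2]), y)
```

Both models "adjust for the confounder," yet only the correctly specified one supports the ceteris paribus reading that consumers of regression results typically assume.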
Challenge #4: Measurement Error Concerns

Measurement error is a ubiquitous phenomenon that complicates food- and physical activity–environment data and analysis. We consider two related issues: scale development and the adoption of measurement-error models.

First, it appears that many researchers are uncertain about how to properly (i.e., rigorously) develop and assess measures and, where appropriate, create scales and indices tapping latent variables for subsequent analytic inquiry. This is understandable, as the move from conventional psychometric research into ecometric (not "econometric") research, where scales and indices tap latent ecologic-level constructs, is in its infancy. But the evolution seems necessary.19,20 Further, the focus of recent measurement-error studies appears to be almost exclusively on reliability assessment, with little attention to its precursor, validity assessment. Both kinds of assessment are clearly needed. In addition, variation in the reliability and validity of scales across local contexts merits closer scrutiny.

Second, food- and physical activity–environment research appears to be paying too little attention to the existing literature on measurement and misclassification error. Because perfect measurement is impossible, it is important to recognize that calls for improving measures have thus far not addressed the fundamental question of which level of precision is sufficient to advance the area of research. A related question is how acceptable thresholds should differ by class of measure: outcome measures, primary exposures, and confounder variables. What is more, insufficient attention has been paid to the impact of measurement error on standard-error estimates and bias. Fairly straightforward models and tools are available to analysts concerned about the impact of measurement error,9,21–23 but seemingly few are used.
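One of the most straightforward of these results, classical attenuation bias, can be sketched in a few lines (a hypothetical simulation of ours, not an analysis from the literature): nondifferential error added to an exposure shrinks the estimated slope toward zero by the reliability ratio λ = Var(x)/(Var(x) + Var(u)).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical setup: true exposure x (say, objectively measured sidewalk
# quality) affects y with slope 1.0, but the analyst observes only
# x_obs = x + u, where u is classical (nondifferential) measurement error.
x = rng.normal(size=n)
u = rng.normal(scale=1.0, size=n)   # error variance equals Var(x), so lambda = 0.5
y = 1.0 * x + rng.normal(size=n)
x_obs = x + u

# Simple-regression slopes via moments: Cov(exposure, y) / Var(exposure).
slope_true = np.cov(x, y)[0, 1] / np.var(x)
slope_naive = np.cov(x_obs, y)[0, 1] / np.var(x_obs)
# slope_naive is attenuated toward zero by the reliability ratio (0.5 here).
```

The point is not the specific numbers but that the direction and magnitude of this bias are predictable from existing theory, so analysts need not treat measurement error as an unquantifiable nuisance.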
Recommendations

This section summarizes our recommendations for overcoming data complexity in food- and physical activity–environment research. The aim is to offer practicable, usable advice to investigators aiming to advance the literature. Each recommendation contains several sub- or related recommendations. Again, we stress "preliminary" or "upstream" recommendations over specific statistical data-reduction techniques because such techniques seem secondary.
Think More

As mentioned above, a key obstacle to advancing food- and physical activity–environment research is the lack of falsifiable constructs or theory for explaining the phenomena under investigation. Development of such constructs requires careful thought and cannot, in principle, be done through empirical research alone. Researchers should spend a substantial amount of time thinking through a given study, from data collection to analysis to conclusions to limitations. Doing so will likely yield explicit models and sharper hypotheses, and the progress of food- and physical activity–environment research will accelerate.

Part of the thinking stage obviously includes consideration of past work. In the case of food- and physical activity–environment research, increased attention to the growing literature on the challenges of "contextual" effects and measurement error would be beneficial. Issues of multilevel exposures, selection, and troubling dynamic feedback loops in which outcomes affect exposures must be considered.24 The statistician George Box noted that we should measure our models not by how perfectly they represent reality but rather by their utility for helping analysts understand it. It is important to remember that he also insisted that scientific progress requires an iterative process between conceptual thinking and empirical investigation.25,26 Attention to the former is now in order and should mitigate some of the confusion associated with complex data.
Imagine Experiments

Food- and physical activity–environment research presents few opportunities to conduct rigorous controlled experimental trials. Most researchers are more or less forced to rely on observational designs, such as cross-sectional surveys or cohort studies. When both designing and analyzing observational data, however, it is best to imagine what the desired experiment would be for a given research question, and then work mentally backward from there to the possible design and analysis that can be practically conducted. The benefits of imagining desired but impracticable (hypothetical) experiments and working backward are profound.9,27 The effort highlights the limitations (e.g., confounding) of any given analysis and forces researchers to take seriously the estimation of causal effects, as opposed to associations and other inferior parameters, as a goal. Finally, thinking experimentally may help food- and physical activity–environment researchers seek out natural experiments and other opportunities that free measures from troubling endogenous forces.28

Overall, the imagined-experiment approach to analysis should help researchers overcome their mass of complex data by forcing them to consider only sharp, if not narrow, hypotheses. The approach should also promote the planning of observational studies and consideration of the (environmental) variability necessary for sound inference. Finally, the experimental-thinking approach tends to highlight the validity of competing explanations, and thereby helps move scientific research forward.
Cross-Validate

Investigators should not perform data-reduction techniques (e.g., factor or cluster analysis) in a given data set and then use both the same data and the resultant factors in a subsequent regression or related analysis. The reason lies with the foundations of (frequentist) statistical inference. Using data-reduction techniques is perfectly acceptable, but using the results of such techniques for analysis with the same data capitalizes on the chance relationships in the given data set.29 The consequence is an increase in Type-I errors, that is, p-values that are artificially too small. Ideally, one should conduct data reduction in one data set and apply the resulting constructs for analysis in another, independently sampled, data set.30 An alternative is data splitting, which is effectively the same thing.31 An analysis that does, in fact, perform data reduction in one data set and use the same data and results for analytic purposes must be viewed as exploratory or preliminary. Although such work has many benefits, interpretive caution is paramount.

Relatedly, there is much to gain from applying a (regression) model fit in one data set to the data in another. This process is the acid test of model fit, and much can be learned from dispassionate assessment. It follows that it is critical for researchers to clearly discuss and present their models so that others can take them seriously and test the very same models in their own data. Such work should help overcome problems associated with complex data.
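How selecting and testing in the same data capitalizes on chance can be shown with a small simulation of our own devising (all quantities hypothetical): among many candidate "environment" measures that are pure noise, the one most correlated with the outcome in-sample looks nominally significant, yet an independent sample exposes it as noise.

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 2000, 200

# Hypothetical scenario: p candidate environment measures, all pure noise,
# and an outcome y unrelated to every one of them.
X_train = rng.normal(size=(n, p))
y_train = rng.normal(size=n)

def abs_corr(X, y):
    """Absolute Pearson correlation of each column of X with y."""
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = (y - y.mean()) / y.std()
    return np.abs(Xc.T @ yc) / len(y)

r_train = abs_corr(X_train, y_train)
best = int(np.argmax(r_train))       # measure "discovered" in-sample

# Honest check: evaluate the selected measure in an independent sample.
X_test = rng.normal(size=(n, p))
y_test = rng.normal(size=n)
r_test = abs_corr(X_test, y_test)[best]

# In-sample, the winner exceeds the rough significance bound 2/sqrt(n)
# purely by chance; out-of-sample its correlation collapses toward zero.
```

The same logic motivates deriving factors in one half of a split sample and fitting the substantive regression in the other.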
Exploit Existing Measurement-Error Research

Discussions about measurement-error problems in food- and physical activity–environment research are often limited to discussions of improving measurement tools (e.g., accelerometers, diaries). Although useful, such solutions fail to recognize important advances in measurement-error models and the utility of such approaches for sound inference.32 Food- and physical activity–environment research would likely benefit from insights into the expected impacts on inference of measurement error in outcome, exposure, and/or potential confounding variables. Among others, a key point is that nondifferential measurement error is typically far less threatening than the vastly more complicated and troublesome differential measurement error. Adoption of contemporary sensitivity-analysis techniques should also help clarify the level of concern researchers should have about measurement error.22
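As one example of what this literature offers, a simple method-of-moments correction (in the spirit of regression calibration; our hypothetical sketch, not a prescription from the cited works) uses replicate measurements of an error-prone exposure to estimate the error variance and de-attenuate the naive slope.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

# Hypothetical substudy with two replicate measurements of the exposure.
x = rng.normal(size=n)                    # true (unobserved) exposure
y = 0.8 * x + rng.normal(size=n)          # true slope is 0.8
x1 = x + rng.normal(scale=0.7, size=n)    # replicate measure 1
x2 = x + rng.normal(scale=0.7, size=n)    # replicate measure 2

# Naive analysis uses the first error-prone measure and is attenuated.
naive = np.cov(x1, y)[0, 1] / np.var(x1)

# Replicates identify the error variance: Var(x1 - x2) = 2 * Var(u).
var_u = np.var(x1 - x2) / 2
reliability = (np.var(x1) - var_u) / np.var(x1)
corrected = naive / reliability           # de-attenuated slope estimate
```

Corrections of this kind assume classical, nondifferential error; when that assumption is doubtful, the sensitivity-analysis techniques noted above become the appropriate tool.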
Increase Support for Methodologic Work

More attention and resources must be devoted to several especially troubling methodologic obstacles in food- and physical activity–environment research. We must better understand the problems of exchangeability, selection, multilevel dynamics, interference, and the feasibility or practicability of proposed food and/or physical activity policy changes. Despite their appearance in peer-reviewed journals, recent efforts to collect cross-sectional data, apply a multilevel regression model, and report an "association" will not help much. Although granting agencies, such as NIH, RWJF, and the National Science Foundation, should insist on and provide better support for methodologic research and training, food- and physical activity–environment researchers must do their part by holding each other's work to higher methodologic standards. Methodologic research may be advanced in so-called primary methods studies (i.e., solely methodologic projects) or secondary (i.e., add-on) methods investigations, especially as associated with experimental investigations. The results of more-tailored methodologic work and higher methodologic standards, beginning with the design of a study, should help mitigate problems associated with complex food- and physical activity–environment data.
Conclusion

Food- and physical activity–environment research often yields data that are complex and thus difficult to analyze and interpret. Investigators may wish to employ sophisticated statistical/analytic tools to simplify things. In concert with other conference participants, we see the problem, and thus the solution, differently. As
opposed to factor-analytic or related data-mining techniques, we maintain that careful and experimental thinking, cross-validation, and a richer understanding of measurement error are the best path forward. Toward this end, qualitative case-study research and increased support for methodologic investigation appear necessary.

No financial disclosures were reported by the authors of this paper.
References

1. McKinnon RA, Reedy J, Handy SL, Rodgers AB. Measuring the food and physical activity environments: shaping the research agenda. Am J Prev Med 2009;36(4S):S81–S85.
2. Leamer EE. Specification searches. New York: Wiley, 1978.
3. Manski CF. Identification problems in the social sciences. Cambridge MA: Harvard University Press, 1995.
4. Robins JM, Greenland S. The role of model selection in causal inference from nonexperimental data. Am J Epidemiol 1986;123:392–402.
5. Auchincloss AH, Roux AVD. A new tool for epidemiology: the usefulness of dynamic-agent models in understanding place effects on health. Am J Epidemiol 2008;168:1–8.
6. Oakes JM. The (mis)estimation of neighborhood effects: causal inference for a practicable social epidemiology. Soc Sci Med 2004;58:1929–52.
7. Cummins S, Curtis S, Roux AVD, Macintyre S. Understanding and representing 'place' in health research: a relational approach. Soc Sci Med 2007;65:1825–38.
8. Cochran WG. The planning of observational studies of human populations. J R Stat Soc [Ser A] 1965;128:243–65.
9. Rosenbaum P. Observational studies. New York: Springer-Verlag, 2002.
10. Durlauf SN. Neighborhood effects. In: Henderson JV, Thisse J-F, eds. Handbook of regional and urban economics. Amsterdam: North Holland, 2004:2173–2242.
11. Berk R. Regression analysis: a constructive critique. Thousand Oaks CA: Sage Publications, 2004.
12. Freedman DA. On types of scientific inquiry: nine success stories in medical research. In: Box-Steffensmeier JM, Brady HE, Collier D, eds. The Oxford handbook of political methodology. New York: Oxford, 2008:300–18.
13. Freedman D. Statistical models and shoe leather (with discussion). In: Marsden PV, ed. Sociological methodology. San Francisco: Jossey-Bass, 1991:291–358.
14. Leamer EE. Let's take the con out of econometrics. Am Econ Rev 1983;73:31–43.
15. Oakes JM. Commentary: advancing neighbourhood-effects research: selection, inferential support, and structural confounding. Int J Epidemiol 2006;35:643–7.
16. Oakes JM, Forsyth A, Schmitz KH. The effects of neighborhood density and street connectivity on walking behavior: the Twin Cities Walking Study. Epidemiol Perspect Innov 2007;4:16.
17. Cole SR, Hernán MA. Constructing inverse probability weights for marginal structural models. Am J Epidemiol 2008;168:656–64.
18. Greenland S. Summarization, smoothing, and inference in epidemiologic analysis. Scand J Soc Med 1993;21:227–32.
19. Messer LC. Invited commentary: beyond the metrics for measuring neighborhood effects. Am J Epidemiol 2007;165:868–71; discussion 872–3.
20. Raudenbush SW, Sampson RJ. Ecometrics: toward a science of assessing ecological settings, with application to the systematic social observation of neighborhoods. In: Sobel ME, Becker MP, eds. Sociological methodology. Boston MA: Blackwell, 1999:1–41.
21. Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM. Measurement error in nonlinear models: a modern perspective. 2nd ed. New York: Chapman & Hall/CRC, 2006.
22. Gustafson P, Greenland S. The performance of random coefficient regression in accounting for residual confounding. Biometrics 2006;62:760–8.
23. Bound J, Brown C, Mathiowetz N. Measurement error in survey data. In: Heckman JJ, Leamer E, eds. Handbook of econometrics, Vol 5. New York: North-Holland, 2001:3705–843.
24. Oakes JM. Invited commentary: rescuing Robinson Crusoe. Am J Epidemiol 2008;168:9–12.
25. Box GEP. Science and statistics. J Am Stat Assoc 1976;71:791–9.
26. Box GEP, Draper NR. Empirical model-building and response surfaces. New York: Wiley, 1987.
27. Robins JM. Data, design, and background knowledge in etiologic inference. Epidemiology 2001;12:313–20.
28. Glymour MM. Natural experiments and instrumental variable analyses in social epidemiology. In: Oakes JM, Kaufman JS, eds. Methods in social epidemiology. San Francisco: Jossey-Bass/Wiley, 2006:423–45.
29. Austin PC, Mamdani MM, Juurlink DN, Hux JE. Testing multiple statistical hypotheses resulted in spurious associations: a study of astrological signs and health. J Clin Epidemiol 2006;59:964–9.
30. Browne MW. Cross-validation methods. J Math Psychol 2000;44:108–32.
31. Picard RR, Berk KN. Data splitting. Am Stat 1990;44:140–7.
32. Cole SR, Chu H, Greenland S. Multiple-imputation for measurement-error correction. Int J Epidemiol 2006;35:1074–81.