Validity and reliability of a video questionnaire to assess physical function in older adults


Experimental Gerontology 81 (2016) 76–82


Experimental Gerontology journal homepage: www.elsevier.com/locate/expgero

Anoop Balachandran a, Chelsea N. Verduin a, Melanie Potiaumpai a, Meng Ni a, Joseph F. Signorile a,b,⁎

a University of Miami, Laboratory of Neuromuscular Research and Active Aging, Department of Kinesiology and Sports Sciences, Coral Gables, FL, USA
b Miller School of Medicine, Center on Aging, University of Miami, Miami, FL, USA

Article info

Article history: Received 2 December 2015; received in revised form 26 April 2016; accepted 28 April 2016; available online 16 May 2016
Section Editor: Christiaan Leeuwenburgh
Keywords: Physical function; Video questionnaire; Validity; Reliability; Elderly

Abstract

Background: Self-report questionnaires are widely used to assess physical function in older adults. However, they often lack a clear frame of reference, so interpreting and rating task difficulty levels can be problematic for the responder. Consequently, the usefulness of traditional self-report questionnaires for assessing higher-level functioning is limited. Video-based questionnaires can overcome some of these limitations by offering a clear and objective visual reference for the performance level against which the subject is to compare his or her perceived capacity. Hence, the purpose of the study was to develop and validate a novel, video-based questionnaire to assess physical function in older adults living independently in the community.

Methods: A total of 61 community-living adults, 60 years or older, were recruited. To examine validity, 35 of the subjects completed the video questionnaire and two types of physical performance tests: a test of instrumental activities of daily living (IADLs) included in the Short Physical Functional Performance battery (PFP-10), and a composite of 3 performance tests (30 s chair stand, single-leg balance and usual gait speed). To ascertain reliability, two-week test-retest reliability was assessed in the remaining 26 subjects, who did not participate in validity testing.

Results: The video questionnaire showed a moderate correlation with the IADLs (Spearman rho = 0.64, p < 0.001; 95% CI (0.4, 0.8)), and a lower correlation with the composite score of physical performance tests (Spearman rho = 0.49, p < 0.01; 95% CI (0.18, 0.7)). The test-retest assessment yielded an intra-class correlation (ICC) of 0.87 (p < 0.001; 95% CI (0.70, 0.94)) and a Cronbach's alpha of 0.89, demonstrating good reliability and internal consistency.
Conclusions: Our results show that the video questionnaire developed to evaluate physical function in community-living older adults is a valid and reliable assessment tool; however, further validation is needed for definitive conclusions. © 2016 Elsevier Inc. All rights reserved.

⁎ Corresponding author at: Department of Kinesiology and Sport Sciences, University of Miami, 1507 Levante Ave, Max Orovitz, Rm 114, Coral Gables, FL 33146, USA. E-mail address: [email protected] (J.F. Signorile).
http://dx.doi.org/10.1016/j.exger.2016.04.022
0531-5565/© 2016 Elsevier Inc. All rights reserved.

1. Introduction

Persons 65 years or older represent the fastest growing age group in the United States (Werner, 2010). It is estimated that by 2050 there will be 83.7 million older adults, almost double the estimated population of 43.1 million in 2012 (Vincent and Velkoff, 2010). It is well-established that aging results in a progressive decline in skeletal muscle mass and strength (Frontera et al., 2008; Gallagher et al., 2000). This age-related loss of muscle and strength can gradually lead to the loss of physical independence, increased fall probability, and reduced quality of life in older persons (Janssen et al., 2004; Lord et al., 1994). Maintaining function and independence is as important as prolonging life expectancy in older adults (Katz et al., 1983). Considering the expected rise in the elderly population, preserving or maintaining physical function poses a significant public health concern. In recognition of the problem, one of the objectives proposed by the United States Department of Health and Human Services' Healthy People 2020 initiative is to reduce the proportion of older adults with moderate to severe functional limitations (United States Department of Health and Human Services (HHS), Healthy People 2020. Washington, DC.). Therefore, developing valid and practical instruments to assess physical function in the clinic, or to evaluate the effectiveness of interventions designed to improve function, is a critically important topic in the field of aging.

Assessment of physical function employs two basic approaches: self-report (or proxy) and performance measures (Guralnik et al., 1994; Haywood et al., 2006). Both are complementary, yet distinct, constructs and have inherent advantages and disadvantages (Hoeymans et al., 1996). Unlike self-reports, performance measures are less influenced by ceiling effects, cognitive impairments, education, culture, and language (Guralnik et al., 1989). Hence, performance measures are often preferred, whether separately or in concert with self-reported measures (Brach et al., 2002; Guralnik et al., 1994; Kivinen et al., 1998). Although a


limited number of studies in certain sub-groups have shown self-report to be as sensitive as performance based tests, performance tests are generally regarded to be more sensitive to change (Fried et al., 2000; Fritz and Piva, 2003). Although the benefits of performance testing are clearly defined, their use when assessing physical function is limited due to feasibility issues including space requirements, special equipment, trained personnel, physical and temporal burdens placed upon the subject, injury potential, the willingness of the subject to exert necessary effort, and the assessment time and cost to the testing facility. Owing to these drawbacks, self-reported measures are widely used to evaluate physical function in clinical settings and large scale studies. One of the major limitations of self-reported measures, however, is the lack of a strict definition or frame of reference to the questions. For example, a self-reported assessment of stair climbing could be interpreted differently by people with similar functional and cognitive abilities. The responses could differ based on the perceived architectural characteristics of the stairs to which they are accustomed, such as the inclination, height and depth of the steps, presence or absence of a hand rail, and the number of steps. In addition, without a clear objective reference for self-comparison, interpretation of difficulty levels tend to be problematic during self-reports. Hence, most traditional self-report measures are limited to measuring inability, need for assistance or difficulty when performing a task, rather than higher level of functioning in independently-living individuals (Guralnik and Simonsick, 1993). Video questionnaires, which provide a clear frame of reference, can overcome some of the drawbacks inherent in self-reported questionnaires of physical function. 
Also, since there is an objective reference against which questions can be prepared, subjects could have fewer problems interpreting the difficulty levels. For instance, a video clip of a person climbing stairs can clearly show the task required and also the architectural characteristics of the stairs. In addition, how fast the person in the video climbs the stairs can serve as a consistent visual reference for the performance level against which the subject is to compare his or her perceived capacity. Further, given the rapid advances in the field of tablets (handheld touch-screen computers with wireless connectivity), the delivery, usability, and cost of video questionnaires pose little problem. Considering the availability of the technology, the advantages of video presentation for self-reporting, and the lack of studies looking at the effectiveness of video questionnaires to assess function, this constitutes a promising area for aging research. To our knowledge, only two video questionnaires have been investigated: the Animated Activity Questionnaire (AAQ) for assessing basic activities of daily living (BADLs) in patients with hip/knee osteoarthritis (OA) and the Mobility Assessment Tool (MAT-sf) to assess mobility limitations in a moderately functional elderly cohort (Rejeski et al., 2010). Unlike mobility, instrumental activities of daily living (IADLs) constitute complex tasks necessary for functioning independently in the community such as doing laundry, carrying groceries, climbing stairs, and dressing (Lawton and Brody, 1969). Considering that the IADLs involve complex and strenuous physical tasks, they usually demonstrate substantial declines before decreases in BADLs become evident. Therefore, assessing IADLs can help identify individuals who are at high risk of disability, but who otherwise appear independent and report little disability.
Also, since IADLs directly reflect tasks associated with daily living, they may better assess disability than physical performance tests. On the other hand, physical performance tests such as the chair stand, single-leg balance, and gait speed, reflect measures closer to the concept of impairment or functional limitation (Hoeymans et al., 1996). These considerations should be taken into account when constructing assessment tools designed to allow targeted activity-based interventions, since individuals' exercise responses are specific to the nature of the overload, and persons having higher functional status or moderate frailty tend to benefit the most from these interventions (Gill et al., 2002).


In light of the importance of assessing physical functioning in older adults and the benefits of using a video questionnaire as a mode of self-reporting, we propose a new video assessment tool to evaluate IADLs in independently living older individuals. Therefore, the aim of the current study was to develop and validate a computer-administered video questionnaire to assess IADLs in healthy, older adults. Our primary hypothesis was that the video questionnaire would exhibit a moderate correlation with the actual performance of the IADLs. Additionally, we hypothesized that the video questionnaire would have lower correlations with physical performance tests which measure impairment rather than activities of daily living (IADLs).

2. Material and methods 2.1. Item selection and questionnaire development The questionnaire was designed to evaluate IADL performance in community-dwelling older adults. For item development, we used items from a pre-existing physical function performance measure, the Physical Functional Performance 10 Test (PFP-10). The PFP-10 has been shown to be valid, reliable and sensitive to change in ambulatory adults over 60 years of age (Cress et al., 1996, 1999, 2005). The PFP-10 measures the ability to perform 10 IADLs making it a measure of daily function rather than physical performance. Also, the PFP-10 was specifically designed to minimize ceiling or basement effects and uses several physical domains (upper or lower body strength, flexibility, balance, coordination and endurance). Items in the PFP-10 were selected based on theory, expert opinion, focus groups, and feasibility criteria (Cress et al., 2005). Ten videos of IADLs included in the PFP-10 were recorded. To minimize any ceiling effect, the videos were recorded using young healthy adults, both male and female, who were instructed to perform the task as fast as possible. The videos were edited and rendered using a video editing software package (Adobe Premiere PrO CS5, San Jose). The videos were then interfaced using an online questionnaire (Qualtrics, Provo, UT). The participant could repeat the videos as many times as they wished, responding by moving a slider along a visual analog scale ranging from 0–100 so they could edit answers as they felt appropriate. All questions had to be answered to complete the questionnaire successfully. A snapshot of the video questionnaire with the visual scale and navigation elements is shown in Fig. 2. All video items were accompanied by a task question. The questions were framed in the present tense using a capability (can do) stem rather than a performance stem (does do), like “can you do the task” as opposed to “are you able to”. 
For example, the floor rise video was accompanied by the question, “Can you sit down and get up from the floor as fast as the person in the video?” Additionally, rather than using a typical 4 or 5 point rating scale, a visual analog scale ranging from 0–100 was used. The visual analog scale was anchored at three points: cannot do the task (0), can do the task as fast as the person in the video (50), and can do the task faster than the person in the video (100). The questionnaire was scored by simply adding the individual item scores. The difficulty in performing the task was assessed by the individual's perception of the time he or she would require to complete a task relative to the performer in the video. Therefore, people who considered themselves to have the greatest difficulty in performing tasks, and therefore to require the longest perceived time to complete those tasks, were expected to choose lower scores, while those with lower perceived time requirements and difficulty levels were expected to choose higher scores. The clarity, comprehension, and usability of the scale were assessed qualitatively using 15 subjects similar in age and function. Revisions were made to the online survey based on their feedback. For example, font size, video size, visual slider length and navigation elements were enlarged to enhance usability.
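The scoring rule described in this section (summing the 0–100 slider scores, with every item required) can be sketched as follows. This is only an illustration: the item keys and the function name are hypothetical and do not come from the study's Qualtrics implementation.

```python
from typing import Dict, List

# Hypothetical item keys for illustration; the study's survey identifiers are not given.
ITEMS: List[str] = ["floor_rise", "laundry", "groceries", "weighted_pan", "sweeping"]


def score_questionnaire(responses: Dict[str, float], items: List[str]) -> float:
    """Sum the visual analog scale scores (0-100 each) over all items.

    Mirrors two rules stated in the text: every item must be answered,
    and the total is the plain sum of the individual item scores.
    """
    missing = [item for item in items if item not in responses]
    if missing:
        raise ValueError(f"unanswered items: {missing}")
    for item in items:
        if not 0 <= responses[item] <= 100:
            raise ValueError(f"score for {item} is outside the 0-100 scale")
    return sum(responses[item] for item in items)


example = {"floor_rise": 40, "laundry": 55, "groceries": 60,
           "weighted_pan": 50, "sweeping": 70}
total = score_questionnaire(example, ITEMS)
```

With five items, totals under this sketch range from 0 (cannot do any task) to 500 (can do every task faster than the person in the video).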


Fig. 1. Participant flow for the validity and reliability component of the study.

Fig. 2. An example of a screen shot from the video questionnaire.


2.2. Performance tests

Performance tasks were of two types: the PFP-10 and a composite score of the 30 s chair stand, single-leg balance and usual gait speed. The PFP-10 measures IADLs, whereas the composite score measures physical performance. As stated previously, IADLs are closer to the concept of disability, whereas performance tests measure functional impairments. The composite score of the 3 tests is referred to as the physical performance test.

2.2.1. PFP-10

The PFP-10 incorporates ten IADL tasks that involve one or two of the physical function domains (upper or lower body strength, flexibility, balance, coordination and endurance). These tasks were quantified using time, weight or both. The ten tasks were sitting down and standing up from the floor, transferring laundry from washer to dryer and back to a basket, carrying groceries, putting on and removing a jacket, picking up four scarves from the floor, climbing a flight of stairs, removing a sponge from an adjustable overhead platform, carrying a weighted pan, a 6-minute walk, and floor sweeping with a broom and dust pan. The details of the tests are given elsewhere (Cress et al., 1996). The total score of all ten tests was calculated by converting the individual test scores to Fisher's z-scores and totaling the individual scores. The z-scores were calculated by dividing the difference between each subject's test score and the test score mean by the cohort's standard deviation. Tasks that participants could not perform were scored by adding the standard deviation to the slowest time obtained for that specific task.

2.2.2. Physical performance test

Similar to the Short Physical Performance Battery (SPPB), a 30-s repeated chair stand, 10 m usual gait speed and single-leg balance were assessed as measures of lower body physical performance, collectively termed the physical performance test. A composite score of the three performance tests was calculated by converting the individual scores into Fisher's z-scores and computing a single total score.
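The z-score composite and the rule for tasks a participant could not perform can be illustrated with a short sketch. Several details are assumptions, since the text does not specify them: the sample (n − 1) standard deviation is used, the standard deviation added to the slowest time is taken over the observed times only, and timed tests are sign-flipped so that higher composite scores always mean better performance. All function and variable names are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List, Optional


def impute_unperformed(times: List[Optional[float]]) -> List[float]:
    """Score tasks that could not be performed (None) as slowest time + SD.

    Assumption: the SD is computed from the observed times only.
    """
    observed = [t for t in times if t is not None]
    penalty = max(observed) + stdev(observed)
    return [penalty if t is None else t for t in times]


def z_scores(values: List[float]) -> List[float]:
    """(score - cohort mean) / cohort SD, as described in the text."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite_score(tests: Dict[str, List[float]],
                    higher_is_better: Dict[str, bool]) -> List[float]:
    """Sum z-scores across tests for each participant.

    Assumption: z-scores of tests where a lower raw value is better
    (e.g. times) are negated before summing.
    """
    n = len(next(iter(tests.values())))
    totals = [0.0] * n
    for name, values in tests.items():
        sign = 1.0 if higher_is_better[name] else -1.0
        for i, z in enumerate(z_scores(values)):
            totals[i] += sign * z
    return totals


# Toy data for three hypothetical participants.
tests = {"chair_stand": [10, 15, 20], "gait_time": [9.0, 7.0, 5.0]}
direction = {"chair_stand": True, "gait_time": False}
composite = composite_score(tests, direction)
```

Because each z-score is centered on the cohort mean, a composite near zero indicates average performance within this cohort rather than any absolute level of function.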
The Short Physical Performance Battery (SPPB) was not used since it is only valid in adults over 70 years of age (Guralnik et al., 1994). 2.3. Participants Participants were recruited from the local community using flyers, posters, and advertisements in newspapers. We also used an internal database that contained names of older adults interested in participating in research. The eligibility criteria for inclusion were: being between 60 and 90 years of age and living independently in the community. Exclusion criteria included neurological impairment that would affect balance, inability to speak or read English, severe cognitive impairment, needing assistance on BADL, severe musculoskeletal impairment, unstable chronic disease state, major depression, severe vestibular problems, severe orthostatic hypotension, and simultaneous use of cardiovascular, psychotropic and antidepressant drugs. The protocol was approved by the University's Institutional Review Board for the Protection of Human subjects and all participants signed an approved informed consent. A chart showing the flow of subjects through the study is presented in Fig. 1. A total of 61 participants were recruited for the study. 35 participants completed the online questionnaire on a laboratory computer before completing any performance tests, including the PFP-10, 10 m gait speed, 30-s chair stand and single leg balance. The online questionnaire took approximately 5–10 min to complete; while the PFP-10 took 45–60 min. To reduce items, and thereby minimize redundancy, the items were rank-ordered based on the correlation of each video item with the total IADL score minus the correlated item. Then the first five most highly correlated items were selected. The five selected items were sitting and standing up from the floor, transferring laundry from washer to dryer and back to a basket, carrying groceries, carrying a


weighted pan, and floor sweeping with a broom and dust pan. The time taken for the five-item video questionnaire was less than 5 min. Later, test-retest reliability of the online video questionnaire (5 items) was analyzed using a sample of 26 participants on two different occasions separated by duration of 2 weeks. The participants recruited for reliability were not involved in the validity testing and were recruited following the validity testing. Also, the inclusion and exclusion criteria for the participants for the reliability and validity component were identical. The reliability testing participants had a time interval of 2 weeks, which is considered to be appropriate for patient-reported outcomes (D. L. Streiner et al., 2015). The test instructions and test administration remained the same for both tests. Also, the tests were administered in the same laboratory to maintain a consistent environment across both tests. 2.4. Statistical analysis Descriptive characteristics of the participants are provided in Table 1. As can be seen from the table, 35 subjects completed the validation component of the study. According to the Consensus-based Standards for the Selection of Health Measurement Instruments (COSMIN) guidelines, for validation purposes a sample of 30 is considered a ‘fair’ quality (Terwee et al., 2012). Spearman's correlation coefficients with 95% confidence intervals were used to examine relationships between the video questionnaire, PFP-10, and the physical performance test. The internal consistency of the video questionnaire was assessed using Cronbach's alpha. Test-retest reliability was calculated by means of an Intraclass Correlation Coefficient (ICC) using a two-way random effects model for absolute agreement. To assess agreement between the test days, a Bland-Altman plot was used. All significance tests were two-tailed and an alpha level of 0.05 was established a priori for significance. 
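The correlation analysis described above can be sketched in plain Python. The study used SPSS and does not state how the confidence intervals were obtained; the Fisher z-transformation used here is an assumption, and the function names are hypothetical.

```python
import math
from typing import List, Tuple


def ranks(values: List[float]) -> List[float]:
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x: List[float], y: List[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def fisher_ci(rho: float, n: int, z_crit: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% CI via the Fisher z-transformation (assumed method)."""
    z = math.atanh(rho)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


lower, upper = fisher_ci(0.64, 35)
```

For rho = 0.64 and n = 35 this gives an interval of roughly (0.39, 0.80), close to the (0.4, 0.8) reported in the abstract, although that agreement does not confirm which method the authors used.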
All statistical analyses were performed using the SPSS, version 21 statistical package (IBM SPSS Statistics, Armonk, NY).

3. Results

The characteristics of the 35 participants who completed both performance tests and the questionnaire for validity, and the 26 participants who completed the video questionnaire for reliability, are shown in Table 1. The mean age of the participants for the validity component was 69 ± 6.8 years (mean ± SD), while the age for the reliability subgroup was 69 ± 3.4 years. Computer experience was assessed by the question “How would you rate yourself in using a computer?” The response options were beginner, average, and advanced. Of the 35 participants in the validity group, 73% of the participants rated themselves as ‘average’ (60.6%) or ‘beginners’ (12.1%) when using computers. Of the 26 participants in the reliability group, 86% rated themselves as ‘average’ (69.2%) or ‘beginners’ (15.4%) when using computers. There were no missing responses for either the validity or reliability components of the study since all questions had to be answered to successfully complete the video questionnaire.

Table 1
Descriptive characteristics of study participants.

Characteristics                Validity (n = 35)    Reliability (n = 26)
Age, y                         69 ± 7               69 ± 3
Sex                            7M/28F               11M/15F
Weight, kg                     76.0 ± 16.8          74.0 ± 17.5
Height, m                      1.66 ± 0.09          1.68 ± 0.09
BMI, kg/m2                     28.0 ± 5.8           25.0 ± 4.5
Computer experience
  Beginner, %                  12.1                 15.4
  Average, %                   60.6                 69.2
Physical performance test
  Single leg balance (s)       19.2 ± 11.8
  30 s chair stand (#)         15 ± 4
  10 m gait speed (s)          7.6 ± 1.2

Notes: Values are mean ± SD unless noted otherwise. # = number of stands. No significant differences were seen between groups (p > 0.05).

3.1. Validity

Table 2 shows Spearman's rank correlation coefficients among the scores of the online video questionnaire, PFP-10 and physical performance composite. The Spearman correlation for the video questionnaire (5 items) compared to the PFP-10 showed a moderate correlation of 0.64 (p < 0.001; 95% CI (0.4, 0.8)). As a measure of construct validity, the correlation of the video questionnaire with the physical performance test battery, which included single leg balance, gait speed and 30-s chair stand, was 0.49 (p < 0.01; 95% CI (0.18, 0.7)). The Spearman correlation comparing the PFP-10 to the physical performance test showed a strong correlation of 0.86 (p < 0.001; 95% CI (0.74, 0.92)).

Fig. 3. Bland-Altman plot of the relationship between Test 1 and Test 2. The solid horizontal line represents the mean difference of the tests whereas the dotted line represents the 95% limits of agreement.
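The quantities plotted in Fig. 3 (the mean difference and the 95% limits of agreement) can be computed as in the sketch below. It assumes the common convention that the limits are the mean difference ± 1.96 × SD of the differences and that the coefficient of repeatability is 1.96 × SD of the differences; the paper does not state which convention was used, and the function name is hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List


def bland_altman(test1: List[float], test2: List[float],
                 z_crit: float = 1.96) -> Dict[str, float]:
    """Mean difference (bias), 95% limits of agreement, and repeatability.

    Assumed conventions: limits = bias +/- 1.96 * SD of the paired
    differences; coefficient of repeatability = 1.96 * SD.
    """
    diffs = [a - b for a, b in zip(test1, test2)]
    bias = mean(diffs)
    spread = z_crit * stdev(diffs)
    return {"bias": bias, "lower": bias - spread,
            "upper": bias + spread, "cr": spread}


# Toy total scores for three hypothetical participants on two test days.
result = bland_altman([100, 200, 300], [90, 210, 300])
```

The values reported in Section 3.2 are consistent with this convention: a mean difference of 0.6 and a coefficient of repeatability of 73 give limits of 0.6 + 73 = 73.6 and 0.6 − 73 = −72.4.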

3.2. Reliability

The test-retest reliability of the video questionnaire in a sub-group of 26 participants measured on two occasions, 2 weeks apart, showed an intra-class correlation (ICC) of 0.87 (p < 0.001; 95% CI (0.70, 0.94)). To assess agreement, a Bland-Altman plot illustrating the relation between test day 1 and test day 2 is shown in Fig. 3. The mean difference was 0.6 and the limits of agreement were 73.6 and −72.4. The Coefficient of Repeatability (CR) was calculated to be 73. Across the two sessions, the video questionnaire had a Cronbach's alpha of 0.89, showing good internal consistency. Once again, there were no missing items for either questionnaire since all questions had to be answered to successfully complete the questionnaire. The number of subjects who achieved the lowest score was 5% (2 out of 35 subjects) as was the number who achieved the highest possible score (2 out of 35) for each item. The floor and ceiling effect for the total scale was 0.

4. Discussion

The purpose of this study was to develop a novel, video-based questionnaire to assess physical function in older adults living independently in the community. The video questionnaire showed a moderate correlation of 0.64 with the PFP-10 (IADLs). Also, as hypothesized, the questionnaire showed a lower correlation of 0.49 with the physical performance test. Although studies using video questionnaires to assess physical function are few, the current results are consistent with other video-based assessment studies. The MAT-sf (Mobility Assessment Tool-short form) was developed to assess mobility in the elderly. The bivariate correlations of the MAT-sf with the SPPB and 400-m walk test were 0.59 (p < 0.001) and 0.58 (p < 0.001), respectively (Rejeski et al., 2010). The recently developed virtual SPPB, which assesses balance, walking speed and chair stands, showed a moderate correlation of 0.60 with actual SPPB scores (Marsh et al., 2015a).
Similar to the MAT-sf, the MAT-W is another self-report video questionnaire to assess walking activity (Marsh et al., 2015a,b). MAT-W showed a less than moderate correlation with usual (r = 0.36, p < 0.001) and fast walking speeds (r = 0.45, p < 0.001). Also, moderate to high correlations of 0.66 and 0.65 were obtained with a modified version of a self-report physical activity questionnaire (CHAMPS) and with an objective, accelerometer-based measure of physical activity, respectively. The Animated Activity Questionnaire (AAQ) for assessing activity limitations in patients with hip/knee osteoarthritis (OA) showed a correlation of 0.62 with performance based tests (Peter et al., 2015b). As expected, the current video questionnaire had a lower correlation when compared to the composite score of physical performance tests, which included usual gait speed, balance and chair stands. Compared to tasks such as carrying groceries up the stairs or getting up from the floor, these performance tasks are less complex and demanding, and thus were hypothesized to have a lower correlation with the video questionnaire. This discordance seen between self-report and performance measures is partly due to the fact that self-report techniques measure a person's ‘perceived’ ability to do the task, while performance measures reflect the ‘actual’ ability to do the task (Glass, 1998). Other reasons could be difficulty in comprehending the question, a lack of task familiarity, and psychological factors inherent in responding, such as social desirability bias and self-efficacy (Cress et al., 1995; Reuben et al., 2004; Stretton et al., 2006). However, it is important to recognize that subjective perceptions can affect an individual's choice of daily activities despite his or her ability to perform those tasks (McAuley, 1992; Rejeski et al., 1996). The items in the questionnaire were selected from a pre-existing physical functional performance measure that has been shown to be valid, reliable and sensitive to change in community-living older adults, namely the PFP-10 (Cress et al., 2005).
Table 2
Spearman correlations and 95% confidence intervals among the video questionnaire, PFP-10 and performance test.

                             Video questionnaire   PFP-10                        Physical performance test
Video questionnaire          1.00                  0.64 (0.4, 0.8), p < 0.001⁎   0.49 (0.18, 0.7), p < 0.01⁎
PFP-10                                             1.00                          0.86 (0.74, 0.92), p < 0.001⁎
Physical performance test                                                        1.00

PFP-10: 10-item IADLs. Physical performance test: 30 s chair stand, gait speed and single leg balance (n = 35). ⁎ p < 0.05.

Item reduction yielded 5 items which covered complex tasks that included more than one physical subdomain. All the tasks included were IADLs; hence, the questionnaire is a measure of the ‘ability’ to perform instrumental activities of daily living in community-living individuals. MAT-sf was based on a subset of 79 mobility items using item reduction technique (IRT) (Rejeski et al., 2010), MAT-w was built on MAT-sf (Marsh et al., 2015a,b), and VSPPB items were taken from the SPPB (Marsh et al., 2015a). On the other hand, in the AAQ, items were selected using the Delphi method (Peter et al., 2015a,b). Generally, internal consistency between 0.70 and 0.95 is considered good (Terwee et al., 2007). In the present study, internal consistency measured by Cronbach's alpha was 0.89. This shows that the items in the scale are homogeneous or correlated and thus are measuring the same underlying concept; however, a value of Cronbach's alpha above 0.95 is indicative of item redundancy in one or more of the items. The test-retest reliability assessed in the 26 subjects measured by ICC showed a high reliability of 0.86. Typically, 0.70 is recommended as a minimum standard for reliability and above 0.8 is considered high (Nunnally and Bernstein, 1994). Other studies using video technology showed a high reliability (> 0.8), indicating that video technology can be reliably used to assess function in this population (Marsh et al., 2015a; Peter et al., 2015b; Rejeski et al., 2010). MAT-sf, MAT-w, and VSPPB used computer animations to create the figures, while motion capture was used to transform movements of a person into animations for the Animated Activity Questionnaire (AAQ). In contrast, the current study used videos of actual people performing the task. Currently, it is unclear if any one of these methods has an advantage over the other in terms of clarity and ease of interpretation. The number of subjects who achieved the lowest score was 5% (2 out of 35 subjects) as was the number who achieved the highest possible score (2 out of 35) for each item. The floor and ceiling effect for the total scale was 0. Typically, a ceiling or floor effect is considered to be present if > 15% of the respondents achieved the lowest or the highest possible score, respectively (Terwee et al., 2007).
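The two scale-level checks discussed above, Cronbach's alpha and the 15% floor/ceiling criterion, can be sketched as follows. The alpha formula is the standard one based on item and total-score variances; the data layout and function names are hypothetical, and the score range in the usage example assumes a total scale of 0 to 500 (five items scored 0–100).

```python
from statistics import variance
from typing import List, Tuple


def cronbach_alpha(item_scores: List[List[float]]) -> float:
    """Cronbach's alpha; item_scores[i][j] is respondent j's score on item i."""
    k = len(item_scores)
    totals = [sum(column) for column in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def floor_ceiling(totals: List[float], min_score: float,
                  max_score: float) -> Tuple[float, float]:
    """Proportion of respondents at the lowest and highest possible score."""
    n = len(totals)
    floor = sum(t == min_score for t in totals) / n
    ceiling = sum(t == max_score for t in totals) / n
    return floor, ceiling


def effect_present(proportion: float, threshold: float = 0.15) -> bool:
    """The criterion cited in the text: an effect is present above 15%."""
    return proportion > threshold


# Toy data: two items answered by four hypothetical respondents.
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])
floor, ceiling = floor_ceiling([0, 250, 500, 300], min_score=0, max_score=500)
```

Under this criterion, a total-scale proportion of 0 at both extremes, as reported in this study, indicates no floor or ceiling effect.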
Considering that the population was community-living older adults, the video questionnaire shows minimal ceiling and floor effects. The virtual SPPB showed a ceiling effect of > 25% for balance, gait speed and chair stand and 12% for the total score (Marsh et al., 2015a). The MAT-sf showed a ceiling effect of 1.5% (Rejeski et al., 2010). It should be noted that, in contrast to the current study, both studies selected subjects with low to moderate function. There was no mention of ceiling or floor effects for the other studies. Compared to other scales, one of the unique aspects of the video questionnaire is that it is designed to assess IADLs in community-living individuals. In contrast, the MAT-sf, VSPPB and AAQ were validated in populations with compromised physical function. The majority of the self-report measures are primarily designed to measure disability, or inability or difficulty in performing a task. The video questionnaire was specifically designed to measure the complete spectrum of function, both inability and ability, along a continuum using activities of daily living. This was accomplished by different strategies: (1) wording the questions using a capability stem (can do) rather than a performance stem (does do); (2) assessing the degree of difficulty by examining the individual's perception of the time he or she would require to complete a task relative to the performer in the video; and, (3) using response options scaling from 0–100 where 0 meant ‘cannot do the task’ and 100 meant ‘can do the task faster than the person in the video’. Although self-report and actual performance comparisons are lacking, it has been recently shown that the latent trait “able to” rather than “performance of the activity”: (1) better represents the physical function outcome; (2) has far fewer false positives; (3) is clearer and preferred by subjects; and, (4) is more easily translatable (Fries et al., 2006).
Considering that IADLs are closer to the concept of disability, the video questionnaire can capture both extremes of function, namely disability and performance, along a single, continuous scale. There are a number of limitations in this study. First, it has been previously documented that self-report measures are limited in assessing function in community-living individuals due to the ceiling effects inherent in these tests (Brach et al., 2002; Cress et al., 1996). Nevertheless, including a traditional self-report measure in the current study would have enabled a direct comparison with the video questionnaire. Second, responsiveness of the scale, or the ability to detect clinically meaningful change over time, was not evaluated. Consequently, the clinical interpretation of the change in scores over time or in different sub-groups of people is unclear. We are currently conducting a training study to assess the change score as a result of an exercise intervention. Lastly, the sample in the current study was predominantly female and was not sufficiently heterogeneous. Although a recent development, the concept of using video technology to assess physical function has considerable potential. By having a clear frame of reference, the potential ambiguity in interpreting specific task-oriented questions can be minimized. The problem of interpreting difficulty in self-reported questionnaires is also reduced. Further, the use of visuals can eliminate wordy and ambiguous questions, require less reading comprehension, and can be translated to other languages with limited loss of meaning and context. Nevertheless, as previously mentioned, psychological factors such as motivation, self-efficacy, and social desirability bias will have their influence on the respondent's judgement.

5. Conclusions

The validity and reliability evidence, along with minimal ceiling effects, shows that this video questionnaire is a valid and reliable tool for evaluating physical function in older adults living independently; however, it is important to note that no individual study can conclusively ‘establish’ the validity and reliability of an instrument (Streiner and Kottner, 2014). The psychometric properties of an instrument are very much dependent on the population and the contextual factors, and hence are often susceptible to change depending on the population and context. Developing valid and practical instruments to assess physical function in the clinic, or to evaluate the effectiveness of interventions designed to improve function, is a critically important topic in the field of aging.
The use of video questionnaires to assess physical function is a relatively unexplored area, and we hope the current study will help draw greater attention to the use of this novel technology to assess physical function.

References

Brach, J.S., VanSwearingen, J.M., Newman, A.B., Kriska, A.M., 2002. Identifying early decline of physical function in community-dwelling older women: performance-based and self-report measures. Phys. Ther. 82 (4), 320–328.
Cress, M.E., Schechtman, K.B., Mulrow, C.D., Fiatarone, M.A., Gerety, M.B., Buchner, D.M., 1995. Relationship between physical performance and self-perceived physical function. J. Am. Geriatr. Soc. 43 (2), 93–101.
Cress, M.E., Buchner, D.M., Questad, K.A., Esselman, P.C., deLateur, B.J., Schwartz, R.S., 1996. Continuous-scale physical functional performance in healthy older adults: a validation study. Arch. Phys. Med. Rehabil. 77 (12), 1243–1250 (doi:S00039993(96)90187-2 [pii]).
Cress, M.E., Buchner, D.M., Questad, K.A., Esselman, P.C., deLateur, B.J., Schwartz, R.S., 1999. Exercise: effects on physical functional performance in independent older adults. J. Gerontol. A Biol. Sci. Med. Sci. 54 (5), M242–M248.
Cress, M.E., Petrella, J.K., Moore, T.L., Schenkman, M.L., 2005. Continuous-scale physical functional performance test: validity, reliability, and sensitivity of data for the short version. Phys. Ther. 85 (4), 323–335.
Fried, L.P., Bandeen-Roche, K., Chaves, P.H., Johnson, B.A., 2000. Preclinical mobility disability predicts incident mobility disability in older women. J. Gerontol. A Biol. Sci. Med. Sci. 55 (1), M43–M52.
Fries, J.F., Bruce, B., Bjorner, J., Rose, M., 2006. More relevant, precise, and efficient items for assessment of physical function and disability: moving beyond the classic instruments. Ann. Rheum. Dis. 65 (Suppl. 3), 16–21 (iii. doi:65/suppl_3/iii16 [pii]).
Fritz, J.M., Piva, S.R., 2003. Physical impairment index: reliability, validity, and responsiveness in patients with acute low back pain. Spine 28 (11), 1189–1194. http://dx.doi.org/10.1097/01.BRS.0000067270.50897.DB.
Frontera, W.R., Reid, K.F., Phillips, E.M., Krivickas, L.S., Hughes, V.A., Roubenoff, R., Fielding, R.A., 2008. Muscle fiber size and function in elderly humans: a longitudinal study. J. Appl. Physiol. (Bethesda, Md.: 1985) 105 (2), 637–642. http://dx.doi.org/10.1152/japplphysiol.90332.2008.
Gallagher, D., Ruts, E., Visser, M., Heshka, S., Baumgartner, R.N., Wang, J.H., B., S., 2000. Weight stability masks sarcopenia in elderly men and women. Am. J. Physiol. Endocrinol. Metab. 279 (2), E366–E375.
Gill, T.M., Baker, D.I., Gottschalk, M., Peduzzi, P.N., Allore, H., Byers, A., 2002. A program to prevent functional decline in physically frail, elderly persons who live at home. N. Engl. J. Med. 347 (14), 1068–1074. http://dx.doi.org/10.1056/NEJMoa020423.


Glass, T.A., 1998. Conjugating the “tenses” of function: discordance among hypothetical, experimental, and enacted function in older adults. The Gerontologist 38 (1), 101–112.
Guralnik, J.M., Simonsick, E.M., 1993. Physical disability in older Americans. J. Gerontol. 48 (Spec No, 3-10).
Guralnik, J.M., Branch, L.G., Cummings, S.R., Curb, J.D., 1989. Physical performance measures in aging research. J. Gerontol. 44 (5), M141–M146.
Guralnik, J.M., Simonsick, E.M., Ferrucci, L., Glynn, R.J., Berkman, L.F., Blazer, D.G., ... Wallace, R.B., 1994. A short physical performance battery assessing lower extremity function: association with self-reported disability and prediction of mortality and nursing home admission. J. Gerontol. 49 (2), M85–M94.
Haywood, K.L., Garratt, A.M., Fitzpatrick, R., 2006. Quality of life in older people: a structured review of self-assessed health instruments. Expert Rev. Pharmacoecon. Outcomes Res. 6 (2), 181–194. http://dx.doi.org/10.1586/14737167.6.2.181.
Hoeymans, N., Feskens, E.J., van den Bos, G.A., Kromhout, D., 1996. Measuring functional status: cross-sectional and longitudinal associations between performance and self-report (Zutphen elderly study 1990–1993). J. Clin. Epidemiol. 49 (10), 1103–1110 (doi:0895-4356(96)00210-7 [pii]).
Janssen, I., Baumgartner, R.N., Ross, R., Rosenberg, I.H., Roubenoff, R., 2004. Skeletal muscle cutpoints associated with elevated physical disability risk in older men and women. Am. J. Epidemiol. 159 (4), 413–421.
Katz, S., Branch, L.G., Branson, M.H., Papsidero, J.A., Beck, J.C., Greer, D.S., 1983. Active life expectancy. N. Engl. J. Med. 309 (20), 1218–1224. http://dx.doi.org/10.1056/NEJM198311173092005.
Kivinen, P., Sulkava, R., Halonen, P., Nissinen, A., 1998. Self-reported and performance-based functional status and associated factors among elderly men: the Finnish cohorts of the seven countries study. J. Clin. Epidemiol. 51 (12), 1243–1252 (doi:S0895435698001152 [pii]).
Lawton, M.P., Brody, E.M., 1969. Assessment of older people: self-maintaining and instrumental activities of daily living. The Gerontologist 9 (3), 179–186.
Lord, S.R., Ward, J.A., Williams, P., Anstey, K.J., 1994. Physiological factors associated with falls in older community-dwelling women. J. Am. Geriatr. Soc. 42 (10), 1110–1117.
Marsh, A.P., Wrights, A.P., Haakonssen, E.H., Dobrosielski, M.A., Chmelo, E.A., Barnard, R.T., ... Rejeski, W.J., 2015a. The virtual short physical performance battery. J. Gerontol. A Biol. Sci. Med. Sci. 70 (10), 1233–1241. http://dx.doi.org/10.1093/gerona/glv029.
Marsh, A.P., Janssen, J.A., Ip, E.H., Barnard, R.T., Ambrosius, W.T., Brubaker, P.R., ... Rejeski, W.J., 2015b. Assessing walking activity in older adults: development and validation of a novel computer-animated assessment tool. J. Gerontol. A Biol. Sci. Med. Sci. 70 (12), 1555–1561. http://dx.doi.org/10.1093/gerona/glv101.
McAuley, E., 1992. The role of efficacy cognitions in the prediction of exercise behavior in middle-aged adults. J. Behav. Med. 15 (1), 65–88.
Nunnally, J.C., Bernstein, I.H., 1994. Psychometric Theory. third ed. McGraw-Hill, New York.

Peter, W.F., Loos, M., de Vet, H.C., Boers, M., Harlaar, J., Roorda, L.D., Terwee, C.B., 2015a. Development and preliminary testing of a computerized animated activity questionnaire in patients with hip and knee osteoarthritis. Arthritis Care Res. 67 (1), 32–39. http://dx.doi.org/10.1002/acr.22386.
Peter, W.F., Loos, M., van den Hoek, J., Terwee, C.B., 2015b. Validation of the animated activity questionnaire (AAQ) for patients with hip and knee osteoarthritis: comparison to home-recorded videos. Rheumatol. Int. 35 (8), 1399–1408. http://dx.doi.org/10.1007/s00296-015-3230-4.
Rejeski, W.J., Craven, T., Ettinger Jr., W.H., McFarlane, M., Shumaker, S., 1996. Self-efficacy and pain in disability with osteoarthritis of the knee. J. Gerontol. B Psychol. Sci. Soc. Sci. 51 (1), P24–P29.
Rejeski, W.J., Ip, E.H., Marsh, A.P., Barnard, R.T., 2010. Development and validation of a video-animated tool for assessing mobility. J. Gerontol. A Biol. Sci. Med. Sci. 65 (6), 664–671. http://dx.doi.org/10.1093/gerona/glq055.
Reuben, D.B., Seeman, T.E., Keeler, E., Hayes, R.P., Bowman, L., Sewall, A., Guralnik, J.M., 2004. Refining the categorization of physical functional status: the added value of combining self-reported and performance-based measures. J. Gerontol. A Biol. Sci. Med. Sci. 59 (10), 1056–1061 (doi:59/10/M1056 [pii]).
Streiner, D.L., Kottner, J., 2014. Recommendations for reporting the results of studies of instrument and scale development and testing. J. Adv. Nurs. 70 (9), 1970–1979. http://dx.doi.org/10.1111/jan.12402.
Streiner, D.L., Norman, G., Cairney, J., 2015. Health Measurement Scales: A Practical Guide to Their Development and Use, fifth ed. Oxford University Press.
Stretton, C.M., Latham, N.K., Carter, K.N., Lee, A.C., Anderson, C.S., 2006. Determinants of physical health in frail older people: the importance of self-efficacy. Clin. Rehabil. 20 (4), 357–366.
Terwee, C.B., Bot, S.D., de Boer, M.R., van der Windt, D.A., Knol, D.L., Dekker, J., ... de Vet, H.C., 2007. Quality criteria were proposed for measurement properties of health status questionnaires. J. Clin. Epidemiol. 60 (1), 34–42 (doi:S0895-4356(06)00174-0 [pii]).
Terwee, C.B., Mokkink, L.B., Knol, D.L., Ostelo, R.W., Bouter, L.M., de Vet, H.C., 2012. Rating the methodological quality in systematic reviews of studies on measurement properties: a scoring system for the COSMIN checklist. Qual. Life Res. Int. J. Qual. Life Asp. Treat. Care Rehab. 21 (4), 651–657. http://dx.doi.org/10.1007/s11136-011-9960-1.
United States Department of Health and Human Services (HHS), Healthy People 2020. Washington, DC. Retrieved from http://www.healthypeople.gov/2020/topicsobjectives2020/overview.aspx?topicid=31.
Vincent, G.K., Velkoff, V.A., 2010. The Next Four Decades, the Older Population in the United States: 2010 to 2050. U.S. Census Bureau, Washington, DC.
Werner, C., 2010. The older population. Retrieved from http://www.census.gov/prod/cen2010/briefs/c2010br-09.pdf.