Linking primary study data with administrative and claims data in a German cohort study on work, age, health and work participation: is there a consent bias?


Public Health 150 (2017) 9–16

Available online at www.sciencedirect.com

Public Health journal homepage: www.elsevier.com/puhe

Original Research

C. Stallmann*, E. Swart a, B.-P. Robra b, S. March c

Institute of Social Medicine and Health Economics, Faculty of Medicine, Otto-von-Guericke-University Magdeburg, Magdeburg, Germany

Article info

Article history:
Received 16 November 2016
Received in revised form 16 February 2017
Accepted 1 May 2017

Keywords: Claims data; Consent bias; Informed consent; Record linkage; Primary data

Abstract

Objectives: We analysed the degree and impact of consent bias in the prospective study ‘leben in der Arbeit (lidA)’ after linking primary interview data with claims data from German statutory health insurance funds as well as with administrative data provided by the German Federal Employment Agency.

Study design: Prospective cohort study.

Methods: Within two study waves (2011, 2014), primary data were collected based on computer-assisted personal interviews. Informed consent to data linkage was obtained during the interview. We used binary logistic regression analyses with participants' consent for record linkage as the dependent variable, calculating odds ratios (ORs) and 95% confidence intervals (95% CIs) for independent variables. Several sociodemographic, socio-economic and work-related factors were modelled as potential determinants of consent.

Results: A total of 4244 participants took part in both waves. After excluding invalid consent, 4178 participants were included in the analysis. A total of 3918 (93.8%) of these participants gave their consent to link their primary data with data from at least one source. Within regression analyses, only moderate bias was found due to region of residence, apprenticeship, professional affiliations, income and number of diseases. Participants from former West Germany were less likely to have their study data linked with both data sources (OR 0.63 [95% CI 0.42–0.96]) than those from the former East Germany. Participants with no information on income were more likely to refuse consent to both data sources compared to the reference group (net income: under EUR 1000; OR 0.15 [95% CI 0.08–0.30]). Respondents with two (OR 1.37 [95% CI 1.06–1.77]) or three and more diseases (OR 1.30 [95% CI 1.02–1.66]) diagnosed by a doctor agreed more frequently to linking both data sources than participants without disease. Only a small proportion of the variance in consenting is explained by the models (R²: 0.063–0.085). Also, only small changes in the prevalence of the factors were observed among consenters.

Conclusions: For the first time in Germany, the lidA-study links primary survey data with health claims and administrative employment data. We conclude that there is only a minor relation between the analysed factors and the consent behaviour of the participants. A linked data set may be used in further analyses without substantial biases.

© 2017 The Royal Society for Public Health. Published by Elsevier Ltd. All rights reserved.

* Corresponding author. Institute of Social Medicine and Health Economics, Faculty of Medicine, Otto-von-Guericke-University, Leipziger Strasse 44, D-39120, Magdeburg, Germany. Tel.: +49 391 67 24 321.
E-mail addresses: [email protected] (C. Stallmann), [email protected] (E. Swart), bernt-peter.robra@med.ovgu.de (B.-P. Robra), [email protected] (S. March).
a Tel.: +49 391 67 24 306.
b Tel.: +49 391 67 24 300.
c Tel.: +49 391 67 24 323.
http://dx.doi.org/10.1016/j.puhe.2017.05.001
0033-3506

Introduction

The option of linking primary data with information from existing data sources is increasingly used in epidemiological studies in Germany. For example, within the research framework of KORA (Cooperative Health Research in the Region Augsburg), two studies linked administrative data of statutory health insurance funds (SHI) with primary survey data. The first study evaluated the healthcare costs of allergic diseases, the second the economic efficiency of acute myocardial infarction therapy.1,2 The Study of Health in Pomerania (SHIP) linked primary survey data with data from clinical patient records, claims data and reports of physicians as well as death certificates in order to compare different stroke outcomes.3,4 The aim of linking such data is to utilise the synergy of primary and administrative data (in Germany mostly administrative employment data and claims data from social insurance carriers). The use of these data, however, implies high data protection requirements.5 The mandatory provision of information needed to obtain the participants' written informed consent presents a logistic challenge. If several data sources are to be linked, information must be provided and the participants' written consent obtained for each data source separately.

The German cohort study on work, age, health and work participation ‘leben in der Arbeit’ (lidA) links, for the first time in Germany, study data with (a) statutory health insurance claims data (SHI data) and (b) employment data provided by the Institute for Employment Research (IAB) of the German Federal Employment Agency (IAB data).6 While the employment data contain information on employment histories, unemployment benefits and participation in labour market programmes,6,7 the SHI data contain detailed information on inpatient and outpatient care, incapacity to work, pharmaceutical prescriptions as well as prescriptions of non-pharmaceutical therapies and technical aids.8 There are currently 113 SHIs in Germany.9 In addition to the written consent of the study participants, a cooperation agreement must be concluded with each insurance fund separately, and the supervisory authority of the SHIs must approve access to the SHI data.8,10 The lidA-study closes the gap with other countries that have used data linkage for several years.11

Primary data concerning aspects of work, health and work participation were collected in two study waves (2011 and 2014) based on computer-assisted personal interviews (CAPI). Because the SHI and employment data contain detailed information about the participants' health and their employment history, those aspects were kept short within the interview. Nevertheless, both administrative data sources have a supplementary nature within lidA. All core questions can be answered using the primary data. From the addition of the administrative data we expect to gain more detailed information on diseases causing incapacity for work and their burden within certain occupations.

As part of the CAPI, three written informed consents were obtained from participants in a multistep process: consent to participate in a panel, consent to linkage of the IAB data with the survey data, and consent to linkage of the SHI data with the survey data. First, the need for consent was explained during the CAPI. Written consent was then requested after the interview in the order panel participation, IAB data, SHI data.6,12,13 This process may be prone to consent bias. Previous studies that also linked primary data with data from other sources, mostly claims data, examined several factors that potentially influence consenting behaviour, among them age and gender,2,14–23 region of residence,14,16–22 migrant background/ethnicity,15–18,20–23 level of education/qualifications,16–22 family status,16–20,22 income,14–18,21,23 (subjective) health status,17,18,22,23 self-reported health problems17,18,23 and the utilisation of medical services.2,16–18,22 The direction and size of the reported effects differ between studies.

Within the framework of this research, the paper aims to identify potential determinants of the willingness to give consent to data linkage within the lidA-study. The analysis also provides estimates of consent bias. Initial representativity and selectivity analyses found only a minimal bias within the primary data,13,24 so we expect that only minimal bias within the SHI and IAB data can be traced back to consent behaviour.

Methods

The lidA-study focuses on employees born in 1959 and 1965 who were paying mandatory social security contributions as of December 31, 2009. The sample was taken from the IAB's ‘Integrated Employment Biographies’.6 During the first wave, 6585 persons participated in lidA (response rate 27.3%). Of these, 74.7% (n = 4921) gave written consent for IAB data. Of 6265 participants who were insured by a SHI, 55.2% (n = 3640) gave their written consent for SHI data. While 5618 participants gave their consent to participate in a panel, a total of 4244 participants took part in both study waves.

In the second wave, 1720 people who had not given their consent for the SHI data in wave 1, or who had given their consent in wave 1 and reported a change of SHI in wave 2, were asked again to give their consent. A total of 38.1% (n = 656) gave their written informed consent again or for the first time. Furthermore, 651 previously non-consenting participants were asked for consent for IAB data in wave 2; 54.8% (n = 357) gave their consent. For SHI data, consent therefore had to be renewed whenever a participant who had consented in wave 1 reported a change of health insurer in wave 2.13,24

Because lidA requested consent in several steps, analysing groups with different degrees of consent (no consent, consent to one of two possible data sources, consent to both data sources) is more meaningful than classifying the groups by data source (no consent, SHI data, IAB data). This also avoids an overlap of participants between the groups. After invalid consent (n = 66) had been excluded, e.g., because of a missing signature or an incorrect health insurance number, 4178 participants were left for analysis. Of these, 72.3% (n = 3021) gave their consent to have study data linked with both data sources (group C1). A further 20.8% (n = 867; group C2) consented to linkage with IAB data only, and 6.2% (n = 260; group C3) refused consent entirely. The remaining 0.7% (n = 30) agreed to linkage with SHI data only (Fig. 1). This group was excluded from further analysis due to the small number of cases.
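The grouping described above can be illustrated with a minimal Python sketch. It is not the authors' procedure (the analysis was run in SPSS); the function name `consent_group` and the rebuilt `records` list are hypothetical and only reproduce the group sizes reported in the text.

```python
from collections import Counter


def consent_group(shi: bool, iab: bool) -> str:
    """Map the two linkage consents to the groups used in the analysis."""
    if shi and iab:
        return "C1"        # consent to both data sources
    if iab:
        return "C2"        # consent to IAB data only
    if shi:
        return "SHI only"  # small group, excluded from the regressions
    return "C3"            # no consent to either data source


# Hypothetical records (shi, iab) rebuilt from the reported group sizes.
records = (
    [(True, True)] * 3021
    + [(False, True)] * 867
    + [(False, False)] * 260
    + [(True, False)] * 30
)

counts = Counter(consent_group(shi, iab) for shi, iab in records)
total = sum(counts.values())  # valid consents available for analysis
shares = {group: round(100 * n / total, 1) for group, n in counts.items()}
```

With these inputs, `total` equals the 4178 participants with valid consent, and `shares` reproduces the percentages given for the four groups.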


The presence of a selection bias was examined by comparing the prevalence of potential determinants of consent in the three groups C1 to C3 (Table 1).25 The groups were then compared pairwise (C1 vs C3, C1 vs C2 and C2 vs C3) using multivariate binary logistic regressions to identify factors that may influence the participants' decision to give their consent (models 1–3, Table 2). The influence of these factors was evaluated based on odds ratios (ORs) with 95% confidence intervals (CIs), calculated with SPSS 22. In view of the focus of the study on work, age and health, we included CAPI information on employment (Blossfeld classification of occupations)26 and health (subjective state of health based on the 12-Item Short Form Health Survey Version 2 [SF-12v2] of the ‘socio-economic panel’,27,28 and multimorbidity [aggregated number of self-reported health problems]) in the regression models. These factors supplemented the participants' sociodemographic data (year of birth, gender, marital status, region of residence, level of education, apprenticeship, net income and migrant background). Comparable factors in these categories can be found in the referenced literature.2,14–23 Some factors can differ between the two waves, e.g., if a participant changed jobs or received a salary increase. We decided to include the information from wave 1. For participants who renewed their consent in wave 2 because they had changed their SHI, or who gave their consent for the first time at that point, information from wave 2 was included in the analysis. Missing values were coded as missing; no data were imputed.
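As a reading aid for the odds ratios reported below, the following Python sketch shows how an OR and a Wald-type 95% CI follow from a logistic regression coefficient and its standard error. It assumes Wald intervals of the form exp(B ± z·SE); the paper does not state which interval type was used, and the helper names `odds_ratio_ci` and `se_from_ci` are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_ci(beta: float, se: float, z: float = Z_95):
    """Return (OR, lower, upper) for a logit coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower: float, upper: float, z: float = Z_95) -> float:
    """Recover the approximate standard error of log(OR) from a reported CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Reported for region of residence in model 1: OR 0.63 (95% CI 0.42-0.96).
beta = math.log(0.63)        # coefficient on the log-odds scale
se = se_from_ci(0.42, 0.96)  # approximate, because the inputs are rounded
or_, lo, hi = odds_ratio_ci(beta, se)
```

Because the published values are rounded to two decimals, the reconstructed limits match the reported interval only approximately.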

Fig. 1 – Decision tree identifying different groups of consent for a record linkage, lidA-study wave 1 and 2.


Table 1 – Prevalence of potential factors influencing consent behaviour, overall and by consent group.

| Factor | Overall (n = 4148)†, % | Total consent (C1, n = 3021)†, % | Consent IAB (C2, n = 867)†, % | Non-consent (C3, n = 260)†, % |
|---|---|---|---|---|
| Sociodemographic factors | | | | |
| Year of birth | | | | |
| 1959 | 45.1 | 45.6 | 43.0 | 45.4 |
| 1965 | 54.9 | 54.4 | 57.0 | 54.6 |
| Gender | | | | |
| Male | 45.3 | 45.4 | 45.9 | 41.9 |
| Female | 54.7 | 54.6 | 54.1 | 58.1 |
| Marital status | | | | |
| Married | 72.6 | 72.0 | 73.8 | 75.0 |
| Other status | 27.4 | 28.0 | 26.2 | 25.0 |
| Region of residence | | | | |
| Old federal states | 81.0 | 78.6 | 87.4 | 87.3 |
| New federal states including Berlin | 19.0 | 21.4 | 12.6 | 12.7 |
| Level of education | | | | |
| Low level of education | 24.1 | 24.9 | 20.6 | 26.6 |
| Medium level of education | 42.6 | 44.2 | 38.8 | 37.5 |
| High level of education | 33.3 | 31.0 | 40.6 | 35.9 |
| Apprenticeship | | | | |
| Apprenticeship training or school-based vocational education | 57.0 | 59.0 | 51.8 | 51.9 |
| Master craftsman, technical school or specialised/advanced technical school | 16.6 | 16.0 | 19.3 | 15.0 |
| Technical university/university entrance qualification | 21.3 | 19.6 | 26.4 | 24.6 |
| No/other apprenticeship | 5.0 | 5.4 | 2.5 | 8.5 |
| Income | | | | |
| Under €1000 | 23.5 | 23.9 | 21.5 | 24.9 |
| €1000 to less than €2000 | 39.9 | 42.7 | 32.1 | 32.3 |
| €2000 to less than €3000 | 22.4 | 22.5 | 21.4 | 24.1 |
| €3000 and more | 11.7 | § | § | § |
| Refused | 2.6 | § | § | § |
| Blossfeld classification | | | | |
| Skilled commercial and administrative occupations | 21.3 | 19.9 | 24.9 | 26.1 |
| Unskilled manual occupations | 6.5 | § | § | § |
| Skilled manual occupations | 10.2 | § | § | § |
| Technicians | 6.5 | § | § | § |
| Engineers | 4.6 | § | § | § |
| Unskilled services | 11.3 | 12.3 | 7.5 | 12.2 |
| Skilled services | 6.6 | 6.1 | 7.2 | 9.8 |
| Semiprofessions | 16.8 | 16.7 | 17.0 | 17.6 |
| Professions | 2.3 | § | § | § |
| Unskilled commercial administrative occupations | 6.6 | § | § | § |
| Agricultural occupations | 1.3 | § | § | § |
| Managers | 6.1 | § | § | § |
| Migrant background | | | | |
| Born in Germany | 96.8 | § | § | § |
| Not born in Germany | 3.2 | § | § | § |
| Migrant background of parents | | | | |
| No immigration | 84.9 | 84.2 | 87.8 | 82.6 |
| One parent immigrated | 6.3 | § | § | § |
| Both parents immigrated | 8.8 | § | § | § |
| Health-related factors | | | | |
| Subjective state of health | | | | |
| Very good–good | 54.0 | 53.1 | 56.6 | 56.5 |
| Satisfactory | 32.2 | 33.2 | 29.5 | 29.8 |
| Less good–poor | 13.8 | 13.7 | 14.0 | 13.7 |
| Multimorbidity | | | | |
| No disease | 17.3 | 16.6 | 20.1 | 16.1 |
| One disease | 23.7 | 23.2 | 25.4 | 23.1 |
| Two diseases | 22.1 | 22.7 | 20.4 | 20.4 |
| Three and more diseases | 37.0 | 37.5 | 34.1 | 40.4 |

† Minor deviations of absolute number of cases in some categories due to missing values.
§ Cannot be shown for reasons of data protection.
IAB, Institute for Employment Research.
Source: lidA study, wave 1 and 2, authors' own calculations.
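To relate the prevalences in Table 1 to the odds ratios in Table 2, a crude (unadjusted) odds ratio can be computed directly from two prevalences. The Python sketch below is illustrative only; the function names are hypothetical, and the result is not expected to equal the adjusted OR from the regression models, which control for the other factors.

```python
def odds(prevalence_pct: float) -> float:
    """Convert a prevalence given in percent into odds."""
    p = prevalence_pct / 100.0
    return p / (1.0 - p)


def crude_odds_ratio(prev_group_a: float, prev_group_b: float) -> float:
    """Odds of the factor in group A divided by the odds of the factor in group B."""
    return odds(prev_group_a) / odds(prev_group_b)


# Share of participants from the old federal states (Table 1):
# 78.6% among full consenters (C1) and 87.3% among non-consenters (C3).
crude_or_region = crude_odds_ratio(78.6, 87.3)
```

For these two prevalences the crude OR is roughly 0.53, which points in the same direction as the adjusted OR of 0.63 reported for model 1.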

Results

First, we compared the three consent groups with one another (Table 1). Between groups, there are only single-digit percentage deviations in the prevalence of the factors analysed, with the exception of the income group ‘EUR 1000 to less than EUR 2000’. Participants in this income bracket were more likely to give their consent for both data sources. We then quantified the extent to which these factors explain consent behaviour using binary logistic regression.

The regression models (Table 2) show that a few factors are clearly associated with consent behaviour, but mostly with moderate odds ratios. While year of birth, gender and marital status had no significant impact, region of residence did. Compared to participants from the so-called new federal states (i.e., former East Germany and Berlin), participants from the old federal states (i.e., former West Germany excl. West Berlin) were less likely to consent to linkage with both data sources, both relative to refusing consent entirely (model 1: OR 0.63, 95% CI 0.42–0.96) and relative to consenting to IAB data only (model 2: OR 0.50, 95% CI 0.39–0.64).

Participants with a master craftsman qualification or a degree from a technical school or specialised/advanced technical school were less likely to consent to both data sources rather than to IAB data only (OR 0.79, 95% CI 0.62–1.00) than participants with an apprenticeship training or school-based vocational education (reference). Respondents without an apprenticeship or with a non-regular apprenticeship tended to refuse their consent completely (OR 0.54, 95% CI 0.31–0.92). If they did give their consent, it tended to cover both data sources (OR 1.96, 95% CI 1.16–3.30).

With regard to income, participants with an income of EUR 3000 and more in particular (OR 0.55, 95% CI 0.39–0.78) refused to give their consent to linkage with SHI data. Participants who gave no information on their income were also substantially more likely to refuse consent to both SHI and IAB data than the reference group (net income under EUR 1000; OR 0.15, 95% CI 0.08–0.30). If they decided to give their consent, it was predominantly for IAB data only (OR 0.21, 95% CI 0.13–0.35).

For occupation, skilled commercial and administrative occupations according to Blossfeld26 served as the reference. People in skilled manual occupations (C1 vs C3: OR 2.01, 95% CI 1.10–3.64; C1 vs C2: OR 1.53, 95% CI 1.07–2.19), engineers (OR 1.56, 95% CI 1.01–2.39), people in unskilled services (OR 1.59, 95% CI 1.12–2.26) and people in unskilled commercial administrative occupations (OR 2.16, 95% CI 1.08–4.32) were more likely to consent to both data sources. Compared to the reference group, managers were more than twice as likely to consent to IAB data only (OR 2.58, 95% CI 1.18–5.63) or to both data sources (OR 2.63, 95% CI 1.23–5.59) rather than to refuse consent.

The number of diagnosed diseases was significantly associated with the likelihood of giving consent. Respondents who reported two (OR 1.37, 95% CI 1.06–1.77) or three or more (OR 1.30, 95% CI 1.02–1.66) diseases diagnosed by a doctor agreed more frequently to both data sources than the reference group (no disease). The subjective health of the participants, measured with the SF-12v2,27,28 did not play a role in giving consent. In terms of ethnic background, indicated by the variables ‘migrant background’ and ‘migrant background of parents’, no significant effect was found. With a Nagelkerke's R² between 0.063 and 0.085, the explanatory power of the logistic models is low.
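For readers unfamiliar with the fit measure, Nagelkerke's R² rescales the Cox and Snell R² by its maximum attainable value so that it can reach 1. The Python sketch below shows the standard formula. The null log-likelihood is derived from the group sizes of model 1 under the assumption that no cases were dropped, and the log-likelihood gain of 45 is a purely hypothetical value chosen for illustration, not a figure from the study.

```python
import math


def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Nagelkerke's R2 from the null and fitted model log-likelihoods."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Model 1 compares C1 (n = 3021) with C3 (n = 260).
n_c1, n_c3 = 3021, 260
n = n_c1 + n_c3

# Log-likelihood of an intercept-only model, assuming no missing cases.
ll_null = n_c1 * math.log(n_c1 / n) + n_c3 * math.log(n_c3 / n)

# Hypothetical improvement in log-likelihood achieved by the covariates.
ll_model = ll_null + 45.0

r2 = nagelkerke_r2(ll_null, ll_model, n)
```

Under these assumptions, a fairly small gain in log-likelihood already yields an R² in the range reported here, which illustrates why values of this size indicate limited explanatory power.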

Table 2 – Factors associated with consent to data linkage, multivariate binary logistic regressions for lidA study participants by consent groups.

| Factor | Model 1, C1 vs C3 (n = 3281), OR (95% CI) | Model 2, C1 vs C2 (n = 3888), OR (95% CI) | Model 3, C2 vs C3 (n = 1121), OR (95% CI) |
|---|---|---|---|
| Sociodemographic factors | | | |
| Year of birth | | | |
| 1959 (ref.) | 1.00 | 1.00 | 1.00 |
| 1965 | 0.88 (0.67–1.16) | 0.91 (0.77–1.07) | 1.02 (0.75–1.39) |
| Gender | | | |
| Female (ref.) | 1.00 | 1.00 | 1.00 |
| Male | 1.08 (0.75–1.55) | 1.09 (0.87–1.36) | 1.09 (0.72–1.63) |
| Marital status | | | |
| Married (ref.) | 1.00 | 1.00 | 1.00 |
| Other status | 1.16 (0.83–1.61) | 0.99 (0.82–1.20) | 1.19 (0.83–1.71) |
| Region of residence | | | |
| New federal states including Berlin (ref.) | 1.00 | 1.00 | 1.00 |
| Old federal states | 0.63 (0.42–0.96)* | 0.50 (0.39–0.64)* | 1.28 (0.79–2.05) |
| Level of education | | | |
| Low level of education (ref.) | 1.00 | 1.00 | 1.00 |
| Medium level of education | 1.28 (0.87–1.89) | 1.03 (0.81–1.30) | 1.18 (0.76–1.83) |
| High level of education | 1.40 (0.87–2.25) | 0.97 (0.73–1.29) | 1.44 (0.84–2.44) |
| Apprenticeship | | | |
| Apprenticeship training or school-based vocational education (ref.) | 1.00 | 1.00 | 1.00 |
| Master craftsman, technical school or specialised/advanced technical school | 0.96 (0.63–1.46) | 0.79 (0.62–1.00)* | 1.17 (0.74–1.87) |
| Technical university/university entrance qualification | 0.63 (0.39–1.02) | 0.81 (0.61–1.09) | 0.73 (0.43–1.25) |
| No/other apprenticeship | 0.54 (0.31–0.92)* | 1.96 (1.16–3.30)* | 0.32 (0.16–0.66)* |
| Income (net) | | | |
| Under €1000 (ref.) | 1.00 | 1.00 | 1.00 |
| €1000 to less than €2000 | 1.23 (0.83–1.83) | 1.24 (0.97–1.57) | 0.97 (0.62–1.51) |
| €2000 to less than €3000 | 0.84 (0.52–1.34) | 1.05 (0.79–1.41) | 0.73 (0.43–1.22) |
| €3000 and more | 0.81 (0.44–1.51) | 0.55 (0.39–0.78)* | 1.30 (0.66–2.67) |
| Refused | 0.15 (0.08–0.30)* | 0.21 (0.13–0.35)* | 0.66 (0.33–1.34) |
| Blossfeld classification | | | |
| Skilled commercial and administrative occupations (ref.) | 1.00 | 1.00 | 1.00 |
| Unskilled manual occupations | 1.72 (0.89–3.30) | 1.53 (1.00–2.34) | 1.06 (0.50–2.23) |
| Skilled manual occupations | 2.01 (1.10–3.64)* | 1.53 (1.07–2.19)* | 1.25 (0.64–2.45) |
| Technicians | 1.90 (0.98–3.72) | 1.34 (0.93–1.94) | 1.42 (0.67–3.00) |
| Engineers | 1.64 (0.80–3.35) | 1.56 (1.01–2.39)* | 1.07 (0.49–2.33) |
| Unskilled services | 1.46 (0.85–2.49) | 1.59 (1.12–2.26)* | 0.88 (0.48–1.62) |
| Skilled services | 0.83 (0.49–1.40) | 0.99 (0.70–1.41) | 0.83 (0.46–1.46) |
| Semiprofessions | 1.32 (0.85–2.04) | 1.10 (0.84–1.43) | 1.14 (0.70–1.86) |
| Professions | 2.66 (0.88–7.98) | 1.25 (0.73–2.12) | 1.92 (0.61–6.07) |
| Unskilled commercial administrative occupations | 2.16 (1.08–4.32)* | 1.18 (0.82–1.70) | 1.73 (0.82–3.65) |
| Agricultural occupations | § | 1.84 (0.76–4.48) | § |
| Managers | 2.63 (1.23–5.59)* | 1.00 (0.71–1.42) | 2.58 (1.18–5.63)* |
| Migrant background | | | |
| Born in Germany (ref.) | 1.00 | 1.00 | 1.00 |
| Not born in Germany | 0.98 (0.44–2.19) | 1.38 (0.74–2.57) | 0.86 (0.32–2.36) |
| Migrant background of parents | | | |
| No immigration (ref.) | 1.00 | 1.00 | 1.00 |
| One parent immigrated | 1.17 (0.66–2.08) | 1.37 (0.96–1.94) | 0.91 (0.48–1.74) |
| Both parents immigrated | 0.77 (0.46–1.30) | 1.14 (0.79–1.64) | 0.66 (0.36–1.21) |
| Health-related factors | | | |
| Subjective state of health (SF-12v2) | | | |
| Very good–good (ref.) | 1.00 | 1.00 | 1.00 |
| Satisfactory | 1.19 (0.86–1.65) | 1.00 (0.83–1.21) | 1.17 (0.82–1.67) |
| Less good–poor | 1.08 (0.70–1.67) | 0.84 (0.65–1.10) | 1.34 (0.83–2.18) |
| Multimorbidity | | | |
| No disease (ref.) | 1.00 | 1.00 | 1.00 |
| One disease | 1.03 (0.67–1.60) | 1.12 (0.88–1.44) | 0.90 (0.56–1.44) |
| Two diseases | 1.11 (0.71–1.73) | 1.37 (1.06–1.77)* | 0.77 (0.47–1.25) |
| Three and more diseases | 0.91 (0.60–1.39) | 1.30 (1.02–1.66)* | 0.67 (0.42–1.07) |
| Nagelkerke's R² | 0.063 | 0.085 | 0.074 |

* P ≤ 0.05; total sample size, 4148; C1, consent for statutory health insurance funds (SHI) data and IAB data; C2, consent for IAB data only; C3, non-consent.
§ Not reported, implausible value (n = 0 ‘Agricultural occupations’ in consent group C3).
SF-12v2, 12-Item Short Form Health Survey Version 2.
Source: lidA-Study, authors' own calculations.

Discussion

Epidemiological surveys depend on the participation of the entire population or, in our case, of employed persons, to find determinants of health and to derive evidence-based actions for disease prevention and health promotion. Limited free time and shift work may make it difficult for employees to participate in time-consuming studies. To record determinants of health as completely as possible, researchers resort to linking suitable data from existing data sources with the primary study data. However, linkage of such data requires the participants' consent. It is already known that several individual factors influence consent behaviour, with differing direction and size.14–23 This may lead to distortions in the linked data. Hence, consent analyses should be carried out in all studies that link study data with other data sources in order to rule out consent bias.

The lidA-study was the first in Germany to obtain a multistep informed consent to linking study data with SHI data and IAB data. The analysis of consenting behaviour shows that known potential factors with an impact on consent-giving have only a partial effect in lidA. In particular, region of residence, apprenticeship, refusal to provide income information and the number of diagnosed (often chronic) diseases proved to be statistically significant factors. This also applies to certain professional affiliations. Moreover, the statistical significance of small differences between the groups should be seen in the light of a sample size of 4244. Despite these significant effects, the explanatory power of the logistic models is low. In summary, the low model fit implies that the analysed determinants of consent do not lead to a serious bias within the linked data. With only one exception (income of EUR 1000 to less than EUR 2000), the differences in the prevalence of the factors between the groups compared are also small (Table 1). Within the regression models, a possible distortion for the group with an income of EUR 1000 to less than EUR 2000 is not significant.

We assumed that health status would have a notable influence on the participants' consent behaviour. In view of our results, we can conclude that the factor ‘multimorbidity’ exerts only a small positive influence. Self-reported health status cannot be generally associated with more positive consent behaviour. This coincides with the results reported from other studies.17,18 For the lidA-study, this means that a linked data set may be used in further analyses without substantial distortions.

However, we cannot exclude the possibility that other factors not captured in the CAPI, or the sequence of the multistep consent (panel participation [wave 1 only], IAB data, SHI data), may have an impact on consent-giving behaviour. These may be, for example, participants' positive or negative experiences with their SHI or the Federal Employment Agency.

Within a former study, in which the participants' written consent was requested for IAB data only, the consent rate was similarly high (92.0%).14 The consent rate to SHI data (73.0%) is also on the same level as in both KORA studies (63.8% and 77.5%).2 So it can be assumed that the SHI data were classified as comparatively more sensitive than the IAB data by the participants.13,24 The broad willingness of 93.8% (n = 3918) of the participants who took part in both waves to consent at least once and, in case of a change of health insurance fund, to renew their consent indicates that the interviewers and the study enjoyed a high level of trust. Despite the high administrative and legal demands, these results should encourage researchers to consider record linkage in future studies.

Since February 2016, interested researchers have been able to request the scientific use file of lidA via the Research Data Centre (FDZ) of the German Federal Employment Agency at the IAB.12,29

Author statements

Ethical approval

The lidA Study was approved by the Ethics Committees of the University of Wuppertal (Study Coordination) and the University of Magdeburg.

Funding

This work was supported by the German Federal Ministry of Education and Research, BMBF [grant numbers 01ER0825, 01ER0826, 01ER0827 and 01ER0806].

Competing interests

None declared.

References

1. Holle R, Happich M, Löwel H, Wichmann HE. KORA – a research platform for population based health research [KORA – Eine Forschungsplattform für bevölkerungsbezogene Gesundheitsforschung]. Gesundheitswesen 2005;67(1):19–25.
2. John J, Krauth C. Verknüpfung von Primärdaten mit Daten der gesetzlichen Krankenversicherung in gesundheitsökonomischen Evaluationsstudien: Erfahrungen aus zwei KORA-Studien. In: Swart E, Ihle P, editors. Routinedaten im Gesundheitswesen. Handbuch Sekundärdatenanalyse: Grundlagen, Methoden und Perspektiven. 1st ed. Bern: Huber; 2005.
3. Völzke H, Alte D, Schmidt CO, Radke D, Lorbeer R, Friedrich N, et al. Cohort profile: the study of health in Pomerania. Int J Epidemiol 2011;40:294–307.
4. Schmidt CO, Reber K, Baumeister SE, Schminke U, Völzke H, Chenot J-F. Die Integration von Primär- und Sekundärdaten in der Study of Health in Pomerania und die Beschreibung von klinischen Endpunkten am Beispiel Schlaganfall [Integration of primary and secondary data in the Study of Health in Pomerania and description of clinical outcomes using stroke as an example]. Gesundheitswesen 2015;77(2):5.
5. March S, Rauch A, Bender S, Ihle P. Data protection aspects concerning the use of social or routine data. FDZ-Methodenreport 12/2015. 2015. Available at: http://doku.iab.de/fdz/reporte/2015/MR_12-15_EN.pdf [last accessed 11 November 2016].
6. Hasselhorn HM, Peter R, Rauch A, Schröder H, Swart E, Bender S, et al. Cohort profile: the lidA Cohort Study-a German cohort study on work, age, health and work participation. Int J Epidemiol 2014;43(6):1736–49.
7. Dorner M, Heining J, Jacobebbinghaus P, Seth S. The sample of integrated labour market biographies. Schmollers Jahrbuch: Zeitschrift für Wirtschafts- und Sozialwissenschaften J Appl Soc Sci Stud 2010;130(4):599–608.
8. Swart E. Health care utilization research using secondary data. In: Janssen C, Swart E, von Lengerke T, editors. Health care utilization in Germany. New York, NY: Springer New York; 2014. p. 63–86.
9. GKV Spitzenverband. Krankenkassenliste [Register of German statutory health insurance funds]. Available at: https://www.gkv-spitzenverband.de/service/versicherten_service/krankenkassenliste/krankenkassen.jsp [last accessed 04 February 2017].
10. March S, Powietzka J, Stallmann C, Swart E. Viele Krankenkassen, Fusionen und deren Bedeutung für die Versorgungsforschung mit Daten der Gesetzlichen Krankenversicherung in Deutschland – Erfahrungen aus der lidA-(leben in der Arbeit)-Studie [The Significance of a Large Number of Health Insurance Funds and Fusions for Health Services Research with Statutory Health Insurance Data in Germany – Experiences of the lidA Study]. Gesundheitswesen 2015;77(2):e32–6.
11. Ferrie JE. IJE series old and new. Int J Epidemiol 2014;43(6):1689–90.
12. Tophoven S, Wurdack A, Rauch A, Munkert C, Bauer U. lidA – leben in der Arbeit. German cohort study on work, age and health. Documentation for waves 1 and 2. FDZ-Datenreport 1/2016. 2016. Available at: http://doku.iab.de/fdz/reporte/2016/DR_01-16_EN.pdf [last accessed 08 February 2017].
13. Schröder H, Kersting A, Gilberg R, Steinwede J. Methodenbericht zur Haupterhebung lidA - leben in der Arbeit. FDZ-Methodenreport 01/2013. 2013. Available at: http://doku.iab.de/fdz/reporte/2013/MR_01-13.pdf [last accessed 08 February 2017].
14. Antoni M. Linking survey data with administrative employment data: the case of the German ALWA survey. 2011. Available at: http://www.norc.org/PDFs/October%202011%20Utilizing%20Administrative%20Data%20Conference/4.%20Antoni%20Linkage_October2011.pdf [last accessed 10 February 2017].
15. Hartmann J, Krug G. Verknüpfung von personenbezogenen Prozess- und Befragungsdaten – Selektivität durch fehlende Zustimmung der Befragten? [Record Linkage of Register and Survey Data – is there selection bias from requiring respondents to give their consent?] ZAF 2009;42(2):121–39.
16. Huang N, Shih S, Chang H, Chou Y. Record linkage research and informed consent: who consents? BMC Health Serv Res 2007;7:18.
17. Knies G, Burton J, Sala E. Consenting to health record linkage: evidence from a multi-purpose longitudinal survey of a general population. BMC Health Serv Res 2012;12:52.
18. Knies G, Burton J. Analysis of four studies in a comparative framework reveals: health linkage consent rates on British cohort studies higher than on UK household panel surveys. BMC Med Res Methodol 2014;14:125.
19. Korbmacher JM, Schröder M. Consent when linking survey data with administrative records: the role of the interviewer. Survey Res Methods 2013;12(2):115–31.
20. Tate AR, Calderwood L, Dezateux C, Joshi H. Mother's consent to linkage of survey data with her child's birth records in a multi-ethnic national cohort study. Int J Epidemiol 2006;35(2):294–8.
21. Beste J. Selektivitätsprozesse bei der Verknüpfung von Befragungs- mit Prozessdaten: Record Linkage mit Daten des Panels „Arbeitsmarkt und soziale Sicherung“ und administrativen Daten der Bundesagentur für Arbeit. FDZ-Methodenreport 09/2011. 2011. Available at: http://doku.iab.de/fdz/reporte/2011/MR_09-11.pdf [last accessed 11 November 2016].
22. Young AF, Dobson AJ, Byles JE. Health services research using linked records: who consents and what is the gain? Aust N Z J Public Health 2001;25(5):417–20.
23. Harris T, Cook DG, Victor C, Beighton C, Dewilde S, Carey I. Linking questionnaires to primary care records: factors affecting consent in older people. J Epidemiol Community Health 2005;59(4):336–8.
24. Steinwede J, Kleudgen M, Häring A, Schröder H. Methodenbericht zur Haupterhebung lidA – leben in der Arbeit, 2. Welle. FDZ-Methodenreport 07/2015. 2015. Available at: http://doku.iab.de/fdz/reporte/2015/MR_07-15.pdf [last accessed 23 October 2015].
25. Kreienbrock L, Pigeot I, Ahrens W. Epidemiologische Methoden. 5th ed. Heidelberg: Spektrum Akademischer Verlag; 2011.
26. Blossfeld H. Labor-market entry and the sexual segregation of careers in the Federal Republic of Germany. Am J Sociol AJS 1987;93(1):89–118.
27. Ware JE, Kosinski M, Turner-Bowker DM, Gandek B. How to score version 2 of the SF-12 health survey (with a supplement documenting version 1). Lincoln, R.I., Boston, Mass: QualityMetric Inc.; Health Assessment Lab; 2002.
28. Nübling M, Andersen HH, Mühlbacher A. Entwicklung eines Verfahrens zur Berechnung der körperlichen und psychischen Summenskalen auf Basis der SOEP-Version des SF 12 (Algorithmus). Berlin: DIW; 2006. Available at: https://www.diw.de/documents/publikationen/73/diw_01.c.44987.de/diw_datadoc_2006-016.pdf [last accessed 13 February 2017].
29. Research Data Centre (FDZ) of the German Federal Employment Agency (BA) at the Institute for Employment Research. lidA – leben in der Arbeit lidA – Survey Data. Available at: http://fdz.iab.de/en/FDZ_Individual_Data/lidA.aspx [last accessed 08 February 2017].