Revisiting the shape of earnings nonresponse


Economics Letters 184 (2019) 108663

Contents lists available at ScienceDirect

Economics Letters journal homepage: www.elsevier.com/locate/ecolet

Revisiting the shape of earnings nonresponse ∗

Mark A. Klee, Rebecca L. Chenevert 1, Kelly R. Wilkin

Social, Economic, and Housing Statistics Division, U.S. Census Bureau, 4600 Silver Hill Road, Washington, DC 20233, United States

Article info

Article history: Received 22 May 2019; received in revised form 29 August 2019; accepted 30 August 2019; available online 5 September 2019.

JEL classification: C8, J31, D31

Abstract

Previous research shows a “U-shaped” relationship between earnings and survey earnings nonresponse. We demonstrate that this pattern depends upon the treatment of individuals who worked according to tax data but lack work in surveys. Including these individuals reveals a wave of earnings nonresponse that is increasing in the tails and decreasing for middle earnings quantiles. We illustrate that individuals with positive earnings in tax data yet survey reports of nonemployment lie disproportionately at the bottom of the earnings distribution, bending down the left tail of the traditional “U-shaped” earnings nonresponse pattern. The reporting behavior of survey nonworkers can have important implications for evaluating inequality estimates based on survey data.

Published by Elsevier B.V.

Keywords: Administrative data; Survey data quality; Earnings; Nonresponse

∗ Corresponding author. E-mail addresses: [email protected] (M.A. Klee), [email protected] (R.L. Chenevert), [email protected] (K.R. Wilkin).

1 Present affiliation: Congressional Budget Office, United States.

https://doi.org/10.1016/j.econlet.2019.108663
0165-1765/Published by Elsevier B.V.

1. Introduction

Earnings nonresponse is a longstanding issue in household surveys. The U.S. Census Bureau replaces missing earnings data with reports from observationally similar “donors” via hot-deck imputation. This procedure assumes independence of earnings and nonresponse conditional on the variables used to match donors to nonrespondents. Lillard et al. (1986) first conjectured that earnings nonresponse has a “U-shaped” relationship to earnings, implying donors likely have earnings closer to the median than nonrespondents. Kline and Santos (2013) confirmed this hypothesis using the March Current Population Survey (CPS) linked to tax data. Bollinger et al. (forthcoming) and Hokayem et al. (2015) used linked survey and tax data to illustrate that “U-shaped” nonresponse dampens inequality estimates and attenuates poverty rate estimates.

However, these analyses consider the shape of earnings nonresponse among survey workers only. In this note, we demonstrate that this pattern depends upon the treatment of individuals who worked according to tax data but reported nonemployment to surveys in the same year. The reporting pattern of survey nonworkers potentially affects results for studies of inequality and

poverty. We characterize administrative earnings of survey nonworkers and discuss how their reporting behavior affects the shape of earnings nonresponse.

2. Data

We use data from the “core” files of the 2008 panel of the Survey of Income and Program Participation (SIPP). The SIPP is a large, nationally representative longitudinal dataset, which interviewed sampled respondents every four months for up to 64 months. SIPP primarily exists to measure individual and household income dynamics, collecting a rich set of monthly earnings variables at each interview. Total monthly earnings include those from a possible two jobs, two self-employed businesses, “moonlighting” work, and severance pay.

We follow prior literature and identify nonresponse as any positive monthly earnings amount that is imputed, but we depart from previous work in our treatment of records with $0 earnings. Prior studies assume all survey earnings values of $0 are reported (e.g., Hokayem et al., 2014). However, this assumption ignores the relationship between a respondent’s employment status and their recorded earnings. The survey only asks earnings questions of respondents who report having employment during the reference period (U.S. Census Bureau, 2001). Individuals without survey employment are assigned $0 earnings regardless of whether their nonemployment was reported or imputed. Thus, existing studies treat individuals imputed to nonemployment as reporting $0 earnings, which confounds employment-status imputation error with earnings reporting error. Therefore, we regard $0 survey earnings as a nonresponse if employment status is also imputed on the same record.

Our administrative tax data come from the Social Security Administration’s Detailed Earnings Record (DER) for years 2009–2012. This dataset offers uncapped earnings from two sources: (1) annual deferred and nondeferred W-2 earnings for each job held and (2) taxable self-employment income taken from Form 1040 Schedule SE for self-employed individuals. We aggregate deferred and nondeferred earnings across all DER jobs and businesses up to the person–year level.

Administrative data are critical to our analysis because they provide a measure of earnings for most survey nonrespondents. However, some sample members cannot be linked to tax data due to insufficient identifying information. Bond et al. (2014) describe the linkage process and note characteristics associated with inability to link to tax data, such as mobility, minority status, and nonemployment. Following the literature, we reweight the sample by the inverse predicted probability of linkage to account for nonrandom ability to link to tax data.

Our full sample consists of about 181,000 person–year observations for individuals aged 15 and over, who link to administrative records, and who were present in the SIPP sample for all months of a calendar year. To maintain consistency with annual administrative earnings, we use an indicator for whether earnings were imputed in any month of the year.

Fig. 1. The Shape of Survey Earnings Nonresponse. The left (right) panel plots the “U-shaped” (wave) pattern of earnings nonresponse by positive administrative earnings centile excluding (including) survey nonworkers. Source: SIPP and DER, 2009–2012.

3. Results

Fig. 1 plots the survey earnings nonresponse rate by centile of the positive DER earnings distribution. The left panel replicates the “U-shaped” pattern of earnings nonresponse in a sample

that excludes survey nonworkers. The right panel, which includes survey nonworkers with DER earnings, illustrates a wave of earnings nonresponse, increasing in the tails and decreasing for middle earnings quantiles.

Fig. 2 plots coefficient estimates of an ordinary least squares regression of earnings nonresponse on DER earnings centile indicators and observables. The qualitative effect of including survey nonworkers persists after controlling for observable heterogeneity.2 Other explanatory variables include numbers of people in the household and family, a quartic in age, years since sample entry, number of records for that year in the DER, indicators for MSA size, region, sex, race and ethnicity, marital status, educational attainment, the presence of own children under 18, change in family composition, speaking a language other than English in the home, means-tested transfer program receipt, foreign-born and citizenship status, proxy response, eventual attrition, and temporarily leaving the sample at any time.

2 All comparisons are statistically significant at the 90-percent level. All estimates are based on responses from a sample of the population and may differ from actual values because of sampling variability or other factors. As a result, apparent differences between the estimates for two or more groups may not be statistically significant. For more information on the source of the data and the accuracy of the estimates, see https://www.census.gov/programs-surveys/sipp/tech-documentation/source-accuracy-statements.html.

Fig. 2. The Shape of Survey Earnings Nonresponse Controlling for Observable Heterogeneity. The left (right) panel plots the coefficient estimates from an OLS regression of an earnings nonresponse indicator on administrative earnings centile excluding (including) survey nonworkers, with spikes representing 95% confidence intervals. Source: SIPP and DER, 2009–2012.

Hokayem et al. (2014) also documented a wave of earnings nonresponse when including survey nonworkers from the CPS Annual Social and Economic Supplement (ASEC) linked to DER. Our contribution is to explain the deviation from the expected “U-shape” by documenting survey nonworkers’ prevalence and nonresponse rate to employment status questions across the positive DER earnings distribution. Fig. 3 shows that about 77% of

individuals in the bottom centile lack work in SIPP. While imputation presents one potential source of the disagreement between administrative and survey measures of work, Fig. 3 also illustrates that in the bottom centile only 9.1% of survey nonworkers had their lack of work imputed.3 Individuals with relatively small, positive DER earnings disproportionately report $0 of SIPP earnings. Including survey nonworkers thus bends down the left tail of the traditional “U-shaped” pattern of earnings nonresponse.

3 Due to the small number of survey nonworkers at the top of the DER earnings distribution, we pool two centiles to construct the percentage of survey nonworkers above the 60th centile and five centiles to construct the nonresponse rate among survey nonworkers above the 35th centile.

Fig. 3. The Prevalence and Response Patterns of Survey Nonworkers. The circles illustrate a sharply decreasing frequency of survey nonworkers by positive administrative earnings centile. The triangles illustrate a low earnings nonresponse rate among survey nonworkers in the bottom quintile of the positive administrative earnings distribution. Source: SIPP and DER, 2009–2012.

The reporting behavior of survey nonworkers can have important implications for evaluating inequality estimates based on survey data. Bollinger et al. (forthcoming) use a CPS–DER hybrid measure, consisting of CPS earnings for respondents and DER earnings for nonrespondents, to argue that imputation plays an important role in explaining the gap between CPS ASEC and DER estimates of the 90/10 ratio. However, their sample excludes survey nonworkers, implying low earners are underrepresented among respondents and likely to have earnings hot-deck imputed from a donor closer to the median. The left panel of Fig. 4 replicates this exercise for the ratio of the 80th to the 20th earnings centiles using SIPP alone ($\theta_{SIPP}$), DER alone ($\theta_{DER}$), and a SIPP–DER hybrid ($\theta_{Hybrid}$), with similar qualitative results.4 After including survey nonworkers with positive DER earnings, the right panel illustrates that imputation explains less of the difference between SIPP and DER estimates, $\left(\frac{\theta_{Hybrid}-\theta_{SIPP}}{\theta_{DER}-\theta_{SIPP}}\right)$, which is statistically significant in 2009 and 2011. Interestingly, no $\theta_{SIPP}$ estimate exhibits a statistically significant difference from the corresponding $\theta_{DER}$ estimate.

4 Estimates are available upon request for the 90/10, 85/15, and 75/25 ratios.
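As an illustration of how the nonresponse rates in Fig. 1 could be tabulated, the following Python sketch applies the classification rule from Section 2 and computes a weighted nonresponse rate by bin of positive administrative earnings. It is a minimal sketch under stated assumptions, not the authors' code: the record fields (`der_earnings`, `survey_earnings`, `earnings_imputed`, `employment_imputed`, `weight`) are hypothetical names, survey nonworkers are proxied by $0 survey earnings, bins are formed from unweighted ranks within the selected sample, and the imputation flags are assumed to be already aggregated to the person–year level.

```python
import bisect
from collections import defaultdict


def is_nonresponse(rec):
    """Nonresponse flag for one person-year record (rule from Section 2)."""
    if rec["survey_earnings"] > 0:
        # Positive survey earnings: nonresponse if the amount was imputed.
        return rec["earnings_imputed"]
    # $0 survey earnings: nonresponse only if employment status was imputed.
    return rec["employment_imputed"]


def nonresponse_rate_by_bin(records, include_nonworkers, n_bins=100):
    """Weighted nonresponse rate by bin of positive administrative earnings.

    Assumption: bins come from unweighted ranks within the selected sample.
    """
    sample = [
        r for r in records
        if r["der_earnings"] > 0
        and (include_nonworkers or r["survey_earnings"] > 0)
    ]
    ranked = sorted(r["der_earnings"] for r in sample)
    num = defaultdict(float)
    den = defaultdict(float)
    for r in sample:
        # Count of sample earnings less than or equal to this record's earnings.
        rank = bisect.bisect_right(ranked, r["der_earnings"])
        # Ceiling of n_bins * rank / N, so bins run from 1 to n_bins.
        b = -(-n_bins * rank // len(ranked))
        num[b] += r["weight"] * is_nonresponse(r)
        den[b] += r["weight"]
    return {b: num[b] / den[b] for b in sorted(den)}


# Hypothetical toy records, for illustration only (not SIPP or DER data).
toy = [
    dict(der_earnings=500, survey_earnings=0, earnings_imputed=False,
         employment_imputed=False, weight=1.0),
    dict(der_earnings=800, survey_earnings=900, earnings_imputed=True,
         employment_imputed=False, weight=1.0),
    dict(der_earnings=30000, survey_earnings=31000, earnings_imputed=False,
         employment_imputed=False, weight=1.0),
    dict(der_earnings=40000, survey_earnings=0, earnings_imputed=False,
         employment_imputed=True, weight=1.0),
]

left_panel = nonresponse_rate_by_bin(toy, include_nonworkers=False, n_bins=2)
right_panel = nonresponse_rate_by_bin(toy, include_nonworkers=True, n_bins=2)
print(left_panel, right_panel)
```

With `n_bins=100` the bins correspond to centiles. In the toy data, adding the low earner who reported nonemployment lowers the bottom-bin rate, which is the mechanism described above for the left tail.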


Fig. 4. The Implications of Survey Nonworkers’ Reporting for Inequality Estimates. The left (right) panel plots estimates of the 80/20 ratio based on DER data only, SIPP data only, and DER-SIPP Hybrid data excluding (including) survey nonworkers with positive DER earnings. Source: SIPP and DER, 2009–2012.
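The three series in Fig. 4, and the share of the SIPP–DER gap attributed to imputation, can be sketched in Python as follows. This is a hedged illustration rather than the authors' procedure: the function names, the step-function definition of the weighted quantile, and the toy numbers are hypothetical. The hybrid series follows the description in the text (survey earnings for respondents, administrative earnings for nonrespondents).

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q (0 < q <= 1)."""
    pairs = sorted(zip(values, weights))
    threshold = q * sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return pairs[-1][0]  # guard against floating-point shortfall


def ratio_80_20(values, weights):
    """Ratio of the 80th to the 20th weighted earnings centile."""
    return (weighted_quantile(values, weights, 0.8)
            / weighted_quantile(values, weights, 0.2))


def hybrid_earnings(survey, admin, nonresponse):
    """Survey earnings for respondents, administrative earnings otherwise."""
    return [a if nr else s for s, a, nr in zip(survey, admin, nonresponse)]


def share_explained(theta_sipp, theta_der, theta_hybrid):
    """(theta_Hybrid - theta_SIPP) / (theta_DER - theta_SIPP).

    Undefined when theta_der equals theta_sipp (zero gap).
    """
    return (theta_hybrid - theta_sipp) / (theta_der - theta_sipp)


# Hypothetical toy inputs, for illustration only.
survey = [10, 20, 30, 40]
admin = [5, 20, 30, 60]
nonresponse = [True, False, False, False]
weights = [1, 1, 1, 1]

theta_sipp = ratio_80_20(survey, weights)
theta_der = ratio_80_20(admin, weights)
theta_hybrid = ratio_80_20(hybrid_earnings(survey, admin, nonresponse), weights)
share = share_explained(theta_sipp, theta_der, theta_hybrid)
print(theta_sipp, theta_der, theta_hybrid, share)
```

A share near one would mean that replacing imputed values with administrative earnings closes most of the gap. A smaller share, as reported for the right panel, leaves more of the gap to other sources such as respondents' own reports.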

4. Conclusion

Analysts comparing linked survey and administrative earnings data have previously observed a “U-shaped” relationship between earnings nonresponse and earnings. Census imputation procedures to complete missing data implicitly assume earnings nonresponse is unrelated to earnings, conditional on the observables used to match donors to nonrespondents. The pattern of survey earnings nonresponse therefore matters for the earnings estimates of analysts who include imputed earnings values. We show that the shape of earnings nonresponse depends on the treatment of individuals who worked according to tax data but had no survey employment in the same year. These individuals fall disproportionately at the bottom of the administrative earnings distribution and exhibit relatively low nonresponse rates. Including survey nonworkers thus bends down the left tail of the traditional “U-shaped” earnings nonresponse pattern, resulting in a wave of earnings nonresponse and a diminished role for imputation in explaining differences between SIPP and DER estimates of an inequality measure.

Acknowledgments

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. We thank Adam Bee, Gary Benedetto, Jonathan Eggleston, Jonathan Rothbaum, Chris Bollinger and Nikolas Mittag for helpful comments. Some results from this paper were previously circulated in an unpublished paper, “Do Imputed Earnings Earn Their Keep? Evaluating SIPP Earnings and Nonresponse with Administrative Records”. The views expressed in this paper are the authors’ and not necessarily those of CBO or the U.S. Census Bureau. The Bureau has reviewed this paper for unauthorized disclosure of confidential information and has approved its disclosure avoidance practices. CBDRB-FY19-068

References

Bollinger, C.R., Hirsch, B.T., Hokayem, C.M., Ziliak, J.P., 2019. Trouble in the tails? What we know about earnings nonresponse thirty years after Lillard, Smith, and Welch. J. Polit. Econ. (forthcoming).

Bond, B., Brown, J.D., Luque, A., O’Hara, A., 2014. The nature of the bias when studying only linkable person records: Evidence from the American Community Survey. CARRA Working Paper #2014–08.

Hokayem, C., Bollinger, C., Ziliak, J.P., 2014. The role of CPS nonresponse on the level and trend in poverty. UKCPR Discussion Paper #2014–05.

Hokayem, C., Bollinger, C., Ziliak, J.P., 2015. The role of CPS nonresponse in the measurement of poverty. J. Am. Stat. Assoc. 110 (511), 935–945.

Kline, P., Santos, A., 2013. Sensitivity to missing data assumptions: Theory and an evaluation of the U.S. wage structure. Quant. Econ. 4 (2), 231–267.

Lillard, L., Smith, J.P., Welch, F., 1986. What do we really know about wages? The importance of nonreporting and census imputation. J. Polit. Econ. 94 (3), 489–506.

U.S. Census Bureau, 2001. Survey of Income and Program Participation Users’ Guide, third ed. U.S. Census Bureau, U.S. Department of Commerce, Washington, DC.