Previous Title Page Contents NCRVE Home


APPENDIX

Permanent Wage Estimation

We use the following model to smooth short-term fluctuations out of an individual's wages: a set of fixed effects to capture the average curve of the wage profile over age, a set of random effects to isolate the heterogeneity in permanent wage gains among individuals, and a residual term to represent the transitory components of wage change within each individual profile (cf. Bernhardt, Morris, Handcock, & Scott, 1998a; Gottschalk & Moffitt, 1994; Haider, 1997; Moffitt & Gottschalk, 1995; Stevens, 1996).

The permanent and transitory components of wage-profile heterogeneity are specified as follows:

$$y_{it} = \mu_{it} + e_{it},$$

where $y_{it}$ is the log of the real wage of individual $i$ in year $t$. The average wage profile $\mu_{it}$ is specified by

$$\mu_{it} = \beta_0 + \beta_1 l_{it} + \beta_2 q_{it},$$

where $l_{it}$ and $q_{it}$ are the linear and quadratic age terms, respectively. In this specification, we do not include any additional explanatory covariates such as education and experience, because our goal is to smooth short-term fluctuations out of the wage trajectory. Such covariates will be included in models to explain the growth in our permanent wage estimates. The coefficients $\beta_0$, $\beta_1$, and $\beta_2$ are average-level ("fixed-effect") parameters. We have parameterized $l_{it}$ as the age of individual $i$ in year $t$ centered on age 16, and $q_{it}$ as the quadratic term centered on age 16 and orthogonal to $l_{it}$. The random-effects component is specified as
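The construction of the two age terms can be illustrated with a short sketch. The source does not state over which set of ages the quadratic term is made orthogonal to the linear term; the version below assumes a grid of integer ages 16 through 36 and a single Gram-Schmidt step, and the function name `age_terms` is hypothetical.

```python
def age_terms(ages, center=16):
    """Return (l, q): the linear age term centered on `center`, and a
    quadratic term made orthogonal to l over the supplied ages.
    The choice of age grid is an assumption, not taken from the source."""
    l = [a - center for a in ages]
    raw_q = [x * x for x in l]
    # One Gram-Schmidt step: subtract the projection of the raw quadratic on l.
    coef = sum(r * x for r, x in zip(raw_q, l)) / sum(x * x for x in l)
    q = [r - coef * x for r, x in zip(raw_q, l)]
    return l, q


ages = list(range(16, 37))  # assumed age grid, 16 through 36
l, q = age_terms(ages)
dot = sum(x * y for x, y in zip(l, q))
print(l[0], l[-1])      # linear term at ages 16 and 36
print(abs(dot) < 1e-6)  # q is orthogonal to l up to rounding error
```

Orthogonalizing the quadratic term against the linear term keeps the two age regressors from being collinear, which makes the linear and quadratic coefficients easier to estimate separately.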

$$e_{it} = p_{it} + u_{it},$$

where we define $p_{it}$ as the permanent component and $u_{it}$ as the transitory component. Specifically,

$$p_{it} = b_{0i} + b_{1i} l_{it} + b_{2i} q_{it}.$$

Thus, $p_{it}$ is a random quadratic representing the deviation of the individual-specific wage profile from the average wage profile. Under this parameterization, $b_{0i}$, $b_{1i}$, and $b_{2i}$ represent the deviations from their fixed-effects counterparts. We model $b_{0i}$, $b_{1i}$, and $b_{2i}$ as samples from a mean-zero trivariate Gaussian distribution. We assume $u_{it}$ is mean-zero and allow the variance of $u_{it}$ to vary by calendar year to capture any business cycle effects.
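To make the decomposition concrete, the following minimal sketch simulates one individual's log wages as the sum of the average profile, the permanent deviation, and a transitory term whose standard deviation depends on the calendar year. All numeric values, the function name `simulate_log_wages`, and its arguments are hypothetical; for brevity the individual's random effects are passed in directly rather than drawn from a trivariate Gaussian.

```python
import random


def simulate_log_wages(l, q, years, beta, b, sd_by_year, rng):
    """Simulate y_it = mu_it + p_it + u_it for a single individual.

    l, q       : linear and quadratic age terms, one value per observation
    years      : calendar year of each observation
    beta       : (beta_0, beta_1, beta_2), the fixed effects
    b          : (b_0i, b_1i, b_2i), this individual's random effects
    sd_by_year : standard deviation of the transitory term in each year
    """
    y = []
    for l_t, q_t, year in zip(l, q, years):
        mu = beta[0] + beta[1] * l_t + beta[2] * q_t  # average profile
        p = b[0] + b[1] * l_t + b[2] * q_t            # permanent deviation
        u = rng.gauss(0.0, sd_by_year[year])          # transitory component
        y.append(mu + p + u)
    return y


# Hypothetical inputs, for illustration only.
rng = random.Random(0)
y = simulate_log_wages(
    l=[2, 3, 4],
    q=[-26.7, -37.1, -45.5],
    years=[1980, 1981, 1982],
    beta=(1.8, 0.04, -0.001),
    b=(0.10, 0.01, 0.0),
    sd_by_year={1980: 0.15, 1981: 0.20, 1982: 0.15},
    rng=rng,
)
print(y)
```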

The individual-specific wage profile is the combination of the average wage profile and the individual-specific deviation: $\mu_{it} + p_{it}$. The parameters in our model are estimated using restricted maximum likelihood (REML), which is asymptotically efficient under the assumption of Gaussianity. The approach provides a best linear unbiased estimator (BLUE) for the individual-specific wage profile; we use these estimators to estimate wage growth across a twenty-year span, as we now describe.

Our outcome variable is the growth in log hourly wages from age 16 to age 36. The survey spans a period of only sixteen years, but individuals enter and complete the survey at different ages. For example, some individuals are observed from ages 14 to 30, others from ages 18 to 34, and still others from ages 21 to 37. Using all of these observations, we construct a model that predicts overall wage growth between ages 16 and 36. The model assumes that individual wage trajectories are well described by a quadratic curve. This is a standard assumption, since it is an empirical feature of wage trajectories that has been extensively documented in labor economics (Murphy & Welch, 1990).
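Under the model specified earlier, the outcome can be written out explicitly. Writing $l(a)$ and $q(a)$ for the age terms evaluated at age $a$, and $g_i$ for the growth of individual $i$ (notation introduced here for illustration only), the intercepts cancel when the smoothed profile at age 16 is subtracted from the smoothed profile at age 36:

```latex
\begin{aligned}
g_i &= \left[\mu_i(36) + p_i(36)\right] - \left[\mu_i(16) + p_i(16)\right] \\
    &= (\beta_1 + b_{1i})\,\left[l(36) - l(16)\right]
     + (\beta_2 + b_{2i})\,\left[q(36) - q(16)\right] \\
    &= 20\,(\beta_1 + b_{1i})
     + (\beta_2 + b_{2i})\,\left[q(36) - q(16)\right].
\end{aligned}
```

The last line uses $l(16) = 0$ and $l(36) = 20$, which follow from centering the linear term on age 16. The transitory term does not appear because growth is measured on the smoothed profile.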

The extrapolations to ages 16 and 36 are based on the BLUE for each individual (cf. Robinson, 1991), but can be understood intuitively as follows: the observed portion of the wage trajectory is matched to a quadratic curve that has the same basic shape after removing short-term variation in wages. That shape over the observed period corresponds to an individual-specific shape during the unobserved periods (it is uniquely determined by the three parameters $b_{0i}$, $b_{1i}$, and $b_{2i}$). Our estimates of wage growth across an identical twenty-year age span for each individual are optimal in a statistical sense as BLUEs, and substantively they are constructed by borrowing information from all of the trajectories, so they are grounded in observed wage growth trends.
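The idea of borrowing information across trajectories can be sketched in a deliberately simplified setting. The example below assumes a random-intercept-only model with known variance components, not the full random quadratic used here. In that simplified case, the best linear unbiased predictor of an individual's deviation is the individual's mean residual multiplied by a shrinkage factor that approaches one as the number of observations grows. The function name and the variance values are hypothetical.

```python
def shrunken_intercept(residuals, var_permanent, var_transitory):
    """Predict b_0i in the simplified model r_it = b_0i + u_it, where r_it
    is the residual from the average profile and both variances are
    treated as known (an assumption made for illustration)."""
    n = len(residuals)
    mean_resid = sum(residuals) / n
    # Weight on the individual's own data; the remainder is shrinkage
    # toward the population mean deviation of zero.
    shrink = var_permanent / (var_permanent + var_transitory / n)
    return shrink * mean_resid


few = [0.30, 0.10, 0.20]  # three observations, mean residual 0.2
many = few * 4            # twelve observations, same mean residual
pred_few = shrunken_intercept(few, var_permanent=0.09, var_transitory=0.04)
pred_many = shrunken_intercept(many, var_permanent=0.09, var_transitory=0.04)
print(pred_few, pred_many)
```

With fewer observations, the prediction is pulled further toward the average profile; with more, it stays closer to the individual's own mean residual.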

Having established that our extrapolation scheme is statistically and substantively sound, we must consider whether the extrapolation is affected by the educational pathways. This could arise if the wage growth measure were based on different numbers of observations, depending on the pathway. Pathways involving more time in the labor market probably have more wage observations, while pathways involving less time in the labor market probably have fewer wage observations. In the latter case, our estimates would have greater uncertainty associated with them, but there is no reason to assume a priori that they would be biased.

We computed the average number of wage observations per individual for the three pathways: (1) clean, (2) working, and (3) interrupted. The latter two are nearly identical, with means of 10.9 and 11.3 observations, respectively. The mean for clean pathways was somewhat lower, at 8.7, which is to be expected since these respondents never work while in school. This difference is not of great concern for two reasons. First, our extrapolation model ultimately makes an optimal match no matter how many wages are observed, and we should emphasize that having about nine observations per individual is a substantial amount of information.

Second, we performed a goodness-of-fit analysis over the observed portion of the data and did not find a bias due to pathway. Specifically, we computed a mean squared error (MSE) for each individual, summarizing the difference between the predicted curve and the observed wages. We then compared the means of the MSEs by pathway and found no strong differences in this goodness-of-fit measure. Thus, there is no evidence to suggest that our extrapolation method is better or worse for a particular pathway.
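The goodness-of-fit comparison can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used for the study: the record layout, the function names, and all wage values are hypothetical.

```python
from collections import defaultdict


def individual_mse(observed, predicted):
    """Mean squared difference between observed and predicted log wages."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def mean_mse_by_pathway(records):
    """Average the per-individual MSEs within each pathway.

    records: iterable of (pathway, observed_wages, predicted_wages),
    one tuple per individual (a hypothetical layout).
    """
    mses = defaultdict(list)
    for pathway, observed, predicted in records:
        mses[pathway].append(individual_mse(observed, predicted))
    return {pathway: sum(v) / len(v) for pathway, v in mses.items()}


# Hypothetical data: one tuple per individual.
records = [
    ("clean", [2.0, 2.2], [2.1, 2.1]),
    ("working", [1.8, 2.0, 2.1], [1.9, 2.0, 2.0]),
    ("working", [2.5, 2.4], [2.5, 2.5]),
    ("interrupted", [1.5, 1.9], [1.6, 1.8]),
]
result = mean_mse_by_pathway(records)
for pathway, value in sorted(result.items()):
    print(pathway, round(value, 4))
```

Averaging per-individual MSEs, rather than pooling all squared errors, gives each individual equal weight regardless of how many wage observations that individual contributes.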

