Abstract: In this paper, I present a general modeling framework for nonparametric models with endogenous regressors and heterogeneity. I show that many existing models in the literature can be derived from a structural equation with unobserved heterogeneity by imposing constancy assumptions on its first and second derivatives. I then consider a less restrictive model that imposes constancy assumptions only on the second partial derivative of the structural equation. Assuming the existence of suitable instrumental variables, I provide identification results and show that the model can be estimated using a generalized control function approach. I apply the framework to the estimation of the returns to education in Chile, exploiting variation across regions and cohorts in educational infrastructure and compulsory schooling laws. Using penalized spline functions to approximate the components of the average structural function, I find that the local average returns to schooling are highly nonlinear and are typically underestimated by flexible models that ignore the endogeneity of schooling. I also find evidence of credential effects for high school and college graduates, and limited evidence of comparative advantage bias in the returns to certain levels of education.
Keywords: Nonparametric regression, endogenous regressors, control function, endogenous treatment, returns to schooling.
JEL classification: C14, C21, C31, J3