{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Week 3 Lab "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# **Estimating the Return to Schooling**\n",
"\n",
"One of the most often studies problems in labor economics is the **return to schooling**. In a Beckerian human capital investment model, people *choose* the level of schooling that maximizes the present value of their lifetime earnings. The main tradeoff that people face is this: more schooling gives people a higher earnings trajectory, but it also delays their entry into the labor market. For example, by studying for a PhD degree you forgo four (maybe six or eight) years of full-time work, but once you enter the labor market, you will have a relatively high earnings level.\n",
"\n",
"Labor economist have proposed a simple regression based way of estimating the return to schooling. It boils down to this specification:\n",
"\n",
"* **specification A**: $\\qquad \\log earn = \\beta_1 + \\beta_2 educ + u_A$\n",
"\n",
"If $E(u_A \\cdot educ) = 0$ then we can estimate $\\beta_2$ consistently via OLS.\n",
"\n",
"Specification A is a bit too simple, the canonical regression is:\n",
"\n",
"* **specification B**: $\\qquad \\log earn = \\beta_1 + \\beta_2 educ + \\beta_3 exper + \\beta_4 expersq/100 + u_B$,\n",
"\n",
"where $exper$ is work experience and $expersq$ is its square (why is it included?).\n",
"\n",
"In one influential paper titled *Using Geographic Variation in College Proximity to Estimate the Return to Schooling* (find the 1993 NBER version!), David Card uses the following specification:\n",
"\n",
"* **specification C**: $\\qquad \\log earn = \\beta_1 + \\beta_2 educ + \\beta_3 exper + \\beta_4 expersq/100 + \\beta_5 black + \\beta_6 south + \\beta_7 smsa + \\beta_8 smsa66 + \\beta_9 reg661 + \\cdots + \\beta_{16} reg668 + u_C$ \n",
"\n",
"Card uses the *National Longitudinal Survey of Young Men* from the US. He mostly relies on data from the year 1976.\n",
"\n",
"The extra variables in Card's model are:\n",
"\n",
"* $black$ is a dummy if the person is African American\n",
"* $south$ is a dummy if the person is from the American South (which, on average, has a lower level of development)\n",
"* $smsa$ is a dummy if the person is from a standard metropolitan statistical area (a city) in 1976 (the time of the survey)\n",
"* $smsa66$ is a dummy if the person's location was in a SMSA in 1966\n",
"* $reg661$ to $reg668$ are regional dummies\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Your Job\n",
"\n",
"Estimate Card's specification using **ordinary least squares** estimation. In particular:\n",
"* estimate $\\beta_1$ through $\\beta_{16}$\n",
"* obtain an estimate for the asymptotic covariance (assume homoskedasticity)\n",
"* obtain standard errors for the coefficient estimates\n",
"* construct a confidence interval for the return to schooling\n",
"* provide the t-statistic for the return to schooling\n",
"\n",
"Compare your results to Card's table 2. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Formulas needed:\n",
"\n",
"Before you start coding, please collect all necessary formulas here:\n",
"\n",
"* OLS estimator\n",
"* asymptotic variance matrix and its estimator\n",
"* standard errors\n",
"* confidence interval for $\\beta_2$\n",
"* t-statistic for $\\beta_2$\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Julia Coding"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Reading data\n",
"\n",
"Make sure to put the file `card.csv` in the same directory as your Julia notebook."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# read csv-file\n",
"using DelimitedFiles\n",
"data = readdlm(\"card.csv\", ',');"
]
},
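{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell above loads everything into one matrix. If `card.csv` starts with a row of column names (an assumption, so check your copy of the file), that row ends up in `data` as strings. The following is a minimal sketch, on a made-up two-column file, of how the `header=true` option of `readdlm` separates the names from the numbers. The file contents and the names `path`, `toy` and `colnames` are purely illustrative.\n",
"\n",
"```julia\n",
"using DelimitedFiles\n",
"\n",
"# write a tiny made-up CSV with a header row to a temporary file\n",
"path = tempname()\n",
"write(path, \"educ,exper\\n12,5\\n16,3\\n\")\n",
"\n",
"# header=true returns a tuple: (numeric data, 1-by-k matrix of column names)\n",
"toy, colnames = readdlm(path, ',', Float64; header=true)\n",
"\n",
"size(toy)        # (2, 2)\n",
"vec(colnames)    # column names as a vector of strings\n",
"```\n",
"\n",
"With the names in hand, something like `findfirst(isequal(\"educ\"), vec(colnames))` returns the position of a column, which is less error-prone than counting columns by eye."
]
},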
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Extracting variables"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# here's how you do this for the dependent variable\n",
"Y = Array{Float64}(data[:, 33])\n",
"\n",
"# now create an n-by-k matrix X by grabbing the correct columns from the data matrix"
]
},
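{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of the column-grabbing pattern, shown on a made-up matrix so that it does not touch `data`. The column positions `educ_col` and `exper_col` are hypothetical; the real positions have to be looked up in `card.csv`. The sketch builds only the regressors of specification B, and the remaining dummies of specification C would be appended in the same way.\n",
"\n",
"```julia\n",
"# toy 4-by-3 matrix standing in for `data` (all values are made up)\n",
"toy = [1.0 12.0 5.0;\n",
"       2.0 16.0 3.0;\n",
"       3.0 10.0 8.0;\n",
"       4.0 14.0 6.0]\n",
"\n",
"n = size(toy, 1)\n",
"\n",
"# hypothetical column positions, NOT the ones in card.csv\n",
"educ_col, exper_col = 2, 3\n",
"\n",
"educ  = Array{Float64}(toy[:, educ_col])\n",
"exper = Array{Float64}(toy[:, exper_col])\n",
"\n",
"# bind a constant, the regressors, and experience squared divided by 100\n",
"X = hcat(ones(n), educ, exper, exper .^ 2 ./ 100)\n",
"\n",
"size(X)   # (4, 4)\n",
"```\n",
"\n",
"The column of ones is what delivers the intercept $\\beta_1$, so it is easy to forget and worth checking with `size(X)` before estimating anything."
]
},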
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Estimations"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"\n",
"# Good luck!\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Julia 1.1.0",
"language": "julia",
"name": "julia-1.1"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.7.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}