EMET2007 Assignment 2

General Instructions

Before you start, please check the course website for important reminders on:

  • Deadlines for submission
  • Academic honesty requirements

What You Need to Do

This assignment requires two types of work:

  1. Python coding — Write your code in code cells
  2. Written explanations — Provide interpretations in markdown cells (text cells)

We have included empty code and markdown cells where you should enter your solutions. Feel free to add more cells wherever needed.

Note: You may only use Python packages that were introduced during the EMET2007 computer labs.


Your Task: Studying the Association Between Beauty and Course Evaluations

This assignment investigates whether a professor's physical attractiveness affects their teaching evaluations using the Teaching_Ratings dataset.

About the Data

The dataset contains observations on course evaluations, course characteristics, and professor characteristics for 463 courses at the University of Texas at Austin.

Good luck!


Setup: Imports and Loading Data

Run the cell below to import the required libraries.

In [ ]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import statsmodels.stats.api as sms
import statsmodels.formula.api as smf

Loading the Dataset

In [ ]:
df = pd.read_csv('https://raw.githubusercontent.com/juergenmeinecke/EMET2007/refs/heads/main/datasets/teaching_ratings.csv')

Exercise 1 (1 mark)

Create box plots and histograms for courseEval and beauty. Describe and interpret your findings.

Your task:

  1. Create appropriate visualizations for both variables
  2. Write an interpretation of what you observe about the distributions

Note: Add as many code cells and markdown cells as needed.

In [ ]:
# Your code here:

Your interpretation here:


Exercise 2 (1 mark)

Run a simple regression of courseEval on beauty and create a scatter plot with the estimated population regression function (PRF).

Your task:

  1. Estimate the regression using smf.ols()
  2. Create a scatter plot of the data
  3. Add the estimated PRF to the scatter plot
  4. Discuss: What is the direction of the association? Is the relationship statistically significant?

In [ ]:
# Your code here:

Your interpretation here:


Exercise 3 (1 mark)

Part (a): Predict course evaluations for two professors

Using your regression from Exercise 2, predict the course evaluation for:

  • Professor Watson: who has average beauty
  • Professor Stock: who has beauty one standard deviation above average

Hint: Use the .predict() method from your regression results.
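
To illustrate the syntax only, here is a minimal sketch of .predict() on made-up data. The DataFrame toy and the variable names y and x are hypothetical stand-ins, not the Teaching_Ratings variables, so the numbers it prints say nothing about the assignment data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data (assumption: NOT the Teaching_Ratings dataset).
rng = np.random.default_rng(0)
toy = pd.DataFrame({"x": rng.normal(size=200)})
toy["y"] = 4.0 + 0.1 * toy["x"] + rng.normal(scale=0.5, size=200)

toy_results = smf.ols("y ~ x", data=toy).fit()

# .predict() expects a DataFrame whose column names match the regressors
# used in the formula.
x_mean = toy["x"].mean()
x_sd = toy["x"].std()  # pandas default: sample standard deviation (ddof=1)
new_points = pd.DataFrame({"x": [x_mean, x_mean + x_sd]})

toy_predictions = toy_results.predict(new_points)
print(toy_predictions)
```

In a simple regression, the two predictions differ by exactly the slope estimate times the standard deviation of the regressor, which is a useful check on your own output.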

In [ ]:
# Your code here:

Part (b): Interpretation of the effect size

Based on your predictions, is the effect of a one-standard-deviation increase in beauty on predicted course evaluations large or small? Explain your reasoning.

Your interpretation here:


Exercise 4 (1 mark)

Part (a): Multiple regression with control variables

Run a multiple regression of courseEval on beauty and the following control variables:

  • intro (1 if introductory course)
  • oneCredit (1 if one-credit course)
  • female (1 if female professor)
  • minority (1 if minority professor)
  • nnEnglish (1 if non-native English speaker)

In [ ]:
# Your code here:

Part (b): Predict Professor Smith's course evaluation

Professor Smith is a Black male professor with average beauty, who is a native English speaker, teaching a three-credit upper-division course.

What is Professor Smith's predicted course evaluation?

In [ ]:
# Your code here:

Part (c): Predict Professor Jones's course evaluation

Professor Jones has all the same characteristics as Professor Smith, except she is female.

What is Professor Jones's predicted course evaluation?

In [ ]:
# Your code here:

Exercise 5 (1 mark)

Run the regression from Exercise 4 again, but without female as a control variable.

Your task:

  1. Compare the coefficient on beauty in this regression to the one from Exercise 4.
  2. Discuss whether omitting female leads to omitted variable bias. What are the conditions for omitted variable bias, and do they hold here?

In [ ]:
# Your code here:

Your discussion here:


Exercise 6 (1 mark)

Add an interaction between beauty and female to the regression from Exercise 4. This allows the effect of beauty to differ by gender.

Your task:

  1. Estimate the regression with the interaction term
  2. Interpret the coefficient on the interaction term
  3. Is the interaction statistically significant at the 5% level?

Hint: In statsmodels, you can include an interaction using beauty * female in the formula.
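
As a sketch of what the * operator does in a formula, the example below uses made-up data with hypothetical names y, x (continuous) and d (a 0/1 dummy). It is meant to show the syntax and the coefficient labels, not results for the assignment data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data (assumption: NOT the Teaching_Ratings dataset).
rng = np.random.default_rng(1)
n = 300
toy = pd.DataFrame({"x": rng.normal(size=n), "d": rng.integers(0, 2, size=n)})
toy["y"] = (1.0 + 0.5 * toy["x"] + 0.3 * toy["d"]
            - 0.2 * toy["x"] * toy["d"] + rng.normal(scale=0.5, size=n))

# "x * d" is shorthand for both main effects plus their product, "x + d + x:d".
res_star = smf.ols("y ~ x * d", data=toy).fit()
res_long = smf.ols("y ~ x + d + x:d", data=toy).fit()

# The interaction coefficient is labelled "x:d" in the results.
print(list(res_star.params.index))

# Slope of x when d = 0, and slope of x when d = 1.
slope_d0 = res_star.params["x"]
slope_d1 = res_star.params["x"] + res_star.params["x:d"]
print(slope_d0, slope_d1)
```

The coefficient on x:d is therefore the difference between the two group-specific slopes.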

In [ ]:
# Your code here:

Your interpretation here:


Exercise 7 (1 mark)

Professor White is a male professor who has cosmetic surgery that increases his beauty from one standard deviation below average to one standard deviation above average.

Part (a): Beauty values before and after surgery

What are Professor White's beauty values before and after surgery?

In [ ]:
# Your code here:

Part (b): Increase in predicted course evaluation

Using the regression with the interaction term from Exercise 6, what is the predicted increase in Professor White's course evaluation as a result of the surgery?

In [ ]:
# Your code here:

Part (c): 95% confidence interval for the increase

Construct a 95% confidence interval for the predicted increase in Professor White's course evaluation.

In [ ]:
# Your code here:

Exercise 8 (1 mark)

Professor Robinson is a female professor who has the same surgery as Professor White (increasing beauty from one SD below to one SD above average).

Part (a): Increase in predicted course evaluation

What is the predicted increase in Professor Robinson's course evaluation as a result of the surgery?

Hint: For a female professor, the effect of beauty is the sum of the coefficient on beauty and the coefficient on the interaction term.

In [ ]:
# Your code here:

Part (b): 95% confidence interval for the increase

Construct a 95% confidence interval for the predicted increase in Professor Robinson's course evaluation.

Hint: To get the confidence interval for the female effect directly, you can re-run the regression using a male dummy variable (male = 1 - female) in place of female, including in the interaction term. The coefficient on beauty then represents the effect for females, because females are the group with male = 0.
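
The re-parameterisation trick in this hint can be sketched on made-up data as follows. The names y, x, d and m are hypothetical: d plays the role of the original dummy and m = 1 - d is its complement.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data (assumption: NOT the Teaching_Ratings dataset).
rng = np.random.default_rng(1)
n = 300
toy = pd.DataFrame({"x": rng.normal(size=n), "d": rng.integers(0, 2, size=n)})
toy["y"] = (1.0 + 0.5 * toy["x"] + 0.3 * toy["d"]
            - 0.2 * toy["x"] * toy["d"] + rng.normal(scale=0.5, size=n))
toy["m"] = 1 - toy["d"]  # complementary dummy

res_d = smf.ols("y ~ x * d", data=toy).fit()
res_m = smf.ols("y ~ x * m", data=toy).fit()

# Slope of x for the d = 1 group, computed two ways.
slope_d1_sum = res_d.params["x"] + res_d.params["x:d"]
slope_d1_direct = res_m.params["x"]  # the d = 1 group has m = 0
print(slope_d1_sum, slope_d1_direct)

# 95% confidence interval for that slope, read off the second regression.
ci_d1 = res_m.conf_int(alpha=0.05).loc["x"]
print(ci_d1.iloc[0], ci_d1.iloc[1])
```

Both regressions fit the same model, so the two slope values agree. For a given positive change in x, the predicted change in y and the endpoints of its confidence interval are this slope and its interval multiplied by that change.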

In [ ]:
# Your code here:

Exercise 9 (1 mark)

Part (a): Add linear and quadratic age terms

To the regression from Exercise 6, add both age and age squared as additional control variables.

Hint: In statsmodels, you can include a squared term using I(age**2) in the formula.

In [ ]:
# Your code here:

Part (b): Test for a nonlinear effect and for any effect of age

  1. Is there a nonlinear effect of age on course evaluations? (Test whether the coefficient on age-squared is zero)
  2. Is there any effect of age on course evaluations? (Test whether both age coefficients are jointly zero using an F-test)

Hint: You can use the .f_test() method for the joint test. The syntax is: results.f_test('age = I(age ** 2) = 0')
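
A minimal sketch of both tests on made-up data is shown below. The names y, x and yrs are hypothetical stand-ins. The sketch reads the label of the squared term from results.params.index rather than typing it by hand, since the constraint string must match the coefficient labels exactly, including spaces.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data (assumption: NOT the Teaching_Ratings dataset).
rng = np.random.default_rng(2)
n = 400
toy = pd.DataFrame({"x": rng.normal(size=n), "yrs": rng.uniform(30, 70, size=n)})
toy["y"] = (2.0 + 0.4 * toy["x"] + 0.05 * toy["yrs"]
            - 0.0005 * toy["yrs"] ** 2 + rng.normal(scale=0.5, size=n))

res = smf.ols("y ~ x + yrs + I(yrs**2)", data=toy).fit()
print(list(res.params.index))  # check how the squared term is labelled

# Look up the label of the squared term instead of hard-coding it.
sq_name = [name for name in res.params.index if name.startswith("I(")][0]

# Test 1: single coefficient, read the t statistic and p-value directly.
print(res.tvalues[sq_name], res.pvalues[sq_name])

# Test 2: joint hypothesis that both coefficients are zero (F-test).
joint = res.f_test(f"yrs = {sq_name} = 0")
f_stat = float(np.squeeze(joint.fvalue))
f_pval = float(np.squeeze(joint.pvalue))
print(f_stat, f_pval)
```

The joint test has two restrictions, so it can reject even when neither coefficient is individually significant.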

In [ ]:
# Your code here:

Your interpretation here:


Exercise 10: Submission (1 mark)

Submit the following two files on the course's Canvas page:

  1. assignment_2.ipynb — the Jupyter notebook file
  2. assignment_2.html — an HTML export of your notebook

Note: The .ipynb file is required. If you have difficulty creating the HTML file, you will not lose marks for that portion.


Attribution

This exercise is based on Additional Empirical Exercises 4.2, 5.2, 6.1, 7.2, and 8.1 of Stock and Watson, Introduction to Econometrics, 4th Global Edition.