Announcements

Tea with a TA

Hang out with the TAs from STA 199! This is a casual conversation and a fun opportunity to meet the members of the STA 199 teaching team. The only rule is these can’t turn into office hours!

Tea with a TA counts as a statistics experience.

Big data and public policy event on Oct 26

Mid-semester feedback + team work

Questions from the video?

We will be looking at the Paris Paintings data set in today’s application exercise. We’ll primarily focus on the variables:

Click here for the complete codebook.

paintings <- read_csv("data/paris_paintings.csv")

Exercise 1

Create a scatterplot to visualize the relationship between Height and Width. Color the points based on the whether the painting has any landscape elements. Note: Be sure to make landsAll a factor, so R treats it as categorical variable.

Exercise 2

Based on your scatterplot, does the relationship between Height and Width differ between paintings with landscape elements and those without? Briefly explain.

Exercise 3

Fit a main effects model using width and whether the painting has landscape elements to predict the height.

Exercise 4

Using your model from the previous exercise,

  • Interpret the intercept.
  • Interpret the coefficient (slope) of Width_in.
  • Interpret the coefficient (slope) of landsAll.

Exercise 5

Write the equation of the model for

  • paintings with no landscape elements.
  • paintings with landscape elements.

How do the slopes compare between the two models? How do the intercepts compare?

Exericse 6

Now let’s consider an interaction term. Fit a linear model using width, whether the painting has landscape elements, and the interaction between the two variables to predict the height.

Exercise 7

Write the equation of the model for

  • paintings with no landscape elements.
  • paintings with landscape elements.

How do the slopes compare between the two models? How do the intercepts compare?

Exercise 8

Now let’s see how well each model fits the data. Use the glance function to calculate \(R^2\) and Adj. \(R^2\) for the model fit in Ex. 3 and the one fit in Ex. 6.

Exercise 9

Based on your answer to the previous exercise, which model is a better fit for the data? Briefly explain.

Exercise 10

Interpret \(R^2\) for the model chosen in the previous exercise.