TU Wien:Statistik und Wahrscheinlichkeitstheorie UE (Levajkovic)/Übungen 2023W/HW12.5

Regression

A lecture was evaluated. The file Evaluation.Rdata contains data on the students. Two quantities were recorded: first, the points achieved in the associated exercises (between 0 and 200 possible), and second, the exam result (in %). Can the exam result be explained by the points achieved in the exercises?

(a) Plot the result of the exam (y_i) against the exercise points (x_i). Do you observe a relationship?
(b) Compute the intercept b_0 and the slope b_1 of the regression line (without lm()) and plot the regression line. Comment on the meaning of the slope.
(c) Now perform the analysis using the command lm(). Fit the data to the model using the command lm(). Plot the data points and the regression line and discuss the plausibility of the model assumptions. Which exam result would you predict for a student who achieved 140 points in the exercises? Mark the prediction in the plot.


Proposed solution by Lessi

--Lessi 2024-02-07T13:04:11Z

load("./Evaluation.Rdata")

x <- Evaluation$Uebungspunkte
y <- Evaluation$Klausurergebnis

# a) Plot the result of the exam (y_i) against the exercise points (x_i). Do you observe a relationship?

plot(x, y, main="Exam Result against Exercise Points", xlab="Exercise Points", ylab="Exam Result (%)")

# There appears to be a positive relationship, but the points are widely scattered, so the correlation is probably weak.
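
# Sketch, not part of the original solution: the visual impression can be
# quantified with the Pearson correlation. relation_summary() is a hypothetical
# helper name; cor.test() tests H0: rho = 0 and its p-value assumes roughly
# bivariate normal data.
relation_summary <- function(x, y) {
  ct <- cor.test(x, y)  # Pearson correlation by default
  c(r = unname(ct$estimate), p_value = ct$p.value)
}
# Usage: relation_summary(x, y)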

# b) Compute the intercept b0 and the slope b1 of the regression line (without lm()) and plot 
# the regression line. Comment on the meaning of the slope.

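# sample correlation and standard deviations
r <- cor(x, y)

s_x <- sd(x)
s_y <- sd(y)

mean_x <- mean(x)
mean_y <- mean(y)

# least-squares estimates: slope from r, s_y and s_x; intercept from the point of means
b_1 <- r * s_y / s_x
b_0 <- mean_y - b_1 * mean_x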
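
# Cross-check (a sketch, not part of the original solution): the slope can
# equivalently be written as cov(x, y) / var(x), since r = cov(x, y) / (s_x * s_y).
# ols_coefficients() is a hypothetical helper name.
ols_coefficients <- function(x, y) {
  slope <- cov(x, y) / var(x)
  intercept <- mean(y) - slope * mean(x)
  c(b_0 = intercept, b_1 = slope)
}
# Usage: all.equal(unname(ols_coefficients(x, y)), c(b_0, b_1)) should be TRUE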

abline(b_0, b_1)

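# As suspected, the relationship is positive: the slope b_1 is the expected change in the exam result (in percentage points) per additional exercise point.
# The line does not fit the data well, though (very few points are near the line).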

# c) Now perform the analysis using the command lm(). Fit the data to the model using the
# command lm(). Plot the data points and the regression line and discuss the plausibility
# of the model assumptions. Which exam result would you predict for a student who
# achieved 140 points in the exercises? Mark the prediction in the plot.

plot(x, y, main="Exam Result against Exercise Points", xlab="Exercise Points", ylab="Exam Result (%)")

reg <- lm(y ~ x)
reg

abline(reg)


reg$coefficients

y_140 <- sum(reg$coefficients * c(1, 140))
y_140
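
# Alternative (a sketch, not part of the original solution): predict() gives the
# same point prediction and, with interval = "prediction", a 95% prediction
# interval. predict_exam() is a hypothetical helper; it assumes the model was
# fitted as lm(y ~ x), so that the predictor is named x.
predict_exam <- function(fit, points) {
  predict(fit, newdata = data.frame(x = points), interval = "prediction")
}
# Usage: predict_exam(reg, 140)  # matrix with columns fit, lwr, upr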


# mark the prediction and draw dashed guide lines towards both axes
points(140, y_140, pch=19)
segments(140, 0, 140, y_140, lty=2)
segments(0, y_140, 140, y_140, lty=2)
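
# Model assumptions (a sketch, not part of the original solution): the linear
# model assumes a linear mean, constant error variance and, for inference,
# approximately normal errors. Two standard diagnostic plots help to judge this.
# diagnostic_plots() is a hypothetical helper name. It starts new plots, so it
# should only be called after the prediction has been marked in the scatter plot.
diagnostic_plots <- function(fit) {
  res <- resid(fit)
  # residuals vs. fitted values: look for curvature or a funnel shape
  plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals",
       main = "Residuals vs. Fitted")
  abline(h = 0, lty = 2)
  # normal Q-Q plot: points should lie close to the reference line
  qqnorm(res)
  qqline(res)
  invisible(res)
}
# Usage: diagnostic_plots(reg)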