TU Wien:Statistik und Wahrscheinlichkeitstheorie UE (Levajkovic)/Übungen 2023W/HW12.5
- Regression
A lecture was evaluated. The file Evaluation.Rdata contains data on students. Two quantities were recorded: first, the points achieved in the associated exercises (between 0 and 200), and second, the exam result (in %). Can the exam result be explained by the points achieved in the exercises?
- (a) Plot the exam result ($y_i$) against the exercise points ($x_i$). Do you observe a relationship?
- (b) Compute the intercept $b_0$ and the slope $b_1$ of the regression line (without lm()) and plot the regression line. Comment on the meaning of the slope.
- (c) Now perform the analysis using the command lm(). Fit the data to the model using lm(). Plot the data points and the regression line and discuss the plausibility of the model assumptions. Which exam result would you predict for a student who achieved 140 points in the exercises? Mark the prediction in the plot.
Proposed solution by Lessi
--Lessi 2024-02-07T13:04:11Z
load("./Evaluation.Rdata")
x <- Evaluation$Uebungspunkte
y <- Evaluation$Klausurergebnis
# a) Plot the result of the exam (yi) against the exercise points (xi). Do you observe a relationship?
plot(x, y, main="Exam Results against Exercise Points", xlab="Exercise Points", ylab="Exam Result (%)")
# There appears to be a positive relationship, but the points scatter widely, so the correlation is probably weak.
# b) Compute the intercept b0 and the slope b1 of the regression line (without lm()) and plot
# the regression line. Comment on the meaning of the slope.
r <- cor(x, y)
s_x <- sd(x)
s_y <- sd(y)
mean_x <- mean(x)
mean_y <- mean(y)
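b_1 <- r * s_y / s_x          # slope of the least-squares line
b_0 <- mean_y - b_1 * mean_x  # intercept: the line passes through (mean_x, mean_y)
abline(b_0, b_1)              # adds the line to the scatter plot from part a)
# As suspected, the relationship is positive, but the line fits the data poorly (very few points lie near it).
# Meaning of the slope: each additional exercise point changes the predicted exam result by b_1 percentage points.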
# c) Now perform the analysis using the command lm(). Fit the data to the model using the
# command lm(). Plot the data points and the regression line and discuss the plausibility
# of the model assumptions. Which exam result would you predict for a student who
# achieved 140 points in the exercises? Mark the prediction in the plot.
plot(x, y, main="Exam Results against Exercise Points", xlab="Exercise Points", ylab="Exam Result (%)")
reg <- lm(y ~ x)
reg
abline(reg)
reg$coefficients
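# Prediction for x = 140, computed as b_0 * 1 + b_1 * 140
y_140 <- sum(reg$coefficients * c(1, 140))
y_140
# Mark the prediction and draw dashed guide lines to both axes
points(140, y_140, pch=19)
segments(140, 0, 140, y_140, lty=2)
segments(0, y_140, 140, y_140, lty=2)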
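
Part (c) also asks about the plausibility of the model assumptions, which the listing above does not examine. The following is a minimal sketch of how this might be checked, together with a way to obtain the prediction via predict() instead of the manual sum. The helper names predict_exam and plot_diagnostics are hypothetical and not part of the original solution; both assume that x and y are numeric vectors of equal length.

```r
# Hypothetical helper (not from the original solution):
# fit y ~ x by least squares and predict the response at x_new.
predict_exam <- function(x, y, x_new) {
  fit <- lm(y ~ x, data = data.frame(x = x, y = y))
  unname(predict(fit, newdata = data.frame(x = x_new)))
}

# Hypothetical helper: two common residual plots for a fitted lm object.
plot_diagnostics <- function(fit) {
  old_par <- par(mfrow = c(1, 2))   # two panels side by side
  on.exit(par(old_par))             # restore the plot layout afterwards
  # Residuals vs. fitted values: curvature hints at non-linearity,
  # a funnel shape hints at non-constant error variance.
  plot(fitted(fit), resid(fit), xlab = "Fitted values", ylab = "Residuals",
       main = "Residuals vs. Fitted")
  abline(h = 0, lty = 2)
  # Normal Q-Q plot: points close to the line are compatible with
  # normally distributed errors.
  qqnorm(resid(fit))
  qqline(resid(fit))
  invisible(fit)
}
```

With the objects from the listing still in the workspace, predict_exam(x, y, 140) should agree with y_140, and plot_diagnostics(reg) draws both residual plots for the fitted model.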