TU Wien:Statistik und Wahrscheinlichkeitstheorie UE (Levajkovic)/Übungen 2023W/HW08.3

Effect of sample size

Consider the situation of the previous exercise except that three times the number of laymen played the game (for simplicity, literally repeat each sample 3 times). Call this new data set dist3x.

(a) Perform a t-test. What is your conclusion?
(b) Represent the data from dist.Rdata as well as dist3x each in a histogram, arranged below each other (par(mfrow=c(2,1))). Mark the mean, the 1se interval around the mean, as well as the value 550 meters.
(c) Discuss your graphic regarding the outcomes of the tests.

Proposed solution by Lessi

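load("dist.Rdata")          # assumption: provides the vector distanz, as in the previous exercise
dist3x <- rep(distanz, 3)   # repeat the whole sample three times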

# a)

t.test(dist3x, mu = 550, conf.level = 0.95)

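# Repeating the sample leaves the sample mean unchanged; only the standard error of the mean shrinks,
# by roughly a factor of sqrt(3). The t statistic therefore grows by about sqrt(3) and the p-value drops.
# This also shows in the 95% CI, which becomes narrower and no longer includes mu0 = 550.
# Caveat: literally repeated observations are not independent, so the gain in "accuracy" is artificial.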
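The size of the effect in (a) can be worked out exactly. A minimal sketch follows, using simulated stand-in data because the contents of dist.Rdata are not reproduced here; the vector `x`, its length 20, and the parameters of `rnorm` are assumptions made only for illustration. Tripling a sample of size n multiplies the t statistic by sqrt((3n - 1) / (n - 1)), which approaches sqrt(3) for large n.

```r
# Hypothetical stand-in for distanz; the real values live in dist.Rdata.
set.seed(1)
x  <- rnorm(20, mean = 580, sd = 80)
x3 <- rep(x, 3)                      # literal threefold repetition

n  <- length(x)
t1 <- unname(t.test(x,  mu = 550)$statistic)
t3 <- unname(t.test(x3, mu = 550)$statistic)

# The sum of squared deviations triples while the degrees of freedom go from
# n - 1 to 3n - 1, so the SEM shrinks by sqrt((n - 1) / (3n - 1)).
ratio_observed <- t3 / t1
ratio_formula  <- sqrt((3 * n - 1) / (n - 1))

print(c(observed = ratio_observed, formula = ratio_formula))
```

For n = 20 the factor is about 1.76, slightly above sqrt(3), because sd() divides by the degrees of freedom rather than by the sample size.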

# b) Represent the data from dist.Rdata as well as dist3x each in a histogram, arranged below
# each other (par(mfrow=c(2,1))). Mark the mean, the 1se interval around the mean, as
# well as the value 550 meters.

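# Mean and standard error of the mean (SEM) for the original sample ...
n <- length(distanz)
m <- mean(distanz)
sem <- sd(distanz) / sqrt(n)

# ... and for the tripled sample
n3 <- length(dist3x)
m3 <- mean(dist3x)
s3 <- sd(dist3x)
sem3 <- s3 / sqrt(n3)

par(mfrow = c(2, 1))  # stack the two histograms vertically, as the exercise asks
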
# Extend the breaks one bin past the maximum so that they always span the full data range
hist(distanz, breaks = seq(min(distanz), max(distanz) + 20, 20), freq = FALSE)
abline(v=550, col='blue', lwd=2)
abline(v=m, col='red', lwd=2)
arrows(m - sem, 0.004, m + sem, 0.004, code=3, length=0.1, angle=90)
text(520, 0.005, expression(mu[0]), col='blue')
text(610, 0.0049, expression(bar(x)), col='red')
text(640, 0.0043, "sem")

# Extend the breaks one bin past the maximum so that they always span the full data range
hist(dist3x, breaks = seq(min(dist3x), max(dist3x) + 20, 20), freq = FALSE)
abline(v=550, col='blue', lwd=2)
abline(v=m3, col='red', lwd=2)
arrows(m3 - sem3, 0.004, m3 + sem3, 0.004, code=3, length=0.1, angle=90)
text(520, 0.005, expression(mu[0]), col='blue')
text(610, 0.0049, expression(bar(x)), col='red')
text(640, 0.0043, "sem")


# c) Discuss your graphic regarding the outcomes of the tests.
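# Looking at the density histograms, the two data sets look identical: every bin count is tripled,
# but so is the total, so the factor 3 cancels out in the densities.
# The mean is unchanged and the standard deviation changes only marginally. What changes is the standard error
# of the mean, which shrinks by about sqrt(3): the 1se interval around the mean becomes narrower,
# the t statistic becomes larger, and the test now rejects the null hypothesis.
# With genuinely new observations the histogram would generally change as well; here it cannot, because the data are exact copies.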
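The statement that the factor 3 cancels can be checked numerically without drawing anything. The sketch below again uses simulated stand-in data (the vector `x` and its parameters are assumptions, not the values from dist.Rdata) and compares the bin counts and densities that `hist()` computes for the original and the tripled sample.

```r
# Hypothetical stand-in for distanz; the real values live in dist.Rdata.
set.seed(1)
x  <- rnorm(20, mean = 580, sd = 80)
x3 <- rep(x, 3)

# Shared breaks of width 20 that are guaranteed to cover the maximum.
breaks <- seq(min(x), max(x) + 20, by = 20)

h1 <- hist(x,  breaks = breaks, plot = FALSE)
h3 <- hist(x3, breaks = breaks, plot = FALSE)

# Counts are tripled bin by bin, densities (count / (n * bin width)) are equal.
counts_tripled <- all(h3$counts == 3 * h1$counts)
same_density   <- isTRUE(all.equal(h1$density, h3$density))

print(c(counts_tripled = counts_tripled, same_density = same_density))
```

With `freq = TRUE` the two histograms would differ in height by the factor 3, so the choice of `freq = FALSE` is what makes the plots in (b) look the same.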