TU Wien:Statistik und Wahrscheinlichkeitstheorie UE (Bura)/Übungen 2019W/5.4

;Unbiasedness of the empirical variance
Let <math>n \geq 2</math> and <math>X_1, \dots, X_n</math> be i.i.d. (independent and identically distributed) random variables with <math>\sigma^2 := \operatorname{Var}(X_1) < \infty</math>. Calculate the expectation of the empirical variance

:<math>S^2 = \frac{1}{n-1} \sum_{i=1}^n \big(X_i - \bar{X}\big)^2.</math>

What would the expectation have been if in <math>S^2</math> we had scaled with <math>n</math> instead of <math>n - 1</math>?
== Lösungsvorschlag von [[Benutzer:Ikaly|Ikaly]] ==
 
 
 
Copied from: https://en.wikipedia.org/wiki/Bias_of_an_estimator#Examples
 
 
 
===Sample variance===
 
 
The [[sample variance]] of a random variable demonstrates two aspects of estimator bias: first, the naive estimator is biased, which can be corrected by a scale factor; second, the unbiased estimator is not optimal in terms of [[mean squared error]] (MSE), which can be minimized by using a different scale factor, resulting in a biased estimator with lower MSE than the unbiased one. Concretely, the naive estimator sums the squared deviations and divides by ''n'', which is biased. Dividing instead by ''n''&nbsp;−&nbsp;1 yields an unbiased estimator. Conversely, MSE can be minimized by dividing by a different number (depending on the distribution), but this results in a biased estimator. This number is always larger than ''n''&nbsp;−&nbsp;1, so the result is known as a [[shrinkage estimator]], since it "shrinks" the unbiased estimator towards zero; for the normal distribution the optimal value is ''n''&nbsp;+&nbsp;1.
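The trade-off described above can be checked numerically. Below is a minimal Monte Carlo sketch (sample size, trial count, and the standard-normal data are illustrative assumptions, not part of the exercise) that estimates the MSE of dividing the sum of squared deviations by ''n''&nbsp;−&nbsp;1, ''n'', and ''n''&nbsp;+&nbsp;1:

```python
import random

# Illustrative parameters (assumed): N(0, 1) data, n = 5, 200k trials.
random.seed(1)
n, trials, sigma2 = 5, 200_000, 1.0

sq_err = {n - 1: 0.0, n: 0.0, n + 1: 0.0}  # divisor -> accumulated squared error
for _ in range(trials):
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)  # sum of squared deviations
    for d in sq_err:
        sq_err[d] += (ss / d - sigma2) ** 2

mse = {d: total / trials for d, total in sq_err.items()}
for d in sorted(mse):
    print(d, round(mse[d], 3))
```

For normal data the divisor ''n''&nbsp;+&nbsp;1 should come out with the smallest estimated MSE, even though only ''n''&nbsp;−&nbsp;1 gives an unbiased estimator.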
 
 
 
Suppose ''X''<sub>1</sub>, ..., ''X''<sub>''n''</sub> are [[independent and identically distributed]] (i.i.d.) random variables with [[expected value|expectation]] ''μ'' and [[variance]] ''σ''<sup>2</sup>. If the [[sample mean]] and the ''uncorrected'' [[sample variance]] (note that, unlike in the exercise statement, ''S''<sup>2</sup> here divides by ''n'', not ''n''&nbsp;−&nbsp;1) are defined as

:<math>\overline{X} = \frac{1}{n} \sum_{i=1}^n X_i \qquad S^2 = \frac{1}{n} \sum_{i=1}^n \big(X_i-\overline{X}\big)^2</math>
 
 
 
then ''S''<sup>2</sup> is a biased estimator of ''σ''<sup>2</sup>, because
 
:<math>
 
    \begin{align}
 
    \operatorname{E}[S^2]
 
        &= \operatorname{E}\left[ \frac 1 n \sum_{i=1}^n \big(X_i-\overline{X}\big)^2 \right]
 
        = \operatorname{E}\bigg[ \frac 1 n \sum_{i=1}^n \bigg((X_i-\mu)-(\overline{X}-\mu)\bigg)^2 \bigg] \\[8pt]
 
        &= \operatorname{E}\bigg[ \frac 1 n \sum_{i=1}^n \bigg((X_i-\mu)^2 -
 
                                  2(\overline{X}-\mu)(X_i-\mu) +
 
                                  (\overline{X}-\mu)^2\bigg) \bigg] \\[8pt]
 
        &= \operatorname{E}\bigg[ \frac 1 n \sum_{i=1}^n (X_i-\mu)^2 -
 
                                  \frac 2 n (\overline{X}-\mu) \sum_{i=1}^n (X_i-\mu) +
 
                                  \frac 1 n (\overline{X}-\mu)^2 \sum_{i=1}^n 1  \bigg] \\[8pt]
 
        &= \operatorname{E}\bigg[ \frac 1 n \sum_{i=1}^n (X_i-\mu)^2 -
 
                                  \frac 2 n (\overline{X}-\mu)\sum_{i=1}^n (X_i-\mu) +
 
                                  \frac 1 n (\overline{X}-\mu)^2 \cdot n\bigg] \\[8pt]
 
        &= \operatorname{E}\bigg[ \frac 1 n \sum_{i=1}^n (X_i-\mu)^2 -
 
                                  \frac 2 n (\overline{X}-\mu)\sum_{i=1}^n (X_i-\mu) +
 
                                  (\overline{X}-\mu)^2 \bigg] \\[8pt]
 
    \end{align}
 
</math>
 
To continue, we note that by subtracting <math>\mu</math> from both sides of <math>\overline{X}= \frac 1 n \sum_{i=1}^nX_i</math>, we get
 
:<math>
    \overline{X}-\mu = \frac{1}{n} \sum_{i=1}^n X_i - \mu = \frac{1}{n} \sum_{i=1}^n X_i - \frac{1}{n} \sum_{i=1}^n \mu = \frac{1}{n} \sum_{i=1}^n (X_i - \mu).
</math>
 
Equivalently, multiplying both sides by ''n'', <math>n \cdot (\overline{X}-\mu)=\sum_{i=1}^n (X_i-\mu)</math>. Then the previous expression becomes:
 
:<math>
 
    \begin{align}
 
    \operatorname{E}[S^2]
 
        &=  \operatorname{E}\bigg[ \frac 1 n \sum_{i=1}^n (X_i-\mu)^2 -
 
                                  \frac 2 n (\overline{X}-\mu)\sum_{i=1}^n (X_i-\mu) +
 
                                  (\overline{X}-\mu)^2 \bigg]\\[8pt]
 
        &= \operatorname{E}\bigg[ \frac 1 n \sum_{i=1}^n (X_i-\mu)^2 -
 
                                  \frac 2 n (\overline{X}-\mu) \cdot n \cdot (\overline{X}-\mu)+
 
                                  (\overline{X}-\mu)^2 \bigg] \\[8pt]
 
        &= \operatorname{E}\bigg[ \frac 1 n \sum_{i=1}^n (X_i-\mu)^2 -
 
                                  2(\overline{X}-\mu)^2 +
 
                                  (\overline{X}-\mu)^2 \bigg] \\[8pt]
 
        &= \operatorname{E}\bigg[ \frac 1 n \sum_{i=1}^n (X_i-\mu)^2 - (\overline{X}-\mu)^2 \bigg] \\[8pt]
 
        &= \operatorname{E}\bigg[ \frac 1 n \sum_{i=1}^n (X_i-\mu)^2\bigg] - \operatorname{E}\bigg[(\overline{X}-\mu)^2 \bigg] \\[8pt]
 
        &= \sigma^2 - \operatorname{E}\left[ (\overline{X}-\mu)^2 \right]
          = \sigma^2 - \operatorname{Var}\big(\overline{X}\big) \\[8pt]
        &= \sigma^2 - \frac{\sigma^2}{n}
          = \left( 1 -\frac{1}{n}\right) \sigma^2 < \sigma^2.
 
    \end{align}
 
  </math>
Here <math>\operatorname{E}\big[(X_i-\mu)^2\big]=\sigma^2</math> for each <math>i</math>, and <math>\operatorname{Var}(\overline{X}) = \sigma^2/n</math> because the <math>X_i</math> are independent with variance <math>\sigma^2</math>. So the estimator that scales with <math>n</math> has expectation <math>\frac{n-1}{n}\sigma^2</math>, i.e. it is biased. Scaling with <math>\frac{1}{n-1}</math> instead multiplies this by <math>\frac{n}{n-1}</math>, so the empirical variance of the exercise satisfies

:<math>\operatorname{E}\left[\frac{1}{n-1}\sum_{i=1}^n \big(X_i-\bar{X}\big)^2\right] = \frac{n}{n-1}\cdot\frac{n-1}{n}\,\sigma^2 = \sigma^2,</math>

i.e. it is unbiased.
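The two expectations can also be checked with a small simulation. The following sketch (the normal distribution, seed, and parameter values are arbitrary assumptions for illustration) uses Python's <code>statistics</code> module, whose <code>pvariance</code> divides by ''n'' and whose <code>variance</code> divides by ''n''&nbsp;−&nbsp;1:

```python
import random
import statistics

# Illustrative parameters (assumed): N(10, 4) data, so sigma^2 = 4, n = 5.
random.seed(0)
mu, sigma, n, trials = 10.0, 2.0, 5, 200_000

mean_biased = 0.0
mean_unbiased = 0.0
for _ in range(trials):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    mean_biased += statistics.pvariance(xs)   # divides by n
    mean_unbiased += statistics.variance(xs)  # divides by n - 1
mean_biased /= trials
mean_unbiased /= trials

print(mean_biased)    # should be near (1 - 1/n) * sigma**2 = 3.2
print(mean_unbiased)  # should be near sigma**2 = 4.0
```

The averaged <code>pvariance</code> should land near <math>\frac{n-1}{n}\sigma^2</math>, while the averaged <code>variance</code> should land near <math>\sigma^2</math>, matching the derivation above.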
 

Latest revision as of 20:48, 11 November 2019