# TU Wien:Multivariate Statistik VO (Filzmoser)/Multivariate Statistics Possible Exam Questions


## Explain model-based clustering and difficulties that could occur.

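Note: We have to estimate a mean vector and a covariance matrix for each of the ${\displaystyle k}$ multivariate normal components. With unrestricted covariance matrices in ${\displaystyle p}$ dimensions, that is ${\displaystyle k\,p(p+1)/2}$ covariance parameters alone, which can be a lot and can make the estimates unstable. Restricting the covariance structure (for example equal, diagonal, or spherical covariance matrices) makes the model simpler.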
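To get a feeling for how quickly the number of covariance parameters grows, here is a minimal sketch that counts them for a few common restrictions. The function name `covariance_param_count` is hypothetical, and the labels `full`, `tied`, `diag`, `spherical` are borrowed from scikit-learn's `GaussianMixture` naming as an assumption; the lecture may use a different parametrisation.

```python
def covariance_param_count(k: int, p: int, structure: str) -> int:
    """Free covariance parameters for k Gaussian components in p dimensions.

    Hypothetical helper for illustration; means and mixing weights are not counted.
    """
    per_full = p * (p + 1) // 2  # free entries of one symmetric p x p matrix
    counts = {
        "full": k * per_full,   # one unrestricted matrix per cluster
        "tied": per_full,       # one matrix shared by all clusters
        "diag": k * p,          # one diagonal matrix per cluster
        "spherical": k,         # one variance per cluster
    }
    return counts[structure]


for structure in ("full", "tied", "diag", "spherical"):
    print(structure, covariance_param_count(k=5, p=20, structure=structure))
```

With 5 clusters and 20 variables, the unrestricted model already needs 1050 covariance parameters, while the diagonal model needs 100.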

## What is singular value decomposition, how is it defined, and how is it related to PCA? What are the scores in terms of SVD? When would we prefer SVD to spectral decomposition of the covariance (correlation) matrix?

Let ${\displaystyle X}$ be a mean-centered matrix (columns have mean 0).

Then there exists an orthogonal ${\displaystyle (n\times n)}$ matrix ${\displaystyle U}$ and an orthogonal ${\displaystyle (p\times p)}$ matrix ${\displaystyle V}$ such that

${\displaystyle X=UDV^{\top }}$

where ${\displaystyle D}$ is an ${\displaystyle (n\times p)}$ "diagonal" matrix i.e. the only non-zero values are ${\displaystyle d_{ii},i=1,\dots ,\min(n,p)}$. The "diagonal" elements of ${\displaystyle D}$ are called singular values of ${\displaystyle X}$.

We can show that

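${\displaystyle X^{\top }X=VD^{\top }U^{\top }UDV^{\top }=VD^{\top }\mathbf {I} DV^{\top }=VD^{2}V^{\top }}$, where ${\displaystyle D^{2}:=D^{\top }D}$ is the ${\displaystyle (p\times p)}$ diagonal matrix of squared singular values,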

which means the columns of ${\displaystyle V}$ are the eigenvectors of ${\displaystyle X^{\top }X}$, with the diagonal entries of ${\displaystyle D^{2}}$ as the corresponding eigenvalues. Furthermore, the sample covariance matrix is ${\displaystyle S={\frac {1}{n-1}}X^{\top }X}$, because ${\displaystyle X}$ is mean-centered. We know that in PCA,

${\displaystyle S={\hat {\Gamma }}{\hat {A}}{\hat {\Gamma }}^{\top }={\frac {1}{n-1}}X^{\top }X}$

Therefore, ${\displaystyle {\hat {\Gamma }}\equiv V}$ and ${\displaystyle (n-1){\hat {A}}=D^{2}}$. Hence, for the scores we obtain ${\displaystyle Z=(X-\mathbf {0} ){\hat {\Gamma }}=X{\hat {\Gamma }}=XV=UDV^{\top }V=UD}$.
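The relations above can be checked numerically. The following is a minimal sketch using NumPy on simulated data (the data, the seed, and the variable names are assumptions for illustration). It uses the thin SVD (`full_matrices=False`), so `U` is ${\displaystyle (n\times p)}$ rather than ${\displaystyle (n\times n)}$; the scores are the same.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 4
X = rng.normal(size=(n, p))
X = X - X.mean(axis=0)  # mean-center the columns

# Thin SVD: X = U diag(d) V^T
U, d, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt.T

# Spectral decomposition of the sample covariance matrix
S = X.T @ X / (n - 1)
eigvals, eigvecs = np.linalg.eigh(S)  # eigh returns ascending eigenvalues
eigvals = eigvals[::-1]               # reorder to descending, like the singular values

scores_svd = U * d   # equals U @ diag(d)
scores_pca = X @ V   # projection onto the eigenvectors

print(np.allclose(eigvals, d**2 / (n - 1)))   # eigenvalues of S vs. squared singular values
print(np.allclose(scores_svd, scores_pca))    # scores: UD vs. XV
```

Eigenvectors are only determined up to sign, so `eigvecs` and `V` may differ by a factor of -1 per column.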

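SVD is preferable when ${\displaystyle n<p}$, which means we have more features than observations. In that case, the covariance matrix is singular (its rank is at most ${\displaystyle n-1}$), so a decomposition of the ${\displaystyle (p\times p)}$ covariance matrix yields many zero eigenvalues and is costly to compute, whereas the SVD works directly on ${\displaystyle X}$.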
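A small simulated example illustrates the case with more features than observations. This is only a sketch under assumed dimensions (5 observations, 20 variables); it shows that the covariance matrix is rank-deficient while the SVD of the centered data matrix still delivers the scores directly.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 5, 20  # more features than observations
X = rng.normal(size=(n, p))
X = X - X.mean(axis=0)  # mean-center the columns

# The p x p covariance matrix cannot have full rank
S = X.T @ X / (n - 1)
rank_S = np.linalg.matrix_rank(S)

# The thin SVD works on X itself and never forms the p x p matrix
U, d, Vt = np.linalg.svd(X, full_matrices=False)
scores = U * d  # n x n matrix of scores

print(S.shape, rank_S)  # rank is at most n - 1 because of the centering
print(scores.shape)
```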

## How can we define the PCA problem in terms of reconstruction error (Frobenius norm)?

Note: I got this question in my exam. Giving the definition together with a natural-language explanation was enough.
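One common way to state the problem, given here as a hedged sketch rather than the lecture's exact wording: for mean-centered ${\displaystyle X}$ and a chosen number of components ${\displaystyle k<p}$, find a ${\displaystyle (p\times k)}$ matrix ${\displaystyle V_{k}}$ with orthonormal columns that minimises ${\displaystyle \|X-XV_{k}V_{k}^{\top }\|_{F}^{2}}$. The minimiser is given by the first ${\displaystyle k}$ right singular vectors of ${\displaystyle X}$, and the minimal error equals the sum of the squared discarded singular values. The symbols ${\displaystyle V_{k}}$ and ${\displaystyle k}$ are notation assumed for this example. The code below checks this on simulated data.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, k = 100, 6, 2
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))  # correlated columns
X = X - X.mean(axis=0)  # mean-center the columns

U, d, Vt = np.linalg.svd(X, full_matrices=False)
V_k = Vt[:k].T              # first k principal directions (p x k)
X_hat = X @ V_k @ V_k.T     # rank-k reconstruction of X

err_pca = np.linalg.norm(X - X_hat, "fro") ** 2
discarded = np.sum(d[k:] ** 2)  # squared singular values that were dropped

# For comparison: projection onto a random k-dimensional orthonormal basis
Q, _ = np.linalg.qr(rng.normal(size=(p, k)))
err_random = np.linalg.norm(X - X @ Q @ Q.T, "fro") ** 2

print(np.isclose(err_pca, discarded))
print(err_pca <= err_random)
```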

## What happens if there is the same variable in X and Y in canonical correlation?

The first canonical correlation coefficient is 1.
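The reason is that the shared variable is itself a linear combination of the variables in both sets, and it correlates perfectly with itself. A minimal numerical sketch on simulated data follows. The helper `canonical_correlations` is hypothetical; it computes the canonical correlations as the singular values of ${\displaystyle Q_{X}^{\top }Q_{Y}}$, where ${\displaystyle Q_{X}}$ and ${\displaystyle Q_{Y}}$ are orthonormal bases of the centered data matrices, and it assumes both matrices have full column rank.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations via QR and SVD (hypothetical helper).

    Assumes X and Y have full column rank after centering.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)  # orthonormal basis of the column space of Xc
    Qy, _ = np.linalg.qr(Yc)  # orthonormal basis of the column space of Yc
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)


rng = np.random.default_rng(3)
n = 200
shared = rng.normal(size=n)  # the variable that appears in both sets
X = np.column_stack([shared, rng.normal(size=(n, 2))])
Y = np.column_stack([rng.normal(size=n), shared])

rho = canonical_correlations(X, Y)
print(rho)  # the first entry is 1 up to rounding error
```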