TU Wien:Advanced Methods for Regression and Classification VU (Filzmoser)
|Department||Stochastik und Wirtschaftsmathematik|
|Links||tiss:105707, Homepage, Mattermost-Channel|
|Master Data Science||Compulsory module MLS/FD - Machine Learning and Statistics - Foundations|
This lecture is essentially the English (new?) version of "Klassifikation und Diskriminanzanalyse".
Theory of models and methods for regression and classification and how to use them in R.
- Linear Regression
- Model comparison and model selection: Information criteria, etc.
- Linear Methods: OLS (ordinary least squares), PCR, PLS, Shrinkage methods (Ridge, Lasso Regression)
- Linear Methods for classification: Linear and Quadratic Discriminant Analysis (LDA, QDA), Logistic Regression
- Non-linear Methods for regression: Basis expansions (splines), Generalized Additive Models (GAM)
- Tree-based methods for regression and classification, random forests
- Support Vector Machines (SVM) for regression and classification (not much theory here, more applications)
Lectures are held in English. The material is presented from the course notes, with sketches on the blackboard. It covers theory, empirical methods, and implementation in R (relevant packages, functions, code examples). Exercises are to be solved using R.
Some knowledge of regression is definitely nice to have, but regression is explored in depth in all kinds of variants. In general, linear algebra, matrix and vector calculus, and basic probability theory are needed to follow the lecture notes. A certain level of R is definitely required (however, imho, R can be picked up quite fast if you have sufficient programming skills in some other language, say, Python).
The lecture consists of explanations of the lecture notes, which are used as slides and shown with a projector. Sometimes there is extra material and more in-depth explanation with sketches on the blackboard. As usual, (imho) a good mix of theory and models first, then more hands-on material on how to use the methods in R with some concrete data.
Attending the lectures is a good idea, as the methods for the exercises are usually discussed, as well as some of the R functions needed. In the exercise classes, students' solutions are shown and discussed, and seeing these solutions and/or mistakes can be really helpful for learning. Students can volunteer to show their solutions.
WS 2018: 11 exercises, to be programmed in / solved with R. Topics range from regression models to classification.
WS 2019: Exercises are no longer mandatory; they are done in class together with Prof. Filzmoser. The exercises do not count toward the grade; only the oral exam does.
WS 2020: 10 out of 12 exercises have to be handed in and have to score more than 50% of the points.
[WS 18/19] Oral exam, usually 2-3 people together. You get a sheet of paper with 2-3 questions. Everybody gets a question; for one participant, questions start right away, while the other(s) have a little time to prepare.
You should write down the 1-2 most important formulas, such as the cost function / criterion to be minimized, and possibly what the solution looks like. Be able to explain the quantities and variables in the formulas. You should be able to explain how to arrive at the solution. Sketches of what is going on are also usually helpful.
[Please add your experience, if different!]
It has to be emphasized that not being able to write down the relevant formulas will result in a significantly worse grade. For every model, the optimization problem, side conditions (if applicable), solution and other relevant formulas should be known, e.g. the degrees of freedom for smoothing splines.
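As an illustration of the level of detail meant here, the following is a sketch for the smoothing spline case mentioned above. The notation follows The Elements of Statistical Learning and is an assumption; the lecture notes may use different symbols, so check them before relying on this.

```latex
% Assumed notation (ESL, Sec. 5.4): data (x_i, y_i), i = 1..N,
% natural cubic spline basis N_j with knots at the unique x_i.

% Optimization problem (penalized residual sum of squares):
\mathrm{RSS}(f, \lambda) = \sum_{i=1}^{N} \bigl(y_i - f(x_i)\bigr)^2
  + \lambda \int \bigl(f''(t)\bigr)^2 \, dt

% Form of the solution: f(x) = \sum_{j=1}^{N} N_j(x)\,\theta_j, with
\hat{\theta} = \bigl(\mathbf{N}^\top \mathbf{N} + \lambda\,\boldsymbol{\Omega}_N\bigr)^{-1} \mathbf{N}^\top \mathbf{y},
\qquad
\{\mathbf{N}\}_{ij} = N_j(x_i),
\quad
\{\boldsymbol{\Omega}_N\}_{jk} = \int N_j''(t)\, N_k''(t)\, dt

% Fitted values and effective degrees of freedom:
\hat{\mathbf{f}} = \mathbf{N}\bigl(\mathbf{N}^\top \mathbf{N} + \lambda\,\boldsymbol{\Omega}_N\bigr)^{-1} \mathbf{N}^\top \mathbf{y}
  = \mathbf{S}_\lambda \mathbf{y},
\qquad
\mathrm{df}_\lambda = \operatorname{trace}(\mathbf{S}_\lambda)
```

In this sketch, being able to say what each term does (the first term measures fit, the penalty controls curvature, and larger \lambda gives fewer effective degrees of freedom) corresponds to "explaining the quantities" as described above.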
Time until the grade certificate is issued
- 19.12.2019: same day
- 27.02.2020: same day
A few hours for the (almost) weekly exercise assignments. The amount will depend on your previous knowledge of the theoretical material, your experience with R, and whether you attend the lecture or not (which is imho very helpful!).
WS 2019: During the semester, only the time for the lectures (since exercises were not mandatory). Lecture attendance is recommended, as Prof. Filzmoser also explains the ideas behind the methods informally to make them easier to understand. IMO this is especially helpful for the nonlinear part of the lecture (basis expansions, GAM, trees, SVM). For the exam, all the relevant formulas should be known (see Materials). Learning all this might take a week or two, depending on your affinity for linear algebra and the amount of work per day.
Lecture notes are available in German and English.
The Elements of Statistical Learning (Hastie, Tibshirani, Friedman)
An Introduction to Statistical Learning (James, Witten, Hastie, Tibshirani)
Suggestions for improvement / criticism