TU Wien:Machine Learning VU (Musliu)/Exam 2020-06-25
Work time was 90 minutes
1-12 True/False questions
Each correctly answered question was worth 2 points, each wrongly answered question -1, and unanswered questions 0 points.
- MAE is less sensitive to outliers
- Freezing layers means these layers will be fine-tuned in the fine-tuning phase
- Overfitting is more likely on a smaller test set
- Boosting is easily parallelizable
- Paired t-tests are used to verify folds in the holdout method (train/test split)
- Random Forests use Bootstrapping
- ...
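For the MAE statement, a quick numeric check (made-up numbers): a single outlier inflates the squared error far more than the absolute error, which is why MAE counts as less outlier-sensitive.

```python
# One outlier in the targets: MAE grows linearly, MSE quadratically.
preds = [1.0, 2.0, 3.0, 4.0]
clean = [1.0, 2.0, 3.0, 4.0]
dirty = [1.0, 2.0, 3.0, 104.0]  # last target shifted by an outlier

def mae(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# mae: 0.0 -> 25.0, mse: 0.0 -> 2500.0 for the same single outlier
```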
13 Boosting
A "stump" classifier is given, i.e. a 1-level decision tree (it can make only one split). Given was an x-axis with 3 points x1 = 1, x2 = 3, x3 = 5 (or something like that), with associated class labels -1, 1, -1.
- what is the weight of each of the data points before classification?
- let the first stump classifier split into two regions, i.e. draw a decision boundary
- circle the data point that will get a higher weight in the second boosting stage
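The weight bookkeeping can be checked with a short AdaBoost sketch. The data values and the split point x < 2 are assumptions based on the numbers recalled above:

```python
import math

# Toy data as recalled above (values are approximate); labels in {-1, +1}
X = [1, 3, 5]
y = [-1, 1, -1]
N = len(X)

# (a) Before boosting starts, every point gets the uniform weight 1/N
w = [1.0 / N] * N

# (b) One possible first stump: predict -1 left of x = 2, +1 to the right
def stump(x):
    return -1 if x < 2 else 1

preds = [stump(xi) for xi in X]
err = sum(wi for wi, p, t in zip(w, preds, y) if p != t)  # only x3 is wrong -> 1/3

# (c) AdaBoost reweighting: alpha from the error, then up-weight mistakes
alpha = 0.5 * math.log((1 - err) / err)
w = [wi * math.exp(-alpha * t * p) for wi, t, p in zip(w, y, preds)]
Z = sum(w)
w = [wi / Z for wi in w]  # after normalizing, the misclassified x3 carries weight 1/2
```

So the point to circle is the one the first stump misclassifies (here x3): its weight rises from 1/3 to 1/2 while the correctly classified points drop to 1/4 each.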
14 Decision Boundaries
Given a 2D dataset with 2 concentric circles, draw the decision boundaries for
- Perceptron (arbitrary, does not converge)
- Decision Tree
- 1-NN
- Adaboost with a lot of classifiers/iterations
15 Hyperparameter Optimization
Name 3 methods for hyperparameter optimization
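Typical answers are grid search, random search, and Bayesian optimization (also acceptable: evolutionary search, successive halving/Hyperband). A minimal sketch of the first two, on a hypothetical validation-error function:

```python
import random

# Hypothetical validation error over two hyperparameters
# (learning rate, tree depth); minimum at lr = 0.1, depth = 4.
def val_error(lr, depth):
    return (lr - 0.1) ** 2 + 0.01 * (depth - 4) ** 2

# 1) Grid search: exhaustively evaluate every point of a fixed grid
grid = [(lr, d) for lr in (0.01, 0.1, 1.0) for d in (2, 4, 8)]
best_grid = min(grid, key=lambda cfg: val_error(*cfg))

# 2) Random search: sample configurations from given ranges
random.seed(0)
candidates = [(random.uniform(0.001, 1.0), random.randint(1, 10)) for _ in range(20)]
best_random = min(candidates, key=lambda cfg: val_error(*cfg))
```

Bayesian optimization would instead fit a surrogate model (e.g. a Gaussian process) to past evaluations and pick the next configuration via an acquisition function.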
16 Linear Regression
Name two methods to compute coefficients in linear regression
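Two standard answers: the closed-form least-squares solution (normal equations) and iterative gradient descent. A sketch of both on a made-up 1-D dataset:

```python
# Fit y = w0 + w1*x on toy data (exactly y = 1 + 2x), two ways.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
n = len(xs)

# Method 1: closed form (the simple-regression special case of the
# normal equations: w1 = cov(x, y) / var(x), w0 = mean residual)
mx = sum(xs) / n
my = sum(ys) / n
w1 = sum((x - mx) * (yv - my) for x, yv in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
w0 = my - w1 * mx

# Method 2: gradient descent on the mean squared error
a, b = 0.0, 0.0  # intercept, slope
lr = 0.05
for _ in range(5000):
    ga = sum(2 * (a + b * x - yv) for x, yv in zip(xs, ys)) / n
    gb = sum(2 * (a + b * x - yv) * x for x, yv in zip(xs, ys)) / n
    a -= lr * ga
    b -= lr * gb
```

Both recover intercept 1 and slope 2; the closed form is exact in one step, gradient descent converges iteratively.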
17 KNN
Given a 2D dataset with 10 points, compute the average error rate for the dataset with LOOCV (10-fold CV) for
a) 1-NN
b) 3-NN
Dataset looked something like this:
+ + - -
- -
+ + - -
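LOOCV on 10 points is just 10-fold CV with singleton test folds. A sketch with grid coordinates assumed from the layout above (the actual exam coordinates may differ; distance ties are broken by point index):

```python
# '+' = class +1, '-' = class -1; coordinates are an assumption from the sketch
points = [
    (0, 2, +1), (1, 2, +1), (2, 2, -1), (3, 2, -1),
    (1, 1, -1), (2, 1, -1),
    (0, 0, +1), (1, 0, +1), (2, 0, -1), (3, 0, -1),
]

def knn_loocv_error(pts, k):
    """Leave-one-out CV error of k-NN: hold out each point in turn,
    classify it by majority vote of its k nearest remaining neighbors
    (squared Euclidean distance, ties broken by point index)."""
    errors = 0
    for i, (xi, yi, ci) in enumerate(pts):
        rest = [(j, p) for j, p in enumerate(pts) if j != i]
        rest.sort(key=lambda jp: ((jp[1][0] - xi) ** 2 + (jp[1][1] - yi) ** 2, jp[0]))
        vote = sum(p[2] for _, p in rest[:k])
        pred = 1 if vote > 0 else -1
        errors += pred != ci
    return errors / len(pts)

err_1nn = knn_loocv_error(points, 1)  # 3/10 with this assumed layout
err_3nn = knn_loocv_error(points, 3)  # 4/10 with this assumed layout
```

The exact error rates depend on the true exam coordinates and tie-breaking; the point of the exercise is that each held-out point is classified by its neighbors among the other nine.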
18 1R/Bayesian network
Given a dataset with 13 samples, 4 categorical variables (age, education, income, marital status) and target (purchase) Yes/No
a) Learn a 1R rule on samples 1-8, then compute precision and accuracy on samples 9-13. Interestingly, the 1R rule achieved 1.0 accuracy and precision (splitting on age)
b) Propose a Bayesian Net for the full dataset and briefly argue why you chose this net. Compute some conditional probabilities in the net.
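For part a): 1R builds, for each attribute, a rule mapping every attribute value to its majority class, then keeps the attribute whose rule makes the fewest training errors. A sketch on a made-up toy table (not the exam data; attribute names and values are hypothetical):

```python
from collections import Counter, defaultdict

# Hypothetical rows: (age, income, purchase); train/test split as in the exam setup
train = [
    ("young", "low", "No"), ("young", "high", "No"),
    ("mid", "low", "Yes"), ("mid", "high", "Yes"),
    ("old", "low", "No"), ("old", "high", "Yes"),
    ("young", "low", "No"), ("mid", "low", "Yes"),
]
test = [("young", "high", "No"), ("mid", "low", "Yes"),
        ("old", "high", "Yes"), ("old", "low", "No"), ("mid", "high", "Yes")]

ATTRS = {"age": 0, "income": 1}

def one_r(rows, idx):
    """Rule for one attribute: each value -> its majority class
    (Counter.most_common breaks ties by insertion order, Python 3.7+)."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[idx]][row[-1]] += 1
    rule = {v: c.most_common(1)[0][0] for v, c in counts.items()}
    errors = sum(1 for row in rows if rule[row[idx]] != row[-1])
    return rule, errors

# 1R keeps the attribute with the fewest training errors
best_attr = min(ATTRS, key=lambda a: one_r(train, ATTRS[a])[1])
rule, _ = one_r(train, ATTRS[best_attr])

# Evaluate on the held-out rows
preds = [rule[row[ATTRS[best_attr]]] for row in test]
truth = [row[-1] for row in test]
tp = sum(1 for p, t in zip(preds, truth) if p == t == "Yes")
accuracy = sum(p == t for p, t in zip(preds, truth)) / len(test)
precision = tp / max(1, sum(p == "Yes" for p in preds))
```

On this toy table the winning attribute is age, matching the anecdote in the protocol, though the exact precision/accuracy values depend on the real exam data.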