TU Wien:Machine Learning VU (Musliu)/Exam 2021-12-07

True/False:

1. In AdaBoost, weights are initialized uniformly: T (a sketch follows after this list)

2. Categorical data should be normalized before training a k-NN classifier:

3. The error of a 1-NN classifier on the training set is 0:

4. One-vs-all is an approach for solving multi-class problems with decision trees (DTs):

5. Boosting ensembles can be easily parallelized:

6. One-hot encoding is used to transform numerical attributes into categorical ones:

7. The Pearson coefficient has a value range from -1 to 1: T

8. Off-the-shelf is a transfer learning technique that uses the output of layers from a deep-learning architecture as input for a shallow model:

9. SVMs search for a decision boundary with the maximum margin:

10. SVMs always find a better decision boundary (hyperplane) than Perceptrons:

11. In Bayesian Networks we assume that attributes are statistically independent given the class:

12. Majority voting is not used when k-NN is applied for regression:

13. The chain rule does not simplify the calculation of probabilities for Bayesian Networks:

14. Naive Bayes is a lazy learner:

15. The Normal Equation (analytical approach) is always more efficient than gradient descent for linear regression:

16. k-NN is based on the supervised learning paradigm:

17. k-NN is based on the unsupervised learning paradigm:

18. One-vs-all is an approach used by Naive Bayes:
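
For question 1, a minimal NumPy sketch of AdaBoost's weight handling: weights start uniform at 1/N, and misclassified examples are up-weighted after each round. The labels and predictions below are invented for illustration.

```python
import numpy as np

# Illustrative sketch of AdaBoost's weight handling (not a full learner).
n = 8                                 # number of training examples
w = np.ones(n) / n                    # weights start uniform at 1/N

y = np.array([1, 1, -1, 1, -1, -1, 1, -1])     # true labels (made up)
pred = np.array([1, -1, -1, 1, -1, 1, 1, -1])  # a weak learner's predictions

err = np.sum(w[pred != y])               # weighted error of the weak learner
alpha = 0.5 * np.log((1 - err) / err)    # weight of this weak learner

w *= np.exp(-alpha * y * pred)           # up-weight misclassified examples
w /= w.sum()                             # renormalize to a distribution
print(w)                                 # indices 1 and 5 now carry more weight
```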



Free Text:

Compare the Perceptron and SVM algorithms: common characteristics and differences.

What are features in metalearning? What are landmarking features?

How can you learn the structure of a Bayesian Network?

Explain how to deal with missing values and the zero-frequency problem in Naive Bayes.
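
A minimal sketch of the two standard fixes, with invented counts: Laplace (add-one) smoothing against zero frequencies, and simply omitting a missing attribute from the product of conditional probabilities at prediction time.

```python
# counts[c][v]: how often attribute value v occurred with class c (made up).
counts = {"yes": {"sunny": 0, "rainy": 3}, "no": {"sunny": 2, "rainy": 1}}
k = 2  # number of distinct values of this attribute

def smoothed_p(value, cls):
    # Laplace smoothing: without it, P(sunny | yes) = 0 would wipe out
    # the entire product of probabilities for class "yes".
    total = sum(counts[cls].values())
    return (counts[cls][value] + 1) / (total + k)

print(smoothed_p("sunny", "yes"))  # (0 + 1) / (3 + 2) = 0.2 instead of 0

# Missing values: at prediction time, an attribute whose value is missing is
# simply left out of the product (it is not treated as an extra value).
```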

A max-pooling exercise (exact values not recalled); the operation is sketched below.
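
A sketch of what such an exercise looks like, assuming the usual 2x2 max pooling with stride 2; the input values are made up, not the exam's.

```python
import numpy as np

x = np.array([[1, 3, 2, 0],
              [4, 2, 1, 1],
              [0, 1, 5, 2],
              [2, 2, 3, 4]])

# Take the maximum of each non-overlapping 2x2 window.
pooled = np.array([[x[i:i+2, j:j+2].max() for j in range(0, 4, 2)]
                   for i in range(0, 4, 2)])
print(pooled)  # [[4 2]
               #  [2 5]]
```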

Difference between micro- and macro-averaged evaluation measures.
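
A small worked example with invented predictions: macro-averaging gives every class equal weight, while micro-averaging pools all decisions first, so frequent classes dominate.

```python
import numpy as np

y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2])  # made-up labels
y_pred = np.array([0, 0, 0, 0, 0, 1, 1, 2, 2])

classes = [0, 1, 2]
tp = np.array([np.sum((y_pred == c) & (y_true == c)) for c in classes])
fp = np.array([np.sum((y_pred == c) & (y_true != c)) for c in classes])

macro = np.mean(tp / (tp + fp))            # average of per-class precisions
micro = tp.sum() / (tp.sum() + fp.sum())   # precision over all pooled decisions
print(macro, micro)                        # 0.667 vs. 0.778
```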

Describe a local search algorithm for Bayesian Network structure learning.
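
A hedged sketch of one such local search: score-based hill climbing over edge insertions/removals with a BIC-style penalized log-likelihood score for binary variables. The data is random and the move set is simplified (the lecture's variant may, for example, also include edge reversal as a move).

```python
import numpy as np
from itertools import permutations, product

rng = np.random.default_rng(0)
data = rng.integers(0, 2, size=(200, 3))  # 200 samples, 3 binary variables
n, d = data.shape

def is_acyclic(parents):
    # Kahn-style check: repeatedly peel off nodes with no remaining parents.
    remaining = set(range(d))
    while remaining:
        free = [v for v in remaining if not (parents[v] & remaining)]
        if not free:
            return False  # a cycle remains
        remaining -= set(free)
    return True

def bic(parents):
    score = 0.0
    for v in range(d):
        ps = sorted(parents[v])
        for config in product([0, 1], repeat=len(ps)):
            rows = np.all(data[:, ps] == config, axis=1) if ps else np.ones(n, bool)
            n_cfg = rows.sum()
            n1 = data[rows, v].sum()
            for cnt in (n1, n_cfg - n1):          # maximum-likelihood log terms
                if cnt > 0:
                    score += cnt * np.log(cnt / n_cfg)
        score -= 0.5 * np.log(n) * 2 ** len(ps)   # BIC complexity penalty
    return score

parents = {v: set() for v in range(d)}  # start from the empty graph
improved = True
while improved:
    improved = False
    for u, v in permutations(range(d), 2):  # toggle each possible edge u -> v
        cand = {w: set(s) for w, s in parents.items()}
        cand[v].symmetric_difference_update({u})
        if is_acyclic(cand) and bic(cand) > bic(parents) + 1e-9:
            parents, improved = cand, True

print(parents)  # learned parent sets (likely all empty for random data)
```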

Explain polynomial regression; name advantages and disadvantages compared to linear regression.
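
A sketch showing that polynomial regression is just ordinary least squares on expanded features (powers of x); the cubic data below is synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 50)
y = 0.5 * x**3 - x + rng.normal(0, 2, size=x.size)  # noisy cubic (made up)

X = np.vander(x, N=4)                      # feature columns: x^3, x^2, x, 1
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coeffs)                              # roughly [0.5, 0, -1, 0]

# Advantage over linear regression: fits non-linear relationships.
# Disadvantage: high degrees overfit and extrapolate badly.
```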

What is the difference between Lasso and Ridge regression?
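
A sketch of the practical difference on synthetic data, using scikit-learn (assumed available): Lasso's L1 penalty drives irrelevant coefficients to exactly zero, while Ridge's L2 penalty only shrinks them.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))
y = 3 * X[:, 0] + rng.normal(0, 0.1, size=100)  # only feature 0 matters

print(Lasso(alpha=0.1).fit(X, y).coef_)  # irrelevant weights become exactly 0
print(Ridge(alpha=0.1).fit(X, y).coef_)  # irrelevant weights small but non-zero
```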

Which data preparation/preprocessing steps were mentioned in the lecture? Describe them.
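
A sketch of three steps that are commonly covered (imputation of missing values, scaling, one-hot encoding), using scikit-learn; whether the lecture mentioned exactly these steps is not recorded on this page.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

num = np.array([[1.0], [2.0], [np.nan], [4.0]])          # numeric column, one gap
cat = np.array([["red"], ["blue"], ["red"], ["green"]])  # categorical column

num = SimpleImputer(strategy="mean").fit_transform(num)  # fill missing with mean
num = StandardScaler().fit_transform(num)                # zero mean, unit variance
onehot = OneHotEncoder().fit_transform(cat).toarray()    # categorical -> binary

print(num.ravel())
print(onehot)
```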

Explain the No Free Lunch theorem.

A basic linear filtering (convolution) exercise: apply the 2x2 kernel [[1, 0], [0, 1]] to 4x4 data; sketched below.
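
A worked sketch of the exercise, assuming it meant a valid-mode convolution with stride 1; the 4x4 input values are made up.

```python
import numpy as np

x = np.array([[1, 2, 0, 1],
              [3, 1, 2, 2],
              [0, 2, 1, 0],
              [1, 1, 0, 3]])
k = np.array([[1, 0],
              [0, 1]])

# Slide the 2x2 kernel over every 2x2 window of the 4x4 input -> 3x3 output.
out = np.array([[np.sum(x[i:i+2, j:j+2] * k) for j in range(3)]
                for i in range(3)])
print(out)  # with this kernel, each entry is x[i, j] + x[i+1, j+1]
```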