TU Wien:Machine Learning VU (Musliu)/Exam-2022-06-30


- K-armed bandit problem: a sequence of 5 actions and rewards was given (e.g. A1 = 2, R1 = 3, A2 = 1, R2 = -1, ...). You had to say at which time steps the action was definitely chosen randomly and at which time steps it was possibly chosen randomly.
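A minimal sketch of how such a question can be checked, assuming an epsilon-greedy agent with sample-average value estimates that all start at 0 (the exam text is not reproduced here, so these are assumptions). An action that is not greedy under the current estimates must have been an exploration step; a greedy action may still have been picked at random. The function name `classify_steps`, the number of arms (4) and all values after A2/R2 are hypothetical.

```python
def classify_steps(actions, rewards, n_arms):
    """Split time steps into 'definitely random' and 'possibly random'.

    Assumes epsilon-greedy selection, sample-average estimates and
    initial estimates of 0 for every arm (arms are numbered from 1).
    """
    q = [0.0] * n_arms   # current value estimate per arm
    n = [0] * n_arms     # number of pulls per arm
    definitely, possibly = [], []
    for t, (a, r) in enumerate(zip(actions, rewards), start=1):
        best = max(q)
        greedy = {i + 1 for i, v in enumerate(q) if v == best}
        if a in greedy:
            # Greedy choice: exploration could still have landed here by chance.
            possibly.append(t)
        else:
            # Non-greedy choice: only possible through a random step.
            definitely.append(t)
        # Incremental sample-average update for the chosen arm.
        n[a - 1] += 1
        q[a - 1] += (r - q[a - 1]) / n[a - 1]
    return definitely, possibly


# Hypothetical sequence; only A1, R1, A2, R2 follow the example above.
actions = [2, 1, 2, 3, 2]
rewards = [3, -1, -2, 1, 0]
print(classify_steps(actions, rewards, n_arms=4))  # ([2, 4, 5], [1, 3])
```

With these made-up values, steps 2, 4 and 5 are definitely random, while steps 1 and 3 are only possibly random.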


- GD (gradient descent) problem: there were 2 features and 1 target (5 samples). The values below are examples; I don't remember the exact ones.

f1  f2  t
1   3   12
2   5   9
..  ..  ..
..  ..  ..


The coefficients w0, w1 and w2 had to be calculated using the RSS (residual sum of squares) as the loss. A learning rate was also given (a = 0.5). If all w are 0 in the first step, what will w1 be in the second step?
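A minimal sketch of one gradient descent update, assuming the model y = w0 + w1*f1 + w2*f2 and RSS = sum((y - t)^2) without any 1/2 or 1/n factor (if the lecture uses a scaled loss, the gradient changes by that factor). The function name `gd_step` is hypothetical, and only the two example rows from above are used, so the numbers are not the exam's.

```python
def gd_step(X, t, w, lr):
    """One batch gradient descent step on RSS for a linear model.

    X: list of feature rows, t: list of targets,
    w: [w0, w1, ..., wd] with w0 as intercept, lr: learning rate.
    Assumes RSS = sum((prediction - target)^2), unscaled.
    """
    preds = [w[0] + sum(wj * xj for wj, xj in zip(w[1:], row)) for row in X]
    errors = [p - ti for p, ti in zip(preds, t)]
    # d RSS / d w0 = 2 * sum(errors); d RSS / d wj = 2 * sum(errors * x_j)
    grad = [2 * sum(errors)]
    for j in range(len(X[0])):
        grad.append(2 * sum(e * row[j] for e, row in zip(errors, X)))
    return [wj - lr * gj for wj, gj in zip(w, grad)]


# Only the two example rows from the notes (not the real exam data).
X = [[1, 3], [2, 5]]
t = [12, 9]
w_next = gd_step(X, t, w=[0.0, 0.0, 0.0], lr=0.5)
print(w_next)  # [21.0, 30.0, 81.0]
```

With all weights at 0 the predictions are 0, so the errors are just -t. For these two rows the gradient for w1 is 2 * (-12*1 - 9*2) = -60, which gives w1 = 0 - 0.5 * (-60) = 30 after the first update.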