TU Wien:Experiment Design for Data Science VU (Knees)/Exam 23.01.2019 - Group B
Exam Group B, 23.01.20
Reproducibility
- What is PRIMAD? Name the components and explain what "priming" each of them achieves.
- Which issues are there in reproducibility (from a programming perspective)?
Trust in AI/ML
- What are the benefits of the automation of decision making?
- What are the issues of the automation of decision making?
- What are the benefits of "black-box" algorithms?
- What are the drawbacks of "black-box" algorithms?
WEKA / CV, statistical significance
The data is split into train and test sets in exactly the same way for two algorithms (classification of patents), and this is repeated 20 times. A simple WEKA workflow was given: loading the data, standardizing, CV, KNN, averaging results, outputting results.
- What is CV? Explain it with a figure for k=4.
- What is leave-one-out CV? Explain the benefits/drawbacks.
- What is wrong with the WEKA workflow? (standardization before CV)
- Which performance measures can be used to measure the performance of the algorithms? Which one makes the most sense? Explain the measure (formula).
- Which statistical test can be used to compare the algorithms / test the significance? Explain why it can be used here.
- Which types of errors are there? What are they? How can they be reduced/prevented?
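
The hint on the workflow question (standardization before CV) can be illustrated with a small sketch. This is not the WEKA workflow from the exam, only a plain-Python illustration under assumptions: the function names `k_fold_indices`, `cv_splits` and `standardize_within_fold` and the toy feature values are made up here. It splits 8 samples into k=4 folds and fits the standardization parameters on the training folds only, so that no information from the test fold leaks into preprocessing.

```python
from statistics import mean, pstdev

def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cv_splits(n_samples, k):
    """Yield (train_indices, test_indices), using each fold as the test set once."""
    folds = k_fold_indices(n_samples, k)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def standardize_within_fold(values, train, test):
    """Fit mean and standard deviation on the training part only, apply to both parts."""
    train_values = [values[i] for i in train]
    mu = mean(train_values)
    sigma = pstdev(train_values) or 1.0  # guard against zero variance
    train_z = [(values[i] - mu) / sigma for i in train]
    test_z = [(values[i] - mu) / sigma for i in test]
    return train_z, test_z

# Toy single-feature data, made up for illustration only
feature = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

for train, test in cv_splits(len(feature), k=4):
    train_z, test_z = standardize_within_fold(feature, train, test)
    print("test fold:", test, "train size:", len(train))
```

Standardizing the whole dataset once before the split would instead compute the mean and standard deviation over samples that later serve as test data, which is the flaw the question points at. With k equal to the number of samples, the same `cv_splits` function yields leave-one-out CV.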