Exam 2022-01-13

The exam consisted of two parts and was conducted online in TUWEL because of the COVID-19 pandemic.

To attend the exam, you had to join a Zoom meeting with two devices, one filming from the front and one from the side/back. Students were let into the meeting one at a time.

Part 1: 20 Single Choice (true/false) questions

20 true/false questions drawn from a question pool covering all slides. Working time: 20 minutes. Correct answer: +2 points, wrong answer: -1 point, no answer: 0 points.
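
As a quick check of the marking scheme above, the following Python sketch computes a Part 1 score and the expected value of a blind guess. The function name `part1_score` and the example numbers are made up for illustration; only the +2 / -1 / 0 rule comes from the exam.

```python
def part1_score(correct: int, wrong: int, unanswered: int = 0) -> int:
    """Score for Part 1: +2 per correct, -1 per wrong, 0 per unanswered question."""
    if correct + wrong + unanswered != 20:
        raise ValueError("Part 1 has exactly 20 questions")
    return 2 * correct - wrong

# Expected points for a blind 50/50 guess on a single true/false question
expected_guess = 0.5 * 2 + 0.5 * (-1)

print(part1_score(correct=14, wrong=4, unanswered=2))  # 24
print(expected_guess)  # 0.5
```

Under this scheme a pure guess has a slightly positive expected value, so leaving a question blank only pays off if you believe your guess is worse than a coin flip.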

The soft margin parameter of SVM controls the error on the training set.
In the DWH reference architecture, the Staging Area is a database that stores a single data extract of a source database.
The Staging Area in the DWH reference architecture is a database that stores a single data extract of a source database.
The age of business intelligence started in 2010.
Business intelligence comprises the processes, technologies, and tools needed to turn data into information, information into knowledge, and knowledge into plans that drive profitable business action.
Approaches for Information Integration - Mediator means everybody talks directly to everyone else.
Approaches for Information Integration - Federation connects multiple (heterogeneous) data sources.
A data warehouse is a subject-oriented, integrated, time-variant, nonvolatile collection of data.
You choose the model with minimal training error for deployment.
Hive allows real-time queries and has low latency.
The Kappa architecture is more complicated than the Lambda architecture.
For data from a DWH you do not need data preparation and data exploration.
F-score is a weighted score of Precision and Recall.
ETL extraction monitoring strategies are trigger-based, replication-based, timestamp-based, and snapshot-based.
Even a fully-grown decision tree can have impure leaf nodes.
An advantage of Master-Slave Replication is high read-performance.

Part 2: 4 Open questions

Working time: 25 minutes for 4 open questions selected from a question pool. That leaves 6.25 minutes per question on average, which is extremely short even for a TU exam.

Explain the main differences between the star schema and the snowflake schema.
Explain Binning and 1-to-n encoding. Describe (a) when it is applied, (b) how it is applied, (c) give an example.
Explain the steps of the k-means algorithm. Explain the problem with initial centroid selection.
Explain the three types of analytics.
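
For the k-means question, a minimal 1-D sketch of the usual loop (assignment step, update step, repeat until nothing changes) may help when revising. It is written in Python as an assumption, since the exam names no language; the function name `kmeans_1d` and the toy data are invented, and this is not the lecture's reference implementation.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Lloyd-style k-means on 1-D data with explicitly given initial centroids."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]  # an empty cluster stays where it is
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged: no centroid moved
            break
        centroids = new_centroids
    return centroids, clusters

points = [0, 1, 10, 11, 20, 21]  # three obvious groups (toy data)

good, _ = kmeans_1d(points, [0, 10, 20])  # one initial centroid per group
bad, _ = kmeans_1d(points, [0, 1, 10])    # two initial centroids in the same group

print(good)  # [0.5, 10.5, 20.5]
print(bad)   # [0.0, 1.0, 15.5]
```

The second run converges as well, but to a worse solution that splits one group and merges the other two. This is the initial centroid problem: the result depends on the starting points, which is why k-means is commonly restarted several times with different initializations.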

Other possible questions (a brain dump from other students, so some questions repeat):

defining precision, recall, micro/macro and explaining their differences
defining business goal / success, mining goal / success and providing an example
3 analytics from DWH
agile business intelligence
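
For the recurring precision/recall question, the difference between micro and macro averaging can be sketched in a few lines of Python. The function name `precision_recall` and the toy labels are assumptions for illustration, and the code assumes single-label classification where every class is both predicted and present at least once.

```python
from collections import Counter

def precision_recall(y_true, y_pred):
    """Micro- and macro-averaged precision/recall for single-label classification."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted, but wrongly
            fn[t] += 1  # true class t was missed
    classes = sorted(set(y_true) | set(y_pred))
    # Macro: average the per-class scores, so every class counts equally
    macro_p = sum(tp[c] / (tp[c] + fp[c]) for c in classes) / len(classes)
    macro_r = sum(tp[c] / (tp[c] + fn[c]) for c in classes) / len(classes)
    # Micro: pool the counts first, so every instance counts equally
    tp_all, fp_all, fn_all = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro_p = tp_all / (tp_all + fp_all)
    micro_r = tp_all / (tp_all + fn_all)
    return {"macro_p": macro_p, "macro_r": macro_r, "micro_p": micro_p, "micro_r": micro_r}

# Toy data: 10 instances of class "a" (9 found), 2 of class "b" (1 found)
y_true = ["a"] * 10 + ["b"] * 2
y_pred = ["a"] * 9 + ["b"] + ["b"] + ["a"]

scores = precision_recall(y_true, y_pred)
print(f"macro precision {scores['macro_p']:.3f}, micro precision {scores['micro_p']:.3f}")
# macro precision 0.700, micro precision 0.833
```

The macro average is pulled down by the small, poorly classified class "b", while the micro average is dominated by the large class "a". In this single-label setting micro precision and micro recall coincide.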

defining precision, recall, micro/macro and explaining their differences
What is a Lazy Learner? Give an example of a lazy learner. When is a lazy learner useful?
What are the characteristics/requirements of OLTP Operational Systems?
What is the difference between OLTP and OLAP?

Discuss 2 different forms of heterogeneity
2 different scaling approaches, their characteristics and benefits

two ways of doing integration in data warehousing
k-means: algo and robustness wrt initialization
precision recall micro macro, formula and comparison
FASMI

discuss 2 types of scaling in data preprocessing
describe training, test, and validation data and what they are used for
describe three operations in MOLAP
pros and cons of the Kimball approach ("bottom-down", this is how the professor wrote it)
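
For the question on scaling in data preprocessing, the sketch below assumes that the two approaches meant are min-max normalization and z-score standardization; the brain dump does not say which ones the lecture expects. Function names and data are illustrative, and both functions assume the values are not all identical.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale linearly to [0, 1]; sensitive to outliers, keeps the shape of the data."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score_scale(values):
    """Shift to mean 0 and scale to standard deviation 1 (population std. dev.)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

ages = [20, 30, 40, 50, 60]  # toy data
print(min_max_scale(ages))  # [0.0, 0.25, 0.5, 0.75, 1.0]
print([round(z, 2) for z in z_score_scale(ages)])
```

Min-max scaling gives a fixed value range, which suits distance-based methods when no extreme outliers are present; z-score scaling has no fixed range but is less distorted by a single extreme value.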

defining precision, recall, micro/macro and explaining their differences
1-to-n and binning
heterogeneity concepts
definition for DWH and 3 examples

explain Lazy learning + example algorithm + example scenario
2 scaling approaches explained + when they are useful and when they are not
3 analytics from DWH
advantages & Disadvantages of Kimball approach

Difference between OLTP and OLAP and their main requirements
Difference between Kimball model and Inmon model
Describe two types of scaling

types of scaling
binning and 1-n encoding
Solutions for information integration heterogeneity
Snowflake vs star schema
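
For the binning and 1-to-n encoding question, a small Python sketch of "how it is applied" follows. It assumes equal-width binning (other strategies such as equal-frequency binning exist) and treats 1-to-n encoding as what is often called one-hot encoding; function names and data are invented.

```python
def equal_width_bin(value, lo, hi, n_bins):
    """Binning: map a numeric value in [lo, hi] to a bin index 0..n_bins-1."""
    width = (hi - lo) / n_bins
    return min(int((value - lo) / width), n_bins - 1)  # put hi itself into the last bin

def one_to_n(value, categories):
    """1-to-n encoding: one 0/1 column per category, exactly one of them set."""
    return [1 if value == c else 0 for c in categories]

ages = [18, 25, 47, 80]  # numeric attribute -> ordinal bins
print([equal_width_bin(a, lo=0, hi=80, n_bins=4) for a in ages])  # [0, 1, 2, 3]

colors = ["red", "green", "blue"]  # categorical attribute -> numeric columns
print(one_to_n("green", colors))  # [0, 1, 0]
```

Binning turns a numeric attribute into a categorical or ordinal one, for example for algorithms that expect discrete values, while 1-to-n encoding goes the other way and makes a categorical attribute usable by algorithms that need numeric input.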

describe training, validation, test data
describe 3 operations on multi dimensional data
what is a lazy learner + example + when to use it
definition of a DWH

Describe metadata component in a DWH

Inmon / Kimball
explain MapReduce
1 to n binning and when to apply

TU Wien:Business Intelligence VU (Tjoa)/Exam 2022-01-13
