TU Wien:Business Intelligence VU (Tjoa)/Exam 2021-01-25

The exam consisted of two parts and was conducted in TUWEL.

To attend the exam, you had to join a Zoom meeting with two devices, one filming from the front and one from the side/back. Students were let into the meeting one at a time. Between the parts, there was a short break (bathroom allowed).

The exact grading scheme for the exam and scaling were unknown at the time of taking the exam. Results were expected at the end of January/beginning of February.

Part 1: SC (True-False)

20 true/false questions drawn from a question pool covering all slides. Work time: 20 minutes. Correct answer: +2 points, wrong answer: -1 point, no answer: 0 points.
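The scoring rule above can be sketched in a few lines of Python. This is only an illustration of the stated point values; the function names are made up for this example, and since the exact grading scheme was unknown, it is an assumption that the points are simply summed without a floor at zero or further scaling.

```python
def part1_score(correct: int, wrong: int, total: int = 20) -> int:
    """Raw score for the true/false part: +2 per correct, -1 per wrong, 0 per blank.

    Assumption: points are summed as-is (no floor at zero, no scaling).
    """
    if correct < 0 or wrong < 0 or correct + wrong > total:
        raise ValueError("correct + wrong must be between 0 and total")
    return 2 * correct - wrong


def expected_points_per_answer(p_correct: float) -> float:
    """Expected points for answering one question with success probability p_correct."""
    return 2 * p_correct - (1 - p_correct)


print(part1_score(correct=12, wrong=5))   # 3 questions left blank
print(expected_points_per_answer(0.5))    # pure coin-flip guess
```

Under these assumptions, answering pays off on average as soon as the chance of being right exceeds 1/3, so even a coin-flip guess has a positive expected value.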

K-means is extremely robust against outliers in the data
Carefully pruned decision trees usually show higher precision on the training data than un-pruned decision trees
Lazy Learning is not recommended when there is high drift in data space, leading to changing decision boundaries
The k-NN classifier using Euclidean distance is computationally more expensive at the model building stage than a Decision Tree using simple error counts as splitting criterion.
Ordinal data allows distances to be computed between data points
Random sampling of time series data for classifier training may lead to an overestimation of model performance
1-to-N coding (one-hot encoding) reduces the dimensionality of the feature space
CRISP-DM: Business Success Criteria are ideally specified as subjective measures and Data Mining Success Criteria should be specified as objective measures
Zero-mean unit variance normalization is highly sensitive to outliers in the data
Lazy learning is more time-efficient at the classification stage
According to the Data Warehousing Institute (TDWI) working definition, Business Intelligence encompasses analytic tools
In Hadoop, applications are typically written in high-level code such as Java
In Hadoop, processing is coordinated through MapReduce.
In the context of the DWH reference architecture, the Metadata Component stores operational metadata, extraction and transformation metadata, and end-user metadata.
The Staging Area in the DWH reference architecture is a database that stores a single data extract of a source database.
[DWH] In a typical Lambda architecture, queries can be answered by merging results from batch and real-time views.
Data silos hold data for individual sets of applications or organizational units.
Big advantages of a snowflake schema include that the schema becomes more intuitive and browsing through the content is easy
In DWH, the concept "warehouse" supports bi-directional data flows between related data sources.
In the context of DWH analytics, predictive analytics focuses on investigating past effects to capture relevant information.

Part 2: Open Questions

Work time: 25 minutes.

The questions were selected from a question pool: two from the Data Warehousing part and two from the Data Mining part, so 4 questions in total. That leaves an average of 6.25 minutes per question, which is extremely short even for a TU exam.

Example questions (paraphrased):

Data Warehousing

Name three requirements of operational systems using OLTP and describe each of them briefly.
Describe the 3 steps of the MapReduce process flow.

Data Mining

Name 4 types of attributes used in data mining, give an example of each, and describe their characteristics and the allowed mathematical operations.
Describe Single Linkage and Complete Linkage: how do the algorithms work and what are their characteristics?
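
As a study aid for the linkage question, here is a minimal Python sketch of the two inter-cluster distances. It is not the model answer from the lecture: the function names and the sample points are made up for illustration. Single linkage takes the smallest pairwise distance between two clusters and complete linkage the largest. In agglomerative clustering, the two clusters with the smallest linkage distance are merged at each step.

```python
from itertools import product
from math import dist  # Euclidean distance, available since Python 3.8


def single_linkage(cluster_a, cluster_b):
    """Distance between the two closest points of the clusters."""
    return min(dist(p, q) for p, q in product(cluster_a, cluster_b))


def complete_linkage(cluster_a, cluster_b):
    """Distance between the two farthest points of the clusters."""
    return max(dist(p, q) for p, q in product(cluster_a, cluster_b))


# Hypothetical toy data: two clusters on the x-axis.
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]

print(single_linkage(cluster_a, cluster_b))    # 3.0 (from (1, 0) to (4, 0))
print(complete_linkage(cluster_a, cluster_b))  # 6.0 (from (0, 0) to (6, 0))
```

Because single linkage depends only on the closest pair, it tends to chain clusters together into elongated shapes, while complete linkage favours compact clusters and is more sensitive to outliers.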
