TU Wien:Business Intelligence VU (Tjoa)/Exam 2022-01-13
Exam 2022-01-13
The exam consisted of two parts and was conducted in TUWEL because of the COVID-19 pandemic.
To attend the exam, you had to join a Zoom meeting with two devices, one filming from the front and one from the side/back. Students were let into the meeting one at a time.
Part 1: 20 Single Choice (true/false) questions
20 true/false questions from a question pool covering all slides. Work time: 20 minutes. Correct answer: +2 points, wrong answer: -1 point, no answer: 0 points.
- The soft margin parameter of SVM controls the error on the training set.
- In the DWH reference architecture, the Staging Area is a database that stores a single data extract of a source database.
- The Staging Area in the DWH reference architecture is a database that stores a single data extract of a source database.
- Age of business intelligence starts at 2010.
- The processes, technologies, and tools needed to turn data into information, information into knowledge, and knowledge into plans that drive profitable business action.
- Approaches for Information Integration - Mediator means everybody talks directly to everyone else.
- Approaches for Information Integration - Federation connects multiple (heterogeneous) data sources.
- A data warehouse is a subject-oriented, integrated, time-variant, nonvolatile collection of data.
- You choose a model for deployment that has minimal training error.
- Hive allows real-time queries and has low latency.
- The Kappa architecture is more complicated than the Lambda architecture.
- For data from a DWH you do not need data preparation and data exploration.
- F-score is a weighted score of Precision and Recall.
- ETL extraction monitoring strategies are Trigger-based, replication-based, timestamp-based, snapshot-based.
- Even a fully-grown decision tree can have impure leaf nodes.
- An advantage of Master-Slave Replication is high read-performance.
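The Part 1 scoring rule (+2 / -1 / 0) can be written out as a short sketch. This is only an illustration of the arithmetic stated above, not anything handed out at the exam; the function names `part1_score` and `expected_guess_value` are made up for this example. It also shows that, under this rule, guessing only pays off on average if you expect to be right more than one third of the time.

```python
def part1_score(correct: int, wrong: int, unanswered: int = 0) -> int:
    """Score for Part 1: +2 per correct, -1 per wrong, 0 per unanswered answer."""
    if correct + wrong + unanswered != 20:
        raise ValueError("Part 1 has exactly 20 questions")
    return 2 * correct - wrong


def expected_guess_value(p_correct: float) -> float:
    """Expected points for answering one question with probability p_correct of being right."""
    return 2 * p_correct - (1 - p_correct)  # simplifies to 3 * p_correct - 1


# Example: 14 correct, 4 wrong, 2 left blank
print(part1_score(14, 4, 2))
# A pure coin flip on a true/false question
print(expected_guess_value(0.5))
```

The break-even point follows from 3p - 1 = 0, i.e. p = 1/3.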
Part 2: 4 Open questions
Work time: 25 minutes, 4 open questions. The questions were selected from a question pool. Average time per question: 6.25 minutes, which is extremely short even for a TU exam.
- Explain main differences between Star schema and Snowflake schema.
- Explain Binning and 1-to-n encoding. Describe (a) when it is applied, (b) how it is applied, (c) give an example.
- Explain the steps of k-means algorithm. Explain the problem with initial centroid selection.
- Explain the three types of analytics.
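For the binning and 1-to-n encoding question, the "give an example" part could be illustrated along the following lines. This is a minimal sketch and not the lecture's reference answer: it assumes equal-width binning (only one of several binning strategies) and treats 1-to-n encoding as what is commonly called one-hot encoding. The function names and the sample values are invented for the illustration.

```python
def equal_width_bin(value: float, lower: float, upper: float, n_bins: int) -> int:
    """Map a numeric value to a bin index 0..n_bins-1 using equal-width bins."""
    width = (upper - lower) / n_bins
    index = int((value - lower) / width)
    # Clamp so that values at or beyond the range edges fall into the outer bins
    return min(max(index, 0), n_bins - 1)


def one_to_n_encode(value: str, categories: list[str]) -> list[int]:
    """Turn a categorical value into n binary attributes, one per category."""
    return [1 if value == category else 0 for category in categories]


# Binning: numeric attribute -> ordinal bin index (here: age 0-100 in 4 bins)
print(equal_width_bin(25, 0, 100, 4))

# 1-to-n encoding: categorical attribute -> binary vector
print(one_to_n_encode("green", ["red", "green", "blue"]))
```

Binning goes from numeric to discrete values, while 1-to-n encoding goes from categorical values to numeric (binary) attributes, which is why the two are often asked together.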
Other possible questions (a brain dump from other students, so some questions are repeated):
- defining precision, recall, micro/macro and explaining their differences
- defining business goal / success, mining goal / success and providing an example
- 3 analytics from DWH
- agile business intelligence
- defining precision, recall, micro/macro and explaining their differences
- What is a Lazy Learner? Give an example of a lazy learner. When is a lazy learner useful?
- What are the characteristics/requirements of OLTP Operational Systems?
- What is the difference between OLTP and OLAP?
- Discuss 2 different forms of heterogeneity
- 2 different scaling approaches, their characteristics and benefits
- two ways of doing integration in data warehousing
- k-means: algo and robustness wrt initialization
- precision recall micro macro, formula and comparison
- FASMI
- discuss 2 types of scaling in data preprocessing
- describe train test validation data, for what they are used
- describe three operations in MOLAP
- pros and cons of Kimball (bottom-down, this was written by the professor) approach
- defining precision, recall, micro/macro and explaining their differences
- 1-to-n and binning
- heterogeneity concepts
- definition for DWH and 3 examples
- explain Lazy learning + example algorithm + example scenario
- 2 scaling approaches explained + when they are useful and when they are not
- 3 analytics from DWH
- advantages & disadvantages of Kimball approach
- Difference between OLTP and OLAP and their main requirements
- Difference between Kimball model and Inmon model
- Describe two types of scaling
- types of scaling
- binning and 1-n encoding
- Solutions for information integration heterogeneity
- Snowflake vs star schema
- describe training, validation, test data
- describe 3 operations on multi dimensional data
- what is a lazy learner + example + when to use it
- definition of a DWH
- Describe metadata component in a DWH
- Inmon / Kimball
- explain MapReduce
- 1 to n binning and when to apply