TU Wien:Distributed Systems Technologies VU (Truong)/Prüfung 2021-06-30
Part 1
- Suppose you need to store a large data set durably and across several cluster nodes. Furthermore, most queries are aggregation queries (e.g., calculating the sum or average of a specific data attribute). Among the following database technologies, which is the most appropriate for this scenario?
- InfluxDB
- Apache Cassandra
- Redis
- Neo4j
- Which data source architectural pattern is reflected in the code example below?
    User user = new User();
    user.setName("myUser");
    user.save(); // writes to a database
- Active Record
- Data Mapper
- Row Data Gateway
- Table Data Gateway
- Explain the term leaky abstraction. Give an example of a leaky abstraction and explain what makes it leaky.
- Map-Reduce is a useful programming model for implementing queries over shared or partitioned data. Explain why and give an illustrative example.
- Given the following Map-Reduce functions and input data, write down the individual steps and final results when applying the Map-Reduce query. Describe the role and the output of every function, from the intermediate states to the end. INPUT: A collection of structured textual files containing the hourly value of humidity and temperature from a set of sensors located in different areas.
- Explain the two dimensions of horizontal data scaling, and give for each a concrete example.
- Discuss the properties of layering for applications, and describe three use-cases with different layer distribution.
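The Map-Reduce functions referred to in the questions above are not reproduced on this page. Purely as an illustration of the programming model, the following is a minimal single-process sketch. It assumes a hypothetical record layout `area,hour,humidity,temperature` and a hypothetical query (average temperature per area); neither is taken from the exam.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Assumption: each input line looks like "area,hour,humidity,temperature".
class MapReduceSketch {

    // Map step: emit one (area, temperature) pair per input line.
    static Map.Entry<String, Double> map(String line) {
        String[] f = line.split(",");
        return Map.entry(f[0].trim(), Double.parseDouble(f[3].trim()));
    }

    // Shuffle step: group all emitted values by their key.
    static Map<String, List<Double>> shuffle(List<Map.Entry<String, Double>> pairs) {
        Map<String, List<Double>> grouped = new TreeMap<>();
        for (Map.Entry<String, Double> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // Reduce step: collapse all values of one key into a single result (here: the mean).
    static double reduce(List<Double> values) {
        double sum = 0;
        for (double v : values) sum += v;
        return sum / values.size();
    }

    // Runs map, shuffle, and reduce in sequence on one machine.
    static Map<String, Double> run(List<String> lines) {
        List<Map.Entry<String, Double>> pairs = new ArrayList<>();
        for (String line : lines) pairs.add(map(line));
        Map<String, Double> result = new TreeMap<>();
        shuffle(pairs).forEach((area, temps) -> result.put(area, reduce(temps)));
        return result;
    }

    public static void main(String[] args) {
        List<String> input = List.of(
            "north,00,55,10.0",
            "north,01,57,12.0",
            "south,00,40,20.0");
        System.out.println(run(input)); // prints {north=11.0, south=20.0}
    }
}
```

In a real cluster the map calls would run in parallel on the nodes holding each partition, and only the grouped intermediate pairs would cross the network. Note that a mean is not safe to use as a combiner as written here; a distributed version would typically carry (sum, count) pairs instead.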
Part 2
- Which statement(s) about the marshaller in remoting architecture are correct?
- The marshaller takes care of transporting messages between client and server
- The marshalling data format depends on the client's platform type
- The marshaller converts remote invocations into messages for the skeleton
- On the client side, the skeleton takes the role of the marshaller
- Which statement is FALSE about REST services?
- It is a protocol that guarantees a uniform way of interacting with a given server
- It allows you to use a layered system architecture
- In REST the resources are accessible via Unique Resource Identifiers
- It is an architecture that leverages the HTTP Protocol as transport layer
- Describe the two main approaches available to a message consumer, and give the advantages and disadvantages of each.
- In the context of Microservices, explain the term "Chatty Interface" and what problems such an interface causes.
- Explain the relationship between Polyglot Persistence and Microservice architecture.
- Describe the key characteristics of the HTTP Interface for RESTful services, and list the five (sic!) main verbs.
Part 3
- In the OAuth Authorization Code Grant Flow, an Access Token is
- used by the Authorization Server to authorize a User
- used by the Resource Server and Authorization Server to verify a Client
- used by the Client to receive an Access Token from the Authorization Server
- used by the Client to access a restricted resource on a User's behalf
- In Aspect-Oriented Programming, an Advice is
- an identifiable point in the program execution
- a piece of code to be executed at a specific point in the program execution
- a modularized implementation of a crosscutting concern
- an expression that matches specific points in the program execution
- Briefly explain the term "distributed transaction". Give an example of when it is necessary, and how it can be implemented.
- Briefly explain the two-phase commit (2PC) protocol and what it is used for.
- Briefly explain how you could use Annotations, Dynamic Proxies, and Dependency Injection to implement a container-managed database transaction mechanism.
- Explain the principle of delegated authorization in distributed systems and why it is a fundamental part of modern web applications.
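For the question on Annotations, Dynamic Proxies, and Dependency Injection above, the interplay of the three pieces can be sketched roughly as follows. This is a minimal illustration, not a reference answer: `Transactional`, `TxManager`, `AccountService`, and `TxSketch.inject` are hypothetical names, and `TxManager` only records calls instead of talking to a database. Only `java.lang.reflect.Proxy` and the annotation meta-annotations are real JDK APIs.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class TxSketch {
    // "Injection" point: the container hands out a proxy instead of the raw
    // object, so callers never see the transaction handling code.
    static <T> T inject(Class<T> iface, T target, TxManager tx) {
        Object proxy = Proxy.newProxyInstance(
            iface.getClassLoader(),
            new Class<?>[] { iface },
            (p, method, args) -> {
                // Methods without the marker annotation are passed through.
                if (!method.isAnnotationPresent(Transactional.class)) {
                    return method.invoke(target, args);
                }
                tx.begin();
                try {
                    Object result = method.invoke(target, args);
                    tx.commit();
                    return result;
                } catch (InvocationTargetException e) {
                    tx.rollback();
                    throw e.getCause(); // rethrow the original exception
                }
            });
        return iface.cast(proxy);
    }

    public static void main(String[] args) {
        TxManager tx = new TxManager();
        AccountService svc = inject(AccountService.class, new AccountServiceImpl(), tx);
        svc.transfer(10);
        try {
            svc.transfer(-1);
        } catch (IllegalArgumentException expected) {
            // the failed call was rolled back by the proxy
        }
        System.out.println(tx.log); // prints [begin, commit, begin, rollback]
    }
}

// Hypothetical marker annotation; RUNTIME retention makes it visible to the proxy.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Transactional {}

// Hypothetical stand-in for a real transaction API; it only records calls.
class TxManager {
    final List<String> log = new ArrayList<>();
    void begin()    { log.add("begin"); }
    void commit()   { log.add("commit"); }
    void rollback() { log.add("rollback"); }
}

interface AccountService {
    @Transactional
    void transfer(int amount);
}

class AccountServiceImpl implements AccountService {
    public void transfer(int amount) {
        if (amount < 0) throw new IllegalArgumentException("negative amount");
    }
}
```

The annotation is placed on the interface method because a JDK dynamic proxy dispatches the interface's `Method` object to the handler; an annotation that exists only on the implementation class would not be seen by this check.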
Part 4
- Which statements about different scheduler architectures are correct?
- two-level schedulers allow application-specific scheduling strategies
- shared-state schedulers use pessimistic concurrency control for cluster state information
- monolithic schedulers do not require concurrency control for cluster state information
- the Kubernetes scheduler is an example of a two-level scheduler
- Which statements regarding Kubernetes are true?
- The Kubernetes master can be seen as a Virtualized Infrastructure Manager
- The Kubernetes master can be seen as a hypervisor
- Etcd supports service discovery
- The ingress controller is internally defined by Kubernetes, with a well-defined set of functionality
- Describe hardware-based virtualization. What are the possible configurations, and how do they differ? Give an example for each configuration.
- Explain the role of virtualization in elasticity of distributed systems.
- Describe the three types of autoscaling, and provide an example for every one of them.
- Explain the role of the virtual infrastructure manager in cloud computing platforms.
Part 5
- Which statement(s) about watermarking are true?
- It is a method for marking partitions in window aggregation
- It can be emitted in a punctuated or periodic fashion
- It guarantees the correct ordering of events
- It is a methodology for dealing with the unordered arrival of events
- Watermarks can be emitted only in conjunction with a particular event
- Which one(s) of the following are black box metrics?
- CPU utilization
- Network traffic
- Number of logged in users
- Number of queries
- In the context of event-based architecture, what is "event-carried state transfer"? Name two consequences (e.g., benefits or drawbacks) of using the pattern.
- How does Apache Kafka enable reliability in distributed stream processing systems? (Hint: think about a) operator parallelization/distribution and what happens when operator nodes fail, and b) operators that struggle with load peaks.)
- What is the idea behind complex event processing? Give an example using Flink.