TU Wien:Verteilte Systeme VO (Dustdar)/Beispiel-Fragen Ausarbeitung 2014W

The questions are from this PDF.

Distributed systems

Give an example of a distributed system

Facebook is an example of a distributed system. When you access the website, you do not know which server you are connected to, that many servers cooperate in the background, or that you may be reading from replicas located thousands of kilometers away; the system appears to the user as a single coherent service.

Explain briefly its geographical distribution, communication, and naming in 3 sentences

Geographical distribution is the physical distribution of servers and clients across an area. Communication is the purposeful exchange of messages between at least two parties. Naming assigns each resource, server, or client in the system a unique identifier and is backed by a resolution service that locates the entity from its name.

Client-server architecture

Draw one possible three-tiered architecture

See graphic in PDF

Explain the placement of components in the three-tiered architecture if you choose the vertical distribution style

In a vertical distribution, the logically different components are placed on different machines. The first tier (the logical switch) is the front-facing tier of the service and handles all client requests. It passes requests to the second tier, which contains the actual business logic and performs the computation and processing. The second tier in turn accesses the third tier, which stores the data that supports the entire service.

Fundamental communication

You are asked to explain the following architecture in which a service using several application/compute servers serves client requests

How can this architecture help to improve the performance of the service?

Each tier specializes in one task. The first tier is exclusively concerned with receiving and evaluating client requests, and dispatches each request to the responsible machine(s) in the second tier. The second tier only processes requests: it is not concerned with who the clients are or where the requests come from, and merely delivers the result of a request. The third tier is concerned only with keeping the data consistent and ready for requests from the second tier. Because several application/compute servers are available, requests can be processed in parallel.

How can this architecture help to reduce the failure rate of the service?

The architecture allows transparent replication of each tier, meaning that clients remain entirely unaware of what the back end looks like. The data can be replicated to increase redundancy and speed. The number of application servers can be increased to compensate for crashes and to raise the number of requests that can be processed at any given time. The front end can be replicated and placed closer to clients, which reduces latency, provides more points of access, and increases the number of clients that can send requests. In the ideal case each tier can also be maintained entirely independently of the others, which further increases transparency within the system itself.
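As a rough illustration of why replicating a tier lowers the failure rate, the sketch below assumes (hypothetically) that every replica fails independently with the same probability, so the tier is unavailable only when all replicas are down at once. The function name and the numbers are illustrative assumptions and are not taken from the lecture.

```python
def tier_failure_probability(p: float, replicas: int) -> float:
    """Probability that all replicas of one tier are down at the same time.

    Assumes independent failures with the same per-replica probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if replicas < 1:
        raise ValueError("a tier needs at least one replica")
    return p ** replicas


if __name__ == "__main__":
    # Each additional replica multiplies the failure probability by p.
    for n in (1, 2, 3):
        print(n, tier_failure_probability(0.1, n))
```

Real failures are often correlated (shared power, network, or software bugs), so this should be read as an optimistic bound rather than a prediction.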

Communication programming: consider message-oriented transient communication using sockets and using the Message-Passing Interface (MPI)

Which layers are they designed for?

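Sockets are an interface to the transport layer (e.g., TCP or UDP). MPI is a higher-level message-passing interface intended for communication between the processes of a parallel application. In the architecture above, sockets would typically carry the interaction between clients and the first tier.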

Which types of application styles/models are they suitable for?

Messages are the right choice when the client should not block while its request is being handled. This is particularly relevant for parallel applications: a server hands each accepted socket connection to a separate process or thread, while another process or thread keeps listening for new connections.
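The pattern of listening in one thread while each connection is handled in its own thread can be sketched with Python's standard `socket` and `threading` modules. This is a minimal echo server for illustration only; the helper names (`handle`, `serve`, `echo_once`) are hypothetical, and the server runs on the loopback interface with a port chosen by the operating system.

```python
import socket
import threading


def handle(conn: socket.socket) -> None:
    """Echo everything received on one connection, then close it."""
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:          # peer closed the connection
                break
            conn.sendall(data)


def serve(listener: socket.socket, max_clients: int) -> None:
    """Accept connections and hand each one to its own thread."""
    for _ in range(max_clients):
        conn, _addr = listener.accept()
        threading.Thread(target=handle, args=(conn,), daemon=True).start()


def echo_once(message: bytes) -> bytes:
    """Start a local server, send one message, and return the echoed reply."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen()
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve, args=(listener, 1), daemon=True)
    server.start()

    reply = b""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
        while len(reply) < len(message):   # TCP may deliver data in pieces
            chunk = client.recv(1024)
            if not chunk:
                break
            reply += chunk

    server.join()
    listener.close()
    return reply
```

Calling `echo_once(b"hello")` sends one message through the listening thread and a per-connection handler thread. An MPI program would express the same exchange with send and receive operations between ranks instead of explicit connections.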

Naming

List three mechanisms for name resolution.

Flat naming (forwarding pointers, home-based approaches, or DHTs), structured naming (DNS), and attribute-based naming (LDAP).

List two main advantages of recursive name resolution

Recursive name resolution puts a higher processing load on each name server, which is why global-level servers typically support only iterative resolution. Its two main advantages are: (1) caching is more effective, because each name server sees the results of the lookups it forwards and can cache them, which makes subsequent resolutions faster; (2) communication costs can be reduced, because the client sends one simple message and waits for the response instead of contacting every name server itself.
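The difference in client messages and in caching can be illustrated with a toy name space. The server names, labels, and the address (taken from the 192.0.2.0/24 documentation range) are made-up assumptions; real DNS resolution is considerably more involved.

```python
# Hypothetical name space: each server only knows the next label.
SERVERS = {
    "root": {"at": "ns-at"},
    "ns-at": {"ac": "ns-ac"},
    "ns-ac": {"tuwien": "ns-tuwien"},
    "ns-tuwien": {"www": "192.0.2.10"},
}


def iterative(path):
    """The client contacts every server itself.

    Returns (address, number of messages sent by the client).
    """
    server, messages = "root", 0
    for label in path:
        messages += 1                      # one client request per level
        server = SERVERS[server][label]
    return server, messages


def recursive(path, server="root", cache=None):
    """Each server forwards the remaining path to the next server.

    The client sends a single message to the root; every server on the way
    stores the final result for the suffix it was asked about in `cache`.
    """
    cache = {} if cache is None else cache
    key = (server, tuple(path))
    if key in cache:                       # answered from the server's cache
        return cache[key]
    next_hop = SERVERS[server][path[0]]
    result = next_hop if len(path) == 1 else recursive(path[1:], next_hop, cache)
    cache[key] = result
    return result
```

With the path `["at", "ac", "tuwien", "www"]`, the iterative client sends four requests, whereas the recursive client sends one and every intermediate server ends up with a cached answer it can reuse.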

Time synchronization

Compare Cristian and Berkeley algorithms for time synchronization based on...

... interactions between the time server and clients.

Cristian's algorithm assumes a time server with a very accurate reference clock (an atomic clock or some other precise timing mechanism). The server is passive: each client contacts the time server and asks for the current time. In the Berkeley algorithm the time server is active: it polls the clients and asks for their clock values. Berkeley does not assume that the synchronized clocks agree with real-world time, only that all machines agree on a common internal time.
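The two interaction styles lead to two different calculations, sketched below under simplifying assumptions: Cristian's client estimates the one-way delay as half the measured round-trip time, and the Berkeley time daemon averages all reported clocks, including its own. The function names and the sample values are illustrative.

```python
def cristian_time(t_request: float, t_reply: float, server_time: float) -> float:
    """Client-side estimate in Cristian's algorithm.

    t_request and t_reply are local timestamps taken when the request was
    sent and when the reply arrived; the one-way delay is assumed to be
    half of the round trip.
    """
    round_trip = t_reply - t_request
    return server_time + round_trip / 2


def berkeley_adjustments(daemon_clock: float, client_clocks: list) -> list:
    """Offsets computed by the Berkeley time daemon.

    Returns the adjustment each machine should apply, daemon first,
    so that all clocks converge on the average.
    """
    clocks = [daemon_clock] + list(client_clocks)
    average = sum(clocks) / len(clocks)
    return [average - clock for clock in clocks]
```

For example, with clocks expressed in minutes, a daemon at 180 (3:00) and clients at 205 (3:25) and 170 (2:50) average to 185 (3:05), so the machines are told to adjust by +5, -20, and +15 minutes respectively.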

... the way to calculate and adjust the clock.

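In Cristian's algorithm the client takes the time reported by the server and adds an estimate of the message delay, typically half the measured round-trip time. In the Berkeley algorithm the time server averages the reported clock values and tells each machine how far to adjust. The adjustment itself is applied identically in both algorithms: to slow a client clock down, fewer milliseconds are added to it on each interrupt from its own timer, and to speed it up, more milliseconds are added. The clock is not changed in abrupt jumps, as this could create inconsistencies in the client's own internal logs.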
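The gradual adjustment can be sketched as follows. The tick length of 10 ms and the correction of 1 ms per interrupt are assumed values chosen for illustration, and `slew` is a hypothetical helper, not an operating-system API.

```python
def slew(clock_ms: int, offset_ms: int, ticks: int,
         tick_ms: int = 10, step_ms: int = 1) -> int:
    """Advance a clock by `ticks` timer interrupts while removing an offset.

    offset_ms > 0 means the clock is behind and must speed up;
    offset_ms < 0 means the clock is ahead and must slow down.
    As long as step_ms < tick_ms, the clock never moves backward.
    """
    remaining = offset_ms
    for _ in range(ticks):
        if remaining > 0:
            correction = min(step_ms, remaining)     # add a little extra
        elif remaining < 0:
            correction = max(-step_ms, remaining)    # add a little less
        else:
            correction = 0                           # offset fully removed
        clock_ms += tick_ms + correction
        remaining -= correction
    return clock_ms
```

A clock that is 5 ms ahead advances by 9 ms instead of 10 ms for five interrupts and then returns to its normal rate, so it is corrected without ever being set back.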

Replication and Consistency: Quorum based Protocols

Is the marked node set shown below a valid write quorum?

Possibilities:

  • [YES, because a write quorum of 6 and a read quorum of 9 > # total replica servers]
  • [YES, because a read quorum of (A, B, C, D, E, F, G) overlaps with the write quorum]
  • [NO, because a write quorum of (A, D, G, J, K, L) would not include C, F, or I]
  • [NO, but it would be valid when we remove C from the set of replica servers]

Answer: [NO, because a write quorum of (A, D, G, J, K, L) would not include C, F, or I]

Explain very briefly

  • If YES [i,ii]: draw an example of a valid read quorum size and read quorum set.
  • If NO [iii,iv]: draw a quorum (read or write set) that might lead to a problem.

Outline the other 6 nodes in the diagram as a read set. Because no true quorum exists (in this case Nw = 6 and Nr = 6 with 12 nodes in total), the read set and the write set do not overlap, so the reading process has no way of knowing that it is reading the most current copy. Nw + Nr must be greater than the total number of nodes, which means that either Nr or Nw must cover a majority (more than 50%) of all nodes. In addition, Nw itself must be greater than half the total number of nodes, otherwise two disjoint write sets could be updated concurrently.
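The two constraints can be checked mechanically. The sketch below assumes 12 replicas named A to L, as implied by the answer options, and uses the two standard quorum conditions Nr + Nw > N and Nw > N/2; the function name is illustrative.

```python
def is_valid_quorum(n: int, n_read: int, n_write: int) -> bool:
    """Check the two quorum constraints.

    n_read + n_write > n  prevents read-write conflicts,
    n_write > n / 2       prevents write-write conflicts.
    """
    return n_read + n_write > n and 2 * n_write > n


REPLICAS = set("ABCDEFGHIJKL")      # 12 replicas, assumed from the options
write_set = set("ADGJKL")           # the marked write quorum
read_set = REPLICAS - write_set     # "the other 6 nodes"

if __name__ == "__main__":
    print(is_valid_quorum(len(REPLICAS), len(read_set), len(write_set)))
    print(sorted(read_set & write_set))   # empty: a reader may miss the write
```

Raising Nw to 7 while keeping Nr at 6 satisfies both conditions, because any read set then shares at least one node with any write set.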

Distributed File Systems

What is the main difference between NFS v3 and v4?

Version 3 is stateless (although add-ons outside the core protocol, such as the separate Network Lock Manager protocol, provided some form of statefulness), whereas version 4 is stateful by design: it includes file locking and keeps track of open files.

What benefit does this difference provide?

Because the server knows which clients have a file open, clients can cache files locally, and the server's state helps to identify potentially stale data in those caches.

Security

Explain the concept of strong collision resistance in the context of cryptographic hash functions.

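Strong collision resistance means that it is computationally infeasible to find any two different messages m and m' that produce the same hash, i.e., H(m) = H(m'). Collisions necessarily exist because the hash is shorter than the set of possible messages; the requirement is that nobody can find one in practice.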
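To see what a collision looks like, the sketch below deliberately weakens SHA-256 by truncating its output to two bytes, so that a brute-force search finds two different messages with the same hash almost immediately. The truncation and the helper names are assumptions made for demonstration; with the full 256-bit output the same search is considered infeasible.

```python
import hashlib
from itertools import count


def short_hash(message: bytes, n_bytes: int = 2) -> bytes:
    """SHA-256 truncated to n_bytes; deliberately weak, for demonstration."""
    return hashlib.sha256(message).digest()[:n_bytes]


def find_collision(n_bytes: int = 2):
    """Return two different messages with the same truncated hash."""
    seen = {}                               # truncated digest -> message
    for i in count():
        message = str(i).encode()
        digest = short_hash(message, n_bytes)
        if digest in seen:
            return seen[digest], message
        seen[digest] = message
```

With a 16-bit output there are only 65,536 possible hash values, so a collision must occur within 65,537 attempts and typically appears after a few hundred.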

What is a reflection attack? Explain in two sentences.

A reflection attack involves opening two separate, parallel sessions with a server/client so that information obtained in the second session (e.g., the encrypted response to an identical challenge) can be used to fool the server/client into believing that it is communicating with a trusted party in the first session. The server/client assumes that only the trusted party could have decrypted the challenge, read it, and re-encrypted it before sending a challenge of its own, when in fact the attacker simply reflected the server's own answer back to it.
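The attack can be simulated with a deliberately flawed challenge-response protocol. In this sketch an HMAC over the challenge stands in for "encrypting the challenge with the shared key"; the `Server` class, its method names, and the session numbers are hypothetical and exist only to show the message flow.

```python
import hashlib
import hmac
import os


class Server:
    """Hypothetical server running a flawed challenge-response protocol."""

    def __init__(self, key: bytes):
        self._key = key
        self._pending = {}          # session id -> challenge the server issued

    def _respond(self, challenge: bytes) -> bytes:
        # Stand-in for encrypting the challenge with the shared key.
        return hmac.new(self._key, challenge, hashlib.sha256).digest()

    def hello(self, session: int, client_challenge: bytes):
        """Answer the client's challenge and issue a challenge of our own."""
        own_challenge = os.urandom(16)
        self._pending[session] = own_challenge
        return self._respond(client_challenge), own_challenge

    def finish(self, session: int, response: bytes) -> bool:
        """Accept the client if it answered the server's challenge correctly."""
        expected = self._respond(self._pending.pop(session))
        return hmac.compare_digest(expected, response)


def reflection_attack(server: Server) -> bool:
    """An attacker who does not know the key still gets authenticated."""
    _, challenge = server.hello(1, os.urandom(16))   # session 1: obtain a challenge
    answer, _ = server.hello(2, challenge)           # session 2: server answers it
    return server.finish(1, answer)                  # reflect the answer into session 1
```

The flaw is that the server answers challenges with the same key and format it expects from the client. Typical countermeasures make the two directions distinguishable, for example by using different keys per direction or by including the sender's identity in the response.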

Fault Tolerance & Dependability

Define the meaning of "failure" in one sentence.

A failure occurs when a service no longer delivers the functionality that it guarantees under normal conditions.

Define the meaning of "fault" in one sentence

A fault is the cause of an error, i.e., of a deviation from the expected system state; it can be temporary in nature and does not necessarily result in a failure of the system.