TU Wien:Verteilte Systeme VO (Dustdar)/Zusammenfassung

From VoWi

Introduction and Architectures

Access Transparency
Hide differences in data representation and how an object is accessed
Location Transparency
Hide where an object is located
Relocation Transparency
Hide that an object may be moved to another location while in use
Migration Transparency
Hide that an object may move to another location
Replication Transparency
Hide that an object is replicated
Concurrency Transparency
Hide that an object may be shared by several independent users
Failure Transparency
Hide the failure and recovery of an object

Processes and Communication

Processors, processes and threads

To run a program, a computer creates a number of virtual processors, each of which is used to run a different program. A process is often defined as a program in execution, i.e. a program currently being executed on one of these virtual processors. Each process has a completely independent address space; separation of processes is usually ensured through hardware support, so concurrency transparency is given. A thread also runs its own piece of code, but shares its address space with other threads. Concurrency transparency is therefore only maintained insofar as it does not lead to performance degradation.
Threads can be implemented in different ways, most importantly either entirely in user space, or by making the kernel aware of them and letting it schedule their execution.

  • Implementation in user space: many threads are mapped to a single schedulable entity (a process; this is also called a many-to-one threading model), so a single invocation of a blocking system call can block the entire process and therefore all threads it encapsulates.
  • Implementation in the kernel: this is called a one-to-one threading model, as every single thread becomes its own schedulable entity. The downsides of this approach are that thread execution has to be managed by the kernel, requiring a system call for every thread operation, and that switching thread contexts is as expensive as switching process contexts.
Multithreaded clients

To ensure distribution transparency, distributed systems often have to conceal up to a few hundred milliseconds of latency. To achieve this, a multithreaded client can execute other pieces of code while the result of a blocking call (for example a network request, or I/O from the user or the file system) is being produced in another thread.
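A minimal sketch of this idea in Python; `time.sleep` stands in for the blocking call (e.g. a network request), and the "other work" is an arbitrary computation:

```python
import threading
import time

results = []

def fetch():
    # Stand-in for a blocking call, e.g. a network request or disk read.
    time.sleep(0.1)                 # simulated latency
    results.append("response")

t = threading.Thread(target=fetch)
t.start()                           # blocking call proceeds in its own thread...
other_work = sum(range(1_000))      # ...while the client keeps computing
t.join()                            # pick up the result once it is needed
```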

Multithreaded Servers

However, the biggest performance gains from multithreading are found on the server side. Instead of being completely blocked by a single operation issued by one client, and therefore being unable to serve any other clients in the meantime, the server can for example be organized in a dispatcher/worker model. Here, a single dispatcher thread accepts incoming requests to the server and chooses an idle worker thread to deal with each request.
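The dispatcher/worker model can be sketched with a thread-safe queue; the worker count and the request values here are illustrative:

```python
import queue
import threading

requests = queue.Queue()   # incoming requests, handed out by the dispatcher
handled = []               # completed requests (list.append is thread-safe in CPython)

def worker():
    while True:
        req = requests.get()        # idle until the dispatcher hands over work
        if req is None:             # sentinel: shut this worker down
            break
        handled.append(f"done:{req}")
        requests.task_done()

workers = [threading.Thread(target=worker) for _ in range(3)]
for w in workers:
    w.start()

# Dispatcher: accept incoming requests and hand each one to an idle worker.
for req in ["a", "b", "c", "d"]:
    requests.put(req)
requests.join()                     # block until every request is handled

for _ in workers:
    requests.put(None)              # one sentinel per worker
for w in workers:
    w.join()
```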

Virtualization

TODO:
  • process VM: a program is compiled to intermediate code which is executed in a runtime
  • native VM monitor (type 1): hardware > virtual machine monitor > guest OS
  • hosted VM monitor (type 2): hardware > host OS > virtual machine monitor > guest OS

Clients & servers

TODO:

Interprocess communication

TODO:

Remote Procedure Calls

TODO:

Message-oriented communication

TODO: sockets, message-oriented middleware (MOM)

Multicast communication

TODO:

Naming

Flat naming

TODO: Finger Tables (FTs), Hierarchical Location Services (HLS)

Structured naming

TODO: hard links vs symbolic links, layers (global, administrational, managerial), iterative vs recursive resolution, DNS

Attribute-based naming

TODO: e.g. LDAP

Fault Tolerance

TODO: fault -> error -> failure, failure models, redundancy, Paxos

Synchronization and Coordination

TODO: NTP, Lamport's Logical Clocks

Consistency and Replication

Data-centric consistency

Models:

  • sequential consistency: all processes perceive the same global sequence of write operations
  • causal consistency: all processes perceive potentially causally related writes in the same order (potentially related: a process writes after reading a value written by another process, or writes of the same process)
  • eventual consistency: after a write, the replicas gradually become consistent

Protocols:

  • primary-based protocols
    • primary backup protocol with remote writes
    • primary backup protocol with local writes
  • replicated-write protocols
    • quorum based protocols
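Quorum-based protocols (Gifford's scheme) can be summarized by two conditions on the read quorum N_R and write quorum N_W for N replicas: N_R + N_W > N and N_W > N/2. A small sketch (the function name is my own):

```python
def valid_quorums(n: int, n_r: int, n_w: int) -> bool:
    """Gifford's conditions for n replicas: every read quorum must
    overlap every write quorum (no stale reads), and any two write
    quorums must overlap (no conflicting concurrent writes)."""
    return n_r + n_w > n and 2 * n_w > n

# Examples with n = 12 replicas:
valid_quorums(12, 3, 10)   # valid combination
valid_quorums(12, 6, 6)    # invalid: two write quorums could be disjoint
valid_quorums(12, 1, 12)   # read-one/write-all (ROWA)
```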

Client-centric consistency

Models:

  • monotonic read consistency
  • monotonic write consistency
  • "read your writes" consistency
  • "writes follow reads" consistency

Protocols:

  • monotonic reads
  • monotonic writes

Replica management

Replica server placement

TODO:
Greedy algorithm based on client locations
  1. Place first server at location with minimum latency to all clients.
  2. Place next server such that latency to all clients is minimized, assuming that each client is served by the closest (lowest-latency) server.
  3. Continue like this until all servers have been placed.
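The client-location-based greedy algorithm above can be sketched as follows, assuming a hypothetical latency matrix latency[client][site] of candidate sites:

```python
def greedy_placement(latency, k):
    """Greedily choose k server sites from a latency matrix
    latency[client][site]; each client is assumed to be served by the
    closest (lowest-latency) site chosen so far."""
    sites = range(len(latency[0]))

    def total_latency(chosen):
        # Sum over all clients of the latency to their closest chosen site.
        return sum(min(row[s] for s in chosen) for row in latency)

    chosen = []
    for _ in range(k):
        best = min((s for s in sites if s not in chosen),
                   key=lambda s: total_latency(chosen + [s]))
        chosen.append(best)
    return chosen
```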
Greedy algorithm based on the Internet's autonomous-system (AS)-level topology
  1. Find the largest AS.
  2. Find its router with the largest number of connections and place a server there.
  3. Continue with the second-largest AS and place a server at its router with the largest number of connections.
  4. Continue like this until all servers have been placed.

Server-initiated content replication

TODO:

Content distribution - general aspects

General strategies:

  • invalidation: notify replicas that there was an update (but don't send updates)
  • data transfer: send updated data to replicas on update
  • active replication: propagate the update operation to the replicas, each of which performs it locally

Client-initiated content replication

TODO:

Security

Introduction

TODO:

Secure channels

TODO:
Diffie-Hellman
  1. Alice and Bob agree on two large numbers: n (prime) and g. Both numbers may be public.
  2. Alice chooses large number x and Bob chooses large number y.
  3. Alice sends n, g and g^x mod n to Bob.
  4. Bob replies with g^y mod n.
  5. Alice computes (g^y)^x mod n and Bob computes (g^x)^y mod n; both now share the secret g^(xy) mod n, which was never transmitted.
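A toy run of these steps with deliberately tiny numbers (real deployments use primes of 2048+ bits and vetted parameters):

```python
n, g = 23, 5            # public: prime modulus n and generator g
x, y = 6, 15            # private: Alice's x and Bob's y (normally random)

A = pow(g, x, n)        # step 3: Alice sends g^x mod n (together with n, g)
B = pow(g, y, n)        # step 4: Bob replies with g^y mod n

k_alice = pow(B, x, n)  # step 5: Alice computes (g^y)^x mod n
k_bob = pow(A, y, n)    #         Bob computes (g^x)^y mod n
assert k_alice == k_bob == 2   # both hold g^(xy) mod n; it never crossed the wire
```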

Access control

TODO:

for objects:

  • access control matrix (ACM)
  • access control list (ACL)
  • capabilities
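The three object-level mechanisms store the same rights in different places: the ACM is the full subject-by-object matrix, an ACL stores one column of it per object, and capabilities give each subject one row. A toy illustration (subjects, objects and rights are invented):

```python
# Access control matrix: one entry per (subject, object) pair.
acm = {("alice", "file1"): {"read", "write"},
       ("bob",   "file1"): {"read"}}

# ACL: each object stores a column of the matrix.
acl = {"file1": {"alice": {"read", "write"}, "bob": {"read"}}}

# Capabilities: each subject carries a row of the matrix.
caps = {"alice": {"file1": {"read", "write"}},
        "bob":   {"file1": {"read"}}}

assert "write" in acl["file1"]["alice"]                 # check at the object
assert "write" not in caps["bob"].get("file1", set())   # check at the subject
```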

for networks: firewalls

Selected topics

Secure group communication

TODO:

Secure mobile code

TODO:

Secure naming

TODO:

Securing the DNS:

  • DNS Security Extensions (DNSSEC)
  • DNS over HTTPS (DoH)
  • DNS over TLS (DoT)

Security aspects of time synchronization

TODO:

Current Trends

TODO: Service-oriented architectures (SOA)

NIST service models:

  • Infrastructure as a Service (IaaS)
  • Platform as a Service (PaaS)
  • Software as a Service (SaaS)

NIST deployment models:

  • private cloud
  • community cloud
  • public cloud
  • hybrid cloud

Today's architecture has three levels:

  • cloud computing
  • fog computing
  • edge computing

Edge computing is necessary for highly latency-sensitive applications that cannot afford the network latency of reaching a distant cloud data center.