Recap

Distributed Systems

  • Definition: more than one machine working together to solve a problem

  • Examples:

    • client / server: web server and web client

    • cluster: page rank computation

  • Why go distributed?

    • More computing power, more storage capacity, fault tolerance, data sharing

  • New challenges

    • System failures: need to worry about partial failure

    • Communication failure: links unreliable

      • bit errors

      • packet loss

      • node/link failure

    • Why are network sockets less reliable than pipes?

  • Communication overview

    • Raw messages: UDP

    • Reliable messages: TCP

    • Remote procedure call: RPC

  • Raw messages: UDP

    • UDP: user datagram protocol

    • API:

      • Reads and writes over socket file descriptors

        • Socket: a communication connection point (endpoint) that you can name and address in a network

      • Messages are sent from / to ports, which identify a specific process on a machine

    • Provides minimal reliability features

      • Messages may be lost

      • Messages may be reordered

      • Messages may be duplicated

    • Only protection: checksums to detect corrupted data

    • Advantages

      • Lightweight

      • Some applications make better reliability decisions themselves (e.g., video conferencing programs)

    • Disadvantages

      • More difficult to write applications correctly
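As a rough illustration of the socket API described above, the following sketch sends one datagram between two UDP sockets on the loopback interface using Python's standard `socket` module. The function name `udp_round_trip` is hypothetical, and binding to port 0 (so the OS picks a free port) is an assumption made to keep the demo self-contained.

```python
import socket

def udp_round_trip(payload):
    """Send one datagram to a receiver socket on loopback and read it back."""
    # Receiver: binding to port 0 lets the OS pick a free port (demo assumption).
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    # UDP gives no delivery guarantee, so never block forever waiting for data.
    receiver.settimeout(1.0)
    dest = receiver.getsockname()  # (address, port) names the receiving endpoint

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sender.sendto(payload, dest)            # no connection setup, no ACK
        data, source = receiver.recvfrom(4096)  # raises socket.timeout if lost
        return data, source
    finally:
        sender.close()
        receiver.close()

if __name__ == "__main__":
    data, source = udp_round_trip(b"hello")
    print(data, "from", source)
```

Loss is unlikely on loopback, but across a real network the `recvfrom` call could time out, and the application would have to decide whether to resend.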

  • Reliable messages: layering strategy

    • TCP: Transmission Control Protocol

    • Using software to build reliable logical connections over unreliable physical connections

    • Techniques

      • ACK: sender knows message was received

      • TIMEOUT: how long to wait?

        • Too long: system feels unresponsive

        • Too short: messages needlessly re-sent. A message may have been dropped because the server is overloaded, so resending makes the overload worse!

      • Sequence numbers

        • Sender gives each message an increasing, unique sequence number

        • Receiver can tell whether it has seen all messages before N (detects gaps and duplicates)

      • Buffer messages so they are delivered in order; timeouts are adaptive
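The interaction of ACKs, timeouts, and sequence numbers can be sketched with a small in-memory simulation. This is a stop-and-wait scheme, which is much simpler than what TCP actually does, and every name here (`Receiver`, `send_reliably`, `lost_data`, `lost_acks`) is hypothetical. A lost transmission stands in for a timeout firing at the sender; no real network or clock is involved.

```python
class Receiver:
    """Delivers each payload exactly once, in order, and ACKs what it has seen."""

    def __init__(self):
        self.expected = 0      # next sequence number we are waiting for
        self.delivered = []    # payloads handed to the application

    def on_packet(self, seq, payload):
        if seq == self.expected:
            # New, in-order message: deliver it.
            self.delivered.append(payload)
            self.expected += 1
        # A duplicate (seq < expected) is not delivered again, only re-ACKed.
        return self.expected - 1  # ACK: highest in-order sequence number seen


def send_reliably(messages, receiver, lost_data=frozenset(),
                  lost_acks=frozenset(), max_tries=10):
    """Stop-and-wait sender. Returns the total number of transmissions.

    lost_data / lost_acks hold the transmission numbers (1-based) whose data
    packet or ACK is dropped by the simulated network.
    """
    transmissions = 0
    for seq, payload in enumerate(messages):
        for _ in range(max_tries):
            transmissions += 1
            if transmissions in lost_data:
                continue  # data lost: timeout fires, resend
            ack = receiver.on_packet(seq, payload)
            if transmissions in lost_acks:
                continue  # ACK lost: timeout fires, resend (a duplicate)
            if ack == seq:
                break     # ACK received: move on to the next message
        else:
            raise TimeoutError(f"gave up on message {seq}")
    return transmissions


if __name__ == "__main__":
    rx = Receiver()
    count = send_reliably(["a", "b", "c"], rx, lost_data={1}, lost_acks={3})
    print(rx.delivered, "in", count, "transmissions")
```

In the example run, the first data packet and the ACK for the second message are dropped, so the sender transmits five times, yet the receiver still delivers each message once and in order because the duplicate is recognized by its sequence number.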

  • RPC: remote procedure call

    • Approach: create wrappers so calling a function on another machine feels just like calling a local function! (simplify application development)

    • RPC systems help with two components

      • Runtime library

        • Thread pool

        • Socket listeners call functions on server

      • Stub generation

        • Create wrappers automatically

          • Wrappers must do conversions

            • Client arguments to message

            • Message to server arguments

            • Convert server return value to message

            • Convert message to client return value

            • Marshaling / unmarshaling, or serializing / deserializing

        • Many tools available (rpcgen, thrift, protobufs)
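The four conversions performed by the wrappers can be illustrated with hand-written stubs. This is only a sketch: tools such as rpcgen, thrift, and protobufs generate stubs from an interface definition and use their own wire formats, whereas here JSON is an assumed stand-in for the message format, and `server_stub`, `make_client_stub`, and `PROCEDURES` are hypothetical names. The transport is a direct function call where a real system would use a socket.

```python
import json

# --- Server side ---------------------------------------------------------
def add(x, y):
    """The real procedure, living on the 'server'."""
    return x + y

PROCEDURES = {"add": add}  # hypothetical dispatch table: name -> function

def server_stub(request):
    """Message -> server arguments, call, server return value -> message."""
    call = json.loads(request.decode("utf-8"))           # unmarshal arguments
    result = PROCEDURES[call["proc"]](*call["args"])     # invoke the procedure
    return json.dumps({"result": result}).encode("utf-8")  # marshal the result

# --- Client side ---------------------------------------------------------
def make_client_stub(proc, transport):
    """Build a wrapper that looks like a local function to the caller."""
    def stub(*args):
        # Client arguments -> message
        request = json.dumps({"proc": proc, "args": list(args)}).encode("utf-8")
        reply = transport(request)  # would be a socket send/recv in practice
        # Message -> client return value
        return json.loads(reply.decode("utf-8"))["result"]
    return stub

# The caller sees an ordinary function; the marshaling is hidden in the stubs.
remote_add = make_client_stub("add", transport=server_stub)

if __name__ == "__main__":
    print(remote_add(2, 3))
```

Passing the transport in as a parameter keeps the stub logic independent of how bytes actually move between machines.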

    • RPC over UDP:

      • Use function return as implicit ACK

      • If function takes a long time, then send a separate ACK
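The two rules above can be sketched as the list of messages a server sends back for one request. This is a simplified model with no sockets or timers; `handle_request`, `client_should_resend`, `expected_seconds`, and `ack_threshold` are all hypothetical names. In particular, passing in an estimate of the call's running time is an assumption made for brevity, since a real server would more likely send the separate ACK once the call has been running longer than some threshold.

```python
def handle_request(req_id, func, args, expected_seconds, ack_threshold=0.5):
    """Return the messages a server would send back for one RPC request.

    Each message is a (kind, request id, body) tuple.
    """
    outgoing = []
    if expected_seconds > ack_threshold:
        # Slow call: tell the client the request arrived so it stops resending.
        outgoing.append(("ACK", req_id, None))
    # The reply carries the return value and doubles as the implicit ACK.
    outgoing.append(("REPLY", req_id, func(*args)))
    return outgoing


def client_should_resend(req_id, received):
    """The client retransmits only if neither an ACK nor a reply has arrived."""
    return not any(rid == req_id for _kind, rid, _body in received)


if __name__ == "__main__":
    print(handle_request(1, pow, (2, 3), expected_seconds=0.01))
    print(handle_request(2, sum, ([1, 2, 3],), expected_seconds=2.0))
```

For the fast call only a reply is sent; for the slow call the explicit ACK goes out first, so the client's timeout does not trigger a needless retransmission while the function is still running.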
