The Design Philosophy of the DARPA Internet Protocols

http://ccr.sigcomm.org/archive/1995/jan95/ccr-9501-clark.pdf

What are the key strengths / contributions of this paper?

The paper does a great job of

  1. presenting the design goals (ordered by priority) behind the development of the Internet protocols (TCP/IP) when they were first created

  2. revealing the rationale behind the protocols' design decisions

  3. discussing which design goals were met, which remain unmet, and why

  4. pointing out how the architecture changed between its creation and the time the paper was written, and comparing and reasoning about these changes

Weakness?

  1. There is a long list of goals sorted by priority; it would be great if the authors presented a more specific discussion of which requirements / potential designs correspond to achieving each of these goals

  2. Were different protocol designs considered back then? If so, how well did those satisfy the listed goals?

  3. Would like to see a more detailed discussion of the complete set of feature changes to TCP/IP and the motivations behind them. Section 10 presents this partially, but a complete review would be great.

Important paper?

Yes!

When we design or think about the next-gen Internet infrastructure, it is always useful to consider the context in which the original Internet was designed, and which assumptions have changed since then in ways that might re-order the priority of the design goals and/or invalidate some of the design decisions of the old infrastructure.

The paper is important as it presents the philosophy behind the design of TCP/IP, and potentially encourages discussion of what is different now and of revisiting the assumptions behind some of the most commonly used concepts and protocols.

Summary of the paper (any surprising things)?

  • This paper attempts to capture the motivations and some of the early reasoning which shaped the Internet protocols, specifically TCP/IP. In particular, it aims to provide necessary context for current design extensions.

  • Fundamental Goal: develop an effective technique for multiplexed utilization of existing interconnected networks

    • I.e. interconnection of existing networks

    • Original setup: connect ARPANET with the ARPA packet radio network

    • V.s. a unified, multi-media network

      • Pro (unified): higher degree of integration, better performance

      • Con (unified): need to integrate the existing network architectures; hard to integrate a number of separately administered entities into a common utility

    • Technique for multiplexing: packet switching

      • V.s. circuit switching

      • Reasoning: applications (e.g. remote login) were naturally supported, and the networks to be integrated in the project were packet-switching networks

    • Structure of Internet is

      • a packet switched communications facility in which a number of distinguishable networks are connected together using packet communications processors called gateways which implement a store and forward packet forwarding algorithm.
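
To make the store-and-forward gateway model concrete, here is a minimal sketch (my own illustration, not from the paper; all names are hypothetical): each gateway receives a whole datagram, consults only its own routing table, and passes the datagram on, holding no per-connection state.

```python
from dataclasses import dataclass

@dataclass
class Datagram:
    src: str
    dst: str
    payload: bytes

class Gateway:
    """A store-and-forward packet processor: receive a whole datagram, then forward it."""
    def __init__(self, name, routes):
        self.name = name
        self.routes = routes  # destination -> next-hop Gateway, or None for local delivery

    def forward(self, dgram: Datagram):
        # No connection state lives here; reliability is left to the end hosts.
        next_hop = self.routes.get(dgram.dst)
        if next_hop is None:
            print(f"{self.name}: delivered {dgram.payload!r} for {dgram.dst}")
        else:
            print(f"{self.name}: forwarding toward {dgram.dst} via {next_hop.name}")
            next_hop.forward(dgram)

# Two networks joined by gateways: hostA's datagram crosses gw1, then gw2.
gw2 = Gateway("gw2", {"hostB": None})
gw1 = Gateway("gw1", {"hostB": gw2})
gw1.forward(Datagram("hostA", "hostB", b"hello"))
```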

  • Second Level Goals (ranked in priority)

    • Internet communication must continue despite loss of networks or gateways

    • The Internet must support multiple types of communication service

    • The Internet architecture must accommodate a variety of networks

    • The Internet architecture must permit distributed management of its resources

    • The Internet architecture must be cost effective

    • The Internet architecture must permit host attachment with a low level of effort

    • The resources used in the Internet architecture must be accountable

  • Context

    • Military context

      • Hostile env --> survivability was put as the first goal, accountability as the last goal

        • Want rapid deployment without detailed accounting of the resources

        • Assumption: at the top of transport, there is only one failure, and it is total partition. The arch was to mask completely any transient failure.

        • "Fate-sharing"

          • Acceptable to lose the state information associated with the entity if, at the same time, the entity itself is lost.

          • V.s. replicating state in the intermediate packet switch nodes

            • Pro (fate-sharing)

              • Protects against any number of intermediate failures

              • Much easier to engineer

          • Survivability

            • Intermediate nodes (gateways) stateless --> "datagram" network

            • More trust is placed in the host machine than in an arch where the network ensures reliable delivery of the data

            • Still

      • Second goal: support variety of types of service (at the transport service level)

        • "virtual circuit" service

          • first service provided in the Internet using TCP

          • applications: remote login (low delay in delivery, but low requirement for BW), file transfer (less concerned with delay, concerned with high-tput)

        • Examples outside the range of service TCP supports

          • XNET: cross-Internet debugger

            • a debugging tool should not require reliable communication

            • TCP is complex, and a host being debugged may not be able to run it (e.g. a minimal environment with no support for timers)

          • Real time delivery of digitized speech

            • Primary requirement: not reliable service, but minimizing and smoothing delay

            • Lost packets are not retransmitted, but simply replaced by a short period of silence

        • Assumption: more than one transport service would be required, and the arch must be prepared for services with differing requirements for reliability, delay, and BW

          • Led to the TCP / IP separation

            • TCP: provides one particular type of service (reliable, sequenced data stream)

            • IP: basic building block (i.e. datagram) out of which a variety of services could be built

          • UDP: an application-level interface to the basic datagram service

          • But this was hard without underlying network support

            • Problem: networks designed with a particular type of service in mind were not flexible enough to support other services
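
The TCP / UDP split described above still shows through today's sockets API. A minimal sketch (my own illustration, not the paper's; addresses and port numbers are made up): SOCK_DGRAM exposes the best-effort datagram service almost directly, while SOCK_STREAM selects TCP's one particular service, the reliable sequenced byte stream.

```python
import socket

# UDP: a thin application-level interface to the basic datagram service.
# No connection, no retransmission; message boundaries are preserved.
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.sendto(b"speech sample", ("127.0.0.1", 9999))  # fire and forget

# TCP: the reliable, sequenced byte-stream service built on top of datagrams.
# connect() would start the handshake; the stack handles ordering and retransmit.
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# tcp.connect(("192.0.2.1", 80))  # commented out: no real server is assumed
```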

      • Third goal: a variety of networks

        • Long haul (ARPANET, X.25), local area nets, broadcast satellite nets, packet radio networks, different serial links and ad hoc facilities (e.g. intercomputer busses)

        • Minimum set of assumptions

          • Basic: network can transport a packet or datagram

          • Not assumed: reliable or sequenced delivery, network-level broadcast or multicast, priority ranking of transmitted packets, support for multiple types of service, internal knowledge of failures, speeds, or delays

      • Other goals

        • Not perfectly engineered

        • Distributed management

          • E.g. not all gateways in the Internet are implemented and managed by the same agency; two-tier routing which permits gw from different administrations to exchange routing tables without trust

          • But still lack of tools for management

            • E.g. routing: decisions need to be constrained by policies for resource usage

            • What was done: manually setting the tables --> error-prone, not powerful

        • Cost-effective

          • --> was originally ranked below other goals like distributed management and support for a variety of networks

          • Inefficiency

            • headers are long; for short packets, overhead is apparent

            • retransmissions of lost packets (need to be done end-to-end)

            • the cost of attaching a host is high (it must implement the desired types of service)

            • poor implementation of the mechanism might hurt the network as well as the host

          • In contrast, commercial archs were designed for specific networking scenarios

        • Accountability

          • few tools existed for accounting packet flows when the paper was written

          • the problem was being studied, expanded to include non-military customers concerned with understanding and monitoring the usage of resources

  • Architecture and implementation

    • Major struggle: how to give guidance to the designer of a realization (i.e. relating the engineering of the realization to the types of service which would result)

    • Aids

      • protocol verifiers: deal with logical correctness

        • Con: never deal with performance issues

      • simulator: takes a particular realization and explores the service which it can deliver under a variety of loadings

    • Hard to set tight performance constraints

      • Goal of the arch was not to constrain performance, but to permit variability

      • No useful formal tools for describing performance

  • Datagram

    • Important as

      • Eliminates need for connection state within intermediate nodes

      • Provides a basic building block out of which different types of services can be implemented

      • Represents the minimum network service assumption, which has permitted a wide variety of networks to be incorporated into various Internet realizations

  • TCP

    • Regulate delivery of bytes rather than packets

      • Motivations

        • Permit the insertion of control information into the sequence space of the bytes, so that control as well as data could be acknowledged (?)

          • This use of the sequence space was later dropped because of complexity

        • Permit the TCP packet to be broken up into smaller packets if necessary for fitting through a net with small packet size

          • This function was moved to the IP layer when IP split from TCP, and IP was forced to invent different methods of fragmentation

        • Permit small packets to be gathered together into one larger packet in the sending host if retransmission of the data was necessary

          • Critical

            • UNIX: many packets carrying one byte of data, which can arrive much faster than a slow host can process them; this results in lost packets and retransmissions

          • But

            • Acknowledging bytes seems to create this problem in the first place, though throughput is also severely limited if only small packets are sent
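
A toy sketch of the repacketization that byte sequencing permits (my own illustration; the stream contents and mss value are made up): because ACKs name byte offsets rather than packet identifiers, the unacknowledged range can be re-cut into fresh, larger segments on retransmission.

```python
stream = b"abcdefgh"   # bytes the sender has queued, occupying sequence numbers 0..7
acked_up_to = 3        # receiver has acknowledged bytes [0, 3)
mss = 4                # maximum segment size on the path

def segments_to_retransmit(stream, acked_up_to, mss):
    """Cut the unacked byte range into fresh segments of up to mss bytes each."""
    return [(off, stream[off:off + mss])
            for off in range(acked_up_to, len(stream), mss)]

# The original transmissions could have been eight 1-byte packets (the UNIX
# case above); the retransmission covers the same byte range in two packets.
print(segments_to_retransmit(stream, acked_up_to, mss))
# [(3, b'defg'), (7, b'h')]
```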

    • EOL --> PSH

      • EOL: break byte stream into records

      • Later (as PSH): the data up to this point in the stream is one or more complete application-level elements

      • "one or more" rather than "exactly one"

Other comments / thoughts

  1. The paper talks about the relationship between architecture and performance, and the great difficulty the Internet architecture designers experienced in formalizing performance constraints, mainly because there were no useful formal tools for describing performance. Are such tools available now?

  2. It's mentioned that back when the paper was written, there was a lack of sufficient tools for distributed management, especially in the area of routing; what is the current status of such tools for resource management in the context of multiple administrations?
