Making Middleboxes Someone Else's Problem: Network Processing as a Cloud Service

Last updated 2 years ago

Background

  • A middlebox is a networking device that transforms, inspects, filters, or otherwise manipulates traffic for purposes other than packet forwarding.

  • Examples of middleboxes include firewalls, network address translators (NATs), load balancers, and deep packet inspection (DPI) boxes.

  • Middleboxes offer valuable benefits, such as improved security (e.g. firewalls and intrusion detection systems), improved performance (e.g. proxies), and reduced bandwidth costs (e.g. WAN optimizers).

Problem

  • Today's enterprise networks rely on a wide spectrum of specialized appliances, or middleboxes, to improve security and performance and to reduce bandwidth costs.

  • Middleboxes are expensive, complex to manage, and create new failure modes for the networks that use them. These problems stem from their complex and specialized processing, variations in management tools across devices and vendors, and the need to consider policy interactions between these appliances and other network infrastructure.

Middleboxes today

  • Large-scale deployments, substantial cost (i.e. high up-front investment in hardware), complexity in management

Main Idea

  • This paper argues that middlebox processing can benefit from being outsourced to the cloud, given the promise of cloud computing to decrease costs, ease management, and provide elasticity and fault tolerance.

  • Three challenges

    • 1) functional equivalence: what types of middleboxes can be outsourced and what enterprise-side functionality is needed to achieve such outsourcing?

    • 2) low complexity at the enterprise: want a cloud-based middlebox architecture that minimizes the complexity of this enterprise-side functionality

    • 3) low performance overhead: traffic now takes a detour through the cloud, which can increase packet latency and bandwidth consumption, so the system design should minimize this performance penalty

  • Design considerations

    • What is the effective complexity of the network architecture at the enterprise after outsourcing?

    • What redirection architecture is required to retain the functional equivalence and low latency operation?

      • Bounce redirection

        • simple configuration

        • but increases latency due to the extra RTT to the provider

      • IP redirection

        • avoids extra round-trips

        • but

          • in the multi-PoP scenario, IP-based redirection will break the semantics of stateful middleboxes

          • the enterprise / provider has little control over which PoP is selected

      • DNS redirection

        • avoids the latency penalty, provides more control over redirection, and is easy to manage

        • but it makes it hard to outsource traffic for legacy applications that give external clients IP addresses rather than DNS names

        • DNS + smart

          • Smart redirection

            • redirect traffic on a per-destination basis through the PoP that minimizes end-to-end latency

            • requires that the APLOMB appliance redirect traffic to different PoPs based on the client's IP and maintain persistent tunnels to multiple PoPs

            • >70% of cases have zero or negative latency inflation, and 90% of all traffic has less than 10ms inflation

        • Some discussion over bandwidth consumption

          • APLOMB+: general-purpose traffic compression capabilities

    • What type of provider footprint is needed for low latency operation?

      • Multi-PoP provider (e.g. AWS datacenters / regions) vs. CDN providers (e.g. Akamai)

      • Low latency for only a limited portion of US clients vs. low-latency service for a nation-wide client base, respectively
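
The smart redirection idea above (send each client's traffic through the PoP that minimizes end-to-end latency) and the notion of latency inflation can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `PathLatency` type, the `choose_pop` function, the PoP names, and all latency values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathLatency:
    """Latencies in milliseconds for one client/PoP pair (hypothetical measurements)."""
    client_to_pop_ms: float
    pop_to_enterprise_ms: float


def choose_pop(direct_ms, candidates):
    """Return (pop_name, inflation_ms) for the PoP with the lowest end-to-end latency.

    direct_ms: latency of the direct client-to-enterprise path.
    candidates: dict mapping PoP name -> PathLatency.
    Inflation is the detour latency minus the direct latency; it is zero or
    negative when the detour through the PoP is no slower than the direct path.
    """
    if not candidates:
        raise ValueError("need at least one candidate PoP")

    def via_ms(name):
        path = candidates[name]
        return path.client_to_pop_ms + path.pop_to_enterprise_ms

    best = min(candidates, key=via_ms)
    return best, via_ms(best) - direct_ms


# Hypothetical example: two PoPs, one client whose direct path takes 35 ms.
pops = {
    "pop-east": PathLatency(12.0, 30.0),
    "pop-west": PathLatency(25.0, 8.0),
}
best_pop, inflation = choose_pop(direct_ms=35.0, candidates=pops)
print(best_pop, inflation)
```

Here the detour through `pop-west` is faster than the direct path, which is how negative inflation can arise.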

Architecture

  • APLOMB gateway: redirect enterprise traffic

    • Logically co-located with the enterprise's gateway router

    • Functions

      • Maintaining persistent tunnels to multiple cloud PoPs

      • Steering the outgoing traffic to appropriate PoP

  • Cloud provider

    • Tunnel endpoints: encapsulate / decapsulate traffic from the enterprise

    • Middlebox instances: to process the customer's traffic

    • NAT devices: to translate between publicly visible IP addresses and the client's internal addresses

    • Policy switching logic: to steer packets between the above components

  • Control plane: manage and configure components

    • Redirection optimization (PoP selection): push the current best tunnel selection strategies to the APLOMB gateway by using measurement data from the cloud PoPs

    • Middlebox scaling (adaptive scaling): detect changes in utilization using data from heartbeat health checks to automatically scale out or scale in
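
The adaptive scaling step in the control plane can be pictured as a threshold rule over utilization samples. The sketch below is an assumption-laden illustration: the `desired_instances` function, the one-instance-at-a-time policy, and the `high` / `low` thresholds are hypothetical and not taken from the paper.

```python
def desired_instances(current, utilizations, high=0.8, low=0.3, min_instances=1):
    """Return the new instance count for one middlebox type.

    current: number of running instances.
    utilizations: latest per-instance utilization samples in [0, 1],
        e.g. gathered from heartbeat health checks.
    high / low: hypothetical scale-out / scale-in thresholds.
    """
    if not utilizations:
        return current  # no data yet: leave the deployment unchanged
    average = sum(utilizations) / len(utilizations)
    if average > high:
        return current + 1  # scale out by one instance
    if average < low and current > min_instances:
        return current - 1  # scale in, but never below the minimum
    return current


print(desired_instances(2, [0.9, 0.95]))      # overloaded pool
print(desired_instances(3, [0.1, 0.2, 0.1]))  # underused pool
```

A real controller would likely add hysteresis or cooldown periods so that brief spikes do not cause instances to flap.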

Main strength

  • The studies on middlebox deployments across a range of enterprise scenarios are convincing! Real-world investigations & interviews identify exactly what the problems are in enterprises (e.g. high capital expenses and operating costs, complex management requirements, failure and overload scenarios). These studies provide solid motivation for the problems that this paper tries to address.

  • The paper also offers a valuable, systematic exploration of the requirements and design space for outsourcing middleboxes, including discussions on redirection, provider footprint, and location-dependent services.

  • The paper presents comprehensive design, implementation, and evaluation sections for the proposed APLOMB architecture.

  • Additionally, the paper discusses future hybrid enterprise/cloud architectures and the related security challenges, which makes for an interesting read.

Key weakness

  • Using APLOMB raises the same security questions that have challenged cloud computing: a third-party cloud provider must be given access to unencrypted data in order to process traffic flows. Companies with restrictive security policies might not be able to use this kind of approach.

  • APLOMB reduces the cost of middlebox infrastructure, but it may increase bandwidth costs, as tunneling traffic to a cloud provider means paying for bandwidth twice: once on the enterprise network's access link and once at the cloud provider. Current pricing strategies, especially volume-based pricing, are not well suited to APLOMB, particularly for high-volume users.
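
The "paying for bandwidth twice" concern can be made concrete with a small worked calculation. This is a simplified model under stated assumptions: pure volume-based pricing on both sides, and every price and traffic volume below is a hypothetical placeholder rather than a figure from the paper or any provider.

```python
def monthly_bandwidth_cost_cents(traffic_gb, access_cents_per_gb, cloud_cents_per_gb):
    """Return (on_premises_cents, outsourced_cents) for one month of traffic.

    On premises, traffic is billed once, on the enterprise access link.
    When outsourced, the same traffic is billed on the access link and
    again by the cloud provider.
    """
    on_premises = traffic_gb * access_cents_per_gb
    outsourced = traffic_gb * (access_cents_per_gb + cloud_cents_per_gb)
    return on_premises, outsourced


# Hypothetical numbers: 1000 GB per month, 5 cents/GB on the access link,
# 10 cents/GB charged by the cloud provider.
on_premises, outsourced = monthly_bandwidth_cost_cents(1000, 5, 10)
print(on_premises, outsourced)
```

Under this model the extra cost grows linearly with traffic volume, which is why high-volume users are hit hardest.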

When is it a good idea to run network functions in the cloud?

  • When it's good

    • Maybe small to medium enterprises that lack systematic management and monitoring infrastructure, or that would need to spend a lot to build it (i.e. upfront capital expenses, operational expenses). The cloud eases these burdens by offering a pay-per-use model and by reducing the complexity of managing middleboxes and dealing with failures.

  • When it's not good

    • Enterprises with security restrictions that are not suited for cloud computing in general

    • For enterprises that need to support applications with very tight latency SLOs, it might not be a good idea to route traffic to/from the cloud, as this might induce latency penalties

    • For enterprises that have high-volume users, transferring bulk data to and from the cloud can be very expensive

    • For very big enterprises, the upfront capital investments in equipment and operating costs will be amortized, and maybe they can develop an application-specific, high-performance appliance architecture that is cost-effective without going to the cloud

What's the role of middleboxes in the future of networking?

  • Are they still required as a separate entity given emerging technologies like SDN, NFV, microservices, data centers, among others?

  • Maybe SDN, NFV, and / or microservices are indeed changing the role of middleboxes: rather than being separate entities, they are becoming integrated with the underlying infrastructure built with these technologies.

  • We can use SDN or NFV with a centralized controller to provide and manage the same network services that traditional middleboxes used to provide. Decoupling the data and control planes provides greater flexibility compared to traditional middleboxes, which are tightly coupled with the underlying hardware architecture.

  • Not so sure about microservices; maybe some functionality can be built directly without using separate middleboxes. For data centers that have adopted cloud computing, network functions like load balancing can be handled by a software-defined load balancer that is integrated with the hypervisor or container platform.

  • Middleboxes as a separate entity can still be useful in some cases. For example, I can imagine enterprises deploying specifically tuned middleboxes for performance guarantees. The traditional roles of middleboxes also remain useful for firewalls and intrusion detection.

Other comments / thoughts

  • The pricing model discussions are particularly interesting; I'd like to see more of that during in-class discussions

Enjoy?

  • Yes, in general a well-structured, well-motivated, and clearly presented paper