Making Middleboxes Someone Else's Problem: Network Processing as a Cloud Service

Background

  • A middlebox is a networking device that transforms, inspects, filters, or otherwise manipulates traffic for purposes other than packet forwarding.

  • Examples of middleboxes include firewalls, network address translators (NATs), load balancers, and deep packet inspection (DPI) boxes.

  • Middleboxes offer valuable benefits, such as improved security (e.g. firewalls and intrusion detection systems), improved performance (e.g. proxies), and reduced bandwidth costs (e.g. WAN optimizers).

Problem

  • Today's enterprise networks rely on a wide spectrum of specialized appliances, or middleboxes, to improve security, boost performance, and reduce bandwidth costs.

  • Middleboxes are expensive, complex to manage, and create new failure modes for the networks that use them. These problems stem from their complex and specialized processing, variations in management tools across devices and vendors, and the need to consider policy interactions between these appliances and other network infrastructure.

Middleboxes today

  • Large-scale deployments, substantial cost (i.e. high up-front investment in hardware), complexity in management

Main Idea

  • This paper argues that middlebox processing can benefit from outsourcing to the cloud, given the promise of cloud computing to decrease costs, ease management, and provide elasticity and fault tolerance.

  • Three challenges

    • 1) functional equivalence: what types of middleboxes can be outsourced and what enterprise-side functionality is needed to achieve such outsourcing?

    • 2) low complexity at the enterprise: we want a cloud-based middlebox architecture that minimizes the complexity of this enterprise-side functionality

    • 3) low performance overhead: traffic now takes a detour through the cloud, which can increase packet latency and bandwidth consumption; we want a system design that minimizes this performance penalty

  • Design considerations

    • What is the effective complexity of the network architecture at the enterprise after outsourcing?

    • What redirection architecture is required to retain the functional equivalence and low latency operation?

      • Bounce redirection

        • simple configuration

        • but increases latency due to the round trip to the provider

      • IP redirection

        • avoids extra round trips

        • but

          • in the multi-PoP scenario, IP-based redirection will break the semantics of stateful middleboxes

          • the enterprise / provider has little control over which PoP is selected

      • DNS redirection

        • avoids the latency penalty, provides more control over redirection, and is easy to manage

        • but it introduces a challenge in outsourcing traffic for legacy applications that provide external clients with IP addresses rather than DNS names

        • DNS + smart

          • Smart redirection

            • redirect traffic on a per-destination basis through the PoP that minimizes end-to-end latency

            • requires that the APLOMB appliance redirect traffic to different PoPs based on the client's IP and maintain persistent tunnels to multiple PoPs

            • >70% of cases see zero or negative latency inflation (the detour is no slower than the direct path), and 90% of all traffic sees less than 10 ms of inflation

        • Some discussion over bandwidth consumption

          • APLOMB+: general-purpose traffic compression capabilities

    • What type of provider footprint is needed for low latency operation?

      • Multi-PoP provider (e.g. AWS datacenters / regions) vs. CDN providers (e.g. Akamai)

      • Limited portion of US clients vs. low-latency service for a nation-wide client base
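
To make the smart redirection idea concrete, here is a minimal sketch of per-destination PoP selection. It is not the paper's implementation: the function names, the PoP names, and the latency numbers are all hypothetical, and it assumes the gateway already has latency measurements for both legs of the detour (gateway to PoP, PoP to destination).

```python
from typing import Dict, Tuple


def select_pop(
    gateway_to_pop_ms: Dict[str, float],
    pop_to_dest_ms: Dict[str, float],
) -> Tuple[str, float]:
    """Pick the PoP that minimizes gateway -> PoP -> destination latency.

    Both arguments map a (hypothetical) PoP name to a measured latency in
    milliseconds for one leg of the detour.
    """
    # Only PoPs with measurements for both legs are usable candidates.
    candidates = sorted(gateway_to_pop_ms.keys() & pop_to_dest_ms.keys())
    if not candidates:
        raise ValueError("no PoP has latency measurements for both legs")
    best = min(candidates, key=lambda p: gateway_to_pop_ms[p] + pop_to_dest_ms[p])
    return best, gateway_to_pop_ms[best] + pop_to_dest_ms[best]


def latency_inflation(via_pop_ms: float, direct_ms: float) -> float:
    """Extra latency of the detour; zero or negative means no penalty."""
    return via_pop_ms - direct_ms


# Made-up measurements for a single destination (assumptions, not paper data).
gateway_to_pop = {"pop-east": 12.0, "pop-west": 40.0}
pop_to_dest = {"pop-east": 30.0, "pop-west": 5.0}

pop, via_ms = select_pop(gateway_to_pop, pop_to_dest)
print(pop, via_ms, latency_inflation(via_ms, direct_ms=40.0))
```

Note that the choice is made on the sum of both legs, so the PoP closest to the destination does not automatically win. Repeating this per destination is what requires the gateway to keep tunnels open to several PoPs.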

Architecture

  • APLOMB gateway: redirect enterprise traffic

    • Logically co-located with the enterprise's gateway router

    • Functions

      • Maintaining persistent tunnels to multiple cloud PoPs

      • Steering the outgoing traffic to appropriate PoP

  • Cloud provider

    • Tunnel endpoints: to encapsulate / decapsulate traffic from the enterprise

    • Middlebox instances: to process the customer's traffic

    • NAT devices: to translate between publicly visible IP addresses and the client's internal addresses

    • Policy switching logic: to steer packets between the above components

  • Control plane: manage and configure components

    • Redirection optimization (PoP selection): push the current best tunnel selection strategies to the APLOMB gateway by using measurement data from the cloud PoPs

    • Middlebox scaling (adaptive scaling): detect changes in utilization using data from heartbeat health checks to automatically scale out or scale in
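
The notes above only say that scaling reacts to utilization reported by heartbeat health checks. As one possible reading, the sketch below sizes a middlebox pool with a simple target-utilization rule. The rule itself, the function name, the 0.6 target, and the instance bounds are assumptions for illustration, not details taken from the paper.

```python
import math
from typing import Sequence


def desired_instances(
    utilizations: Sequence[float],
    target_utilization: float = 0.6,
    min_instances: int = 1,
    max_instances: int = 16,
) -> int:
    """Return how many instances should run, given per-instance utilization.

    `utilizations` holds one value in [0, 1] per running instance, as
    reported by the latest heartbeat (hypothetical input format).
    """
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    if not utilizations:
        return min_instances
    # Total load expressed in units of "one fully busy instance".
    total_load = sum(utilizations)
    # Enough instances so that each one sits at or below the target.
    needed = math.ceil(total_load / target_utilization)
    return max(min_instances, min(max_instances, needed))


overloaded = [0.9, 0.95, 0.85]   # scale out
idle = [0.1, 0.1, 0.1, 0.1]      # scale in
print(desired_instances(overloaded), desired_instances(idle))
```

A real controller would probably also smooth the measurements and add a cooldown between scaling decisions to avoid oscillation; both are left out here to keep the sketch short.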

Main strength

  • The studies of middlebox deployments across a range of enterprise scenarios are convincing! Real-world investigations and interviews identify exactly what the problems are in enterprises (e.g. high capital expenses and operating costs, complex management requirements, failure and overload scenarios). These studies provide solid motivation for the problems that this paper tries to address.

  • The paper also offers a valuable, systematic exploration of the requirements and design space for outsourcing middleboxes, including discussions of redirection, provider footprint, and location-dependent services.

  • The paper presents comprehensive design, implementation, and evaluation sections for the proposed APLOMB architecture.

  • Additionally, the paper discusses future hybrid enterprise/cloud architectures and the related security challenges, which makes for an interesting read.

Key weakness

  • Using APLOMB raises the same security questions that have challenged cloud computing: a third-party cloud provider must be given access to unencrypted data in order to process traffic flows. Companies with restrictive security policies might not be able to use this kind of approach.

  • APLOMB reduces the cost of middlebox infrastructure, but it may increase bandwidth costs, since tunneling traffic to a cloud provider means paying for bandwidth twice: once at the enterprise network's access link and once at the cloud provider. Current pricing strategies, especially volume-based pricing, are not well suited to APLOMB, particularly for high-volume users.
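
A back-of-the-envelope sketch of the "paying twice" concern follows. It assumes, purely for illustration, that both the access link and the cloud provider charge per GB, and that compression (in the spirit of APLOMB+) shrinks the tunneled volume by a fixed ratio. The function name, the parameters, and the prices are hypothetical and not taken from the paper or from any provider's price list.

```python
def redirected_bandwidth_cost(
    volume_gb: float,
    access_price_per_gb: float,
    cloud_price_per_gb: float,
    compression_ratio: float = 1.0,
) -> float:
    """Cost of tunneling `volume_gb` when it is billed at both ends.

    `compression_ratio` is tunneled size / original size, so 1.0 means no
    compression and 0.5 means the tunnel carries half the bytes.
    """
    if volume_gb < 0:
        raise ValueError("volume_gb must be non-negative")
    if not 0 < compression_ratio <= 1:
        raise ValueError("compression_ratio must be in (0, 1]")
    tunneled_gb = volume_gb * compression_ratio
    # The same tunneled bytes are charged once at the access link and
    # once at the cloud provider.
    return tunneled_gb * (access_price_per_gb + cloud_price_per_gb)


# Hypothetical prices in dollars per GB, chosen only to show the shape.
plain = redirected_bandwidth_cost(1000, 0.05, 0.10)
compressed = redirected_bandwidth_cost(1000, 0.05, 0.10, compression_ratio=0.5)
print(round(plain, 2), round(compressed, 2))
```

Under volume-based pricing the cost grows linearly with traffic, which is why high-volume users are hit hardest and why compression on the tunnel matters.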

When is it a good idea to run network functions in the cloud?

  • When it's good

    • Maybe small- to medium-sized enterprises that lack a systematic management and monitoring infrastructure, or that would need to spend a lot to build one (i.e. upfront capital expenses, operational expenses). The cloud eases these burdens by providing a pay-per-use model and reducing the complexity of managing middleboxes and dealing with failures.

  • When it's not good

    • Enterprises with security restrictions that are not suited for cloud computing in general

    • For enterprises that need to support applications with very tight latency SLOs, it might not be a good idea to route traffic to/from the cloud, as this might induce latency penalties

    • For enterprises with high-volume users, transferring bulk data to and from the cloud can be very expensive

    • For very big enterprises, the upfront capital investment in equipment and the operating costs will be amortized, and they may be able to develop an application-specific, high-performance appliance architecture that is cost-effective without going to the cloud

What's the role of middleboxes in the future of networking?

  • Are they still required as a separate entity given emerging technologies like SDN, NFV, microservices, data centers, among others?

  • Maybe SDN, NFV, and/or microservices are indeed changing the role of middleboxes: rather than being a separate entity, they are becoming integrated with the underlying infrastructure built with these technologies.

  • We can use SDN or NFV with a centralized controller to provide and manage the same network services that traditional middleboxes used to provide. Decoupling the data and control planes provides greater flexibility than traditional middleboxes, which are tightly coupled with the underlying hardware architecture.

  • Not so sure about microservices; maybe some functionality can be built directly without using separate middleboxes. For data centers that have adopted cloud computing, network functions like load balancing can be handled by a software-defined load balancer that is integrated with the hypervisor or container platform.

  • Middleboxes as a separate entity can still be useful in some cases; for example, I can imagine sets of specifically tuned middleboxes deployed by enterprises for performance guarantees. The traditional roles of middleboxes also remain useful in the context of firewalls and intrusion detection.

Other comments / thoughts

  • The pricing model discussions are particularly interesting; I'd like to see more of that during in-class discussions

Enjoy?

  • Yes, in general a well-structured, well-motivated, and clearly presented paper
