Rethinking Networking Abstractions for Cloud Tenants

  • 88% use two or more cloud providers

  • 92% use both public and private cloud deployments

  • Architecture: workloads span multiple regions within a cloud and multiple clouds

    • Individual virtual networks

      • Addresses? ACLs? routes?

    • Connectivity in / out

      • Internet? NAT? VPN?

    • Connect multiple virtual networks

      • Across clouds or across cloud regions

      • Virtual network peering, ...

    • Dedicated connections

      • Availability and consistent performance: reserve a link between a cloud data center and an internet exchange point

    • Appliances

      • Load balancers, firewalls

  • Private data centers

    • Physical network boxes: routers, firewalls, and load balancers

      • Managing this is difficult --> move to the cloud

    • Azure (similar abstraction): user-defined routers, firewalls, load balancers

    • AWS: learned the ins and outs of these kinds of boxes

  • We are not seeing higher-level abstractions (tenants still deal with the same low-level components as when managing their own infrastructure)

  • Complex planning

  • Complex configuration

Current solutions:

  • Multi-cloud solutions: seek to provide one management plane over this mess

    • This doesn't solve the underlying complexity; it hands it off to a shim layer instead

Proposal:

  • Eliminate the tenant networking layer altogether!

  • Tenant goals: connectivity, availability, security, QoS

    • Provide a declarative API that lets tenants specify these goals on a per-endpoint basis, abstracting away the networking details

  • Key idea: "publicly routable but default-off"

    • Routability vs. Reachability

    • Endpoint: publicly routable, default-off (traffic destined for that address is dropped by the cloud provider unless specified otherwise) --> per-endpoint permit list

  • Proposed API
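
To make the routability vs. reachability distinction concrete, here is a minimal sketch of a default-off endpoint with a per-endpoint permit list. It is not the paper's API: the `Endpoint` class and its `permit` and `admits` methods are hypothetical names chosen for illustration, and the addresses come from documentation-only ranges.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network


@dataclass
class Endpoint:
    """A tenant endpoint with a publicly routable address (hypothetical model)."""
    address: str
    # Default-off: an empty permit list means no source is admitted.
    permit_list: list = field(default_factory=list)

    def permit(self, cidr: str) -> None:
        """Add a source prefix to this endpoint's permit list."""
        self.permit_list.append(ip_network(cidr))

    def admits(self, source: str) -> bool:
        """Return True only if the source matches a permit-list entry."""
        src = ip_address(source)
        return any(src in net for net in self.permit_list)


db = Endpoint("203.0.113.10")
print(db.admits("198.51.100.7"))   # False: routable, but not yet reachable
db.permit("198.51.100.0/24")
print(db.admits("198.51.100.7"))   # True: source is on the permit list
print(db.admits("192.0.2.1"))      # False: every other source is still dropped
```

The endpoint's address never changes in this sketch; only the permit list decides whether traffic is delivered, which is the sense in which routability and reachability are decoupled.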

Open questions

  • Simple or simplistic?

    • Feasibility?

      • Is it scalable for a cloud provider to keep a consistent, dynamic per-endpoint permit list for each tenant?

      • Security?

    • Adoption?

      • Proposal can exist in parallel to today's abstraction

      • Simpler, low-risk deployment

        • What tenants and workloads are the most likely early adopters for a new architecture such as we propose?

    • Other question: is this the right abstraction, and what alternative solutions should we be considering?

      • Is it sufficiently high-level to abstract the details away entirely?

Paper: https://sigops.org/s/conferences/hotos/2021/papers/hotos21-s02-mcclure.pdf

Main contribution:

  • Network virtualization as experienced by cloud tenants today is overly complex

  • Propose: free cloud tenants entirely from having to build and operate virtual networks

  • Cloud networking exposed to tenants in a declarative and endpoint-centric manner

    • Associate SLOs with endpoints

    • Natural extension to what cloud providers already offer in compute and storage, and cloud providers can innovate below this interface without developing tenant-layer abstractions for every possible feature
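
As a rough illustration of what "declarative and endpoint-centric" could look like from the tenant's side, the sketch below attaches goals (security, availability, QoS) directly to an endpoint, with no virtual network objects involved. Every name here (`EndpointGoals`, its fields, `validate`) is an assumption made for illustration, and the default values are arbitrary placeholders rather than figures from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndpointGoals:
    """Per-endpoint goals a tenant declares (all field names are hypothetical)."""
    endpoint: str                     # tenant-chosen endpoint name
    permit_from: tuple = ()           # security: allowed sources (default: nobody)
    availability_slo: float = 0.999   # availability target as a fraction (placeholder)
    min_bandwidth_mbps: int = 0       # QoS: requested bandwidth floor


def validate(goals: EndpointGoals) -> list:
    """Return a list of problems; an empty list means the declaration is well-formed."""
    problems = []
    if not 0.0 < goals.availability_slo <= 1.0:
        problems.append("availability_slo must be in (0, 1]")
    if goals.min_bandwidth_mbps < 0:
        problems.append("min_bandwidth_mbps must be non-negative")
    return problems


web = EndpointGoals(
    endpoint="web-frontend",
    permit_from=("198.51.100.0/24",),
    availability_slo=0.9999,
    min_bandwidth_mbps=100,
)
print(validate(web))  # []
```

The tenant states what each endpoint needs; how the provider meets those goals stays below the interface, which is where the notes suggest providers are free to innovate.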
