IncBricks: Toward In-Network Computation with an In-Network Cache

Emergence of programmable network devices + increasing data traffic of data centers --> in-network computation
Offload compute operations to intermediate network devices
- Serve network request with low latency
- Reduce datacenter traffic + reduce congestion
- Save energy
Challenge:
- No general compute capabilities
- Commodity datacenter networks are complex
Key: in-network caching fabric with basic computing primitives

Goal: reduce traffic, lower communication latency, reduce communication overheads
SDN
- programmable switches (application-specific header parsing, customized match-action rules, light-weight programmable forwarding plane)
- network accelerators: low-power multicore processors and fast traffic managers
INC: offload a set of compute operations from end-servers onto programmable network devices (switches, network accelerators)
Challenges
- Limited compute power and little storage for DC computation
- Keeping computation and state coherent across networking elements is complex
- INC requires simple and general computing abstraction to be integrated with application logic
Propose: in-network caching fabric with basic computing primitives based on programmable network devices
- IncBox: hybrid switch/network accelerator architecture, offload application-level operations
- IncCache: in-network cache for KV store

Hierarchical topology
- ToR: 10 Gbps
- aggregation: 10-40 Gbps
- core switches: 100 Gbps
Multiple paths in the core of the network by adding redundant switches
Traditional Ethernet switches
- Packet: forward based on forwarding database (FDB)
  - Data plane: process network packets at line rate
    Ingress / Egress controller: map transmitted and received packets between their wire-level representation and a unified, structured internal format
    Packet memory: buffer in-flight packets across all ingress ports
    Switching module: makes packet forwarding decisions based on the forwarding database
  - Control plane: configure forwarding policies
    low-power processor for adding and removing forwarding rules
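
The data-plane / control-plane split above can be sketched as a toy model. This is only an illustration, not code from the paper: the class name `FdbSwitch`, the port numbers, and the flood-on-miss behaviour (standard for Ethernet switches, but not stated in these notes) are assumptions.

```python
from typing import Dict, List


class FdbSwitch:
    """Toy Ethernet switch: an FDB maps destination MAC -> egress port."""

    def __init__(self, ports: List[int]) -> None:
        self.ports = ports
        self.fdb: Dict[str, int] = {}

    # Control plane: add and remove forwarding rules.
    def add_rule(self, mac: str, port: int) -> None:
        self.fdb[mac] = port

    def remove_rule(self, mac: str) -> None:
        self.fdb.pop(mac, None)

    # Data plane: decide the egress port(s) for one packet.
    def forward(self, dst_mac: str, in_port: int) -> List[int]:
        if dst_mac in self.fdb:
            return [self.fdb[dst_mac]]
        # Assumed behaviour on an FDB miss: flood to every other port.
        return [p for p in self.ports if p != in_port]
```

Usage: after `sw.add_rule("aa:bb", 3)`, `sw.forward("aa:bb", 1)` returns only port 3; before the rule exists, the packet is flooded to all ports except the ingress port.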

Programmable switch and network accelerator

Programmable switches: reconfigurability in forwarding plane
- Programmable parser, match memory, action engine
  - Packet formats customizable
  - Simple operations based on headers of incoming packets
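
A rough software analogy of the parser / match memory / action engine pipeline is sketched below. The 3-byte header layout, the field names `type` and `counter`, and the `MatchActionTable` class are hypothetical; real programmable switches express this in hardware tables, not Python.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Headers = Dict[str, int]
Action = Callable[[Headers], None]


def parse(packet: bytes) -> Headers:
    # Hypothetical custom format: 1-byte type, 2-byte big-endian counter.
    return {"type": packet[0], "counter": int.from_bytes(packet[1:3], "big")}


@dataclass
class MatchActionTable:
    # Each rule is (header field name, value to match, action to run).
    rules: List[Tuple[str, int, Action]] = field(default_factory=list)

    def add_rule(self, field_name: str, value: int, action: Action) -> None:
        self.rules.append((field_name, value, action))

    def apply(self, headers: Headers) -> bool:
        # Run the first matching rule's action; report whether any rule matched.
        for field_name, value, action in self.rules:
            if headers.get(field_name) == value:
                action(headers)
                return True
        return False
```

The actions are deliberately limited to simple header updates (for example, incrementing a counter), which mirrors the restriction that such switches operate on headers rather than payloads.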
Network accelerators
- Traffic manager: fast DMA between TX/RX ports and internal memory
- Packet scheduler: maintains incoming packet order and distributes packets to cores
- Low-power multicore processor: payload modifications
- Con: only a few interface ports, limiting processing bandwidth

Combine two hardware devices

IncBox: hardware unit of a network accelerator co-located with Ethernet switch
- For an INC packet, the switch forwards it to the network accelerator for computation
IncCache: distributed, coherent KV store with computing capabilities --> packet parsing, hashtable lookup, command execution, packet encapsulation
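
The division of labour between the switch and the accelerator, and the four IncCache steps, might look roughly like this. Everything here is a simplified assumption for illustration: packets are plain dictionaries, `INC_MAGIC` is a made-up marker value, and a miss is only flagged rather than handled.

```python
from typing import Dict, Optional

Packet = Dict[str, object]
INC_MAGIC = 0x1CB  # hypothetical marker identifying INC packets


def switch_dispatch(pkt: Packet) -> str:
    # The switch only inspects a header field to decide where the packet goes.
    return "accelerator" if pkt.get("magic") == INC_MAGIC else "forward"


class Accelerator:
    def __init__(self) -> None:
        self.table: Dict[str, str] = {}

    def process(self, pkt: Packet) -> Packet:
        # 1. Packet parsing
        cmd, key = str(pkt["command"]), str(pkt["key"])
        # 2. Hash table lookup
        cached: Optional[str] = self.table.get(key)
        # 3. Command execution
        if cmd == "GET":
            result = cached
        elif cmd == "SET":
            result = str(pkt["value"])
            self.table[key] = result
        elif cmd == "DEL":
            self.table.pop(key, None)
            result = None
        else:
            raise ValueError(f"unknown command: {cmd}")
        # 4. Packet encapsulation
        return {"magic": INC_MAGIC, "command": cmd + "_REPLY",
                "key": key, "value": result, "hit": result is not None}
```

A real deployment would also have to decide what happens on a miss; this sketch only reports it through the `hit` flag.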

Support three things
- F1: Parse in-transit network packets and extract some fields for the IncBricks logic
- F2: Modify both header and payload and forward the packet based on the hash of the key
- F3: Cache key / value data and potentially execute basic operations on the cached value
- Should provide: P1 high throughput and P2 low latency
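
As a small illustration of F2, forwarding on the hash of the key can be sketched as picking a next hop from a list. The notes do not say which hash function is used or how hops are chosen, so CRC32 and the modulo selection below are stand-ins, and the hop names are hypothetical.

```python
import zlib
from typing import List


def key_hash(key: bytes) -> int:
    # CRC32 is an assumed stand-in for whatever hash the system uses.
    return zlib.crc32(key) & 0xFFFFFFFF


def pick_next_hop(key: bytes, next_hops: List[str]) -> str:
    # The same key always maps to the same hop, so requests for one key
    # consistently reach the same cache.
    if not next_hops:
        raise ValueError("no next hops available")
    return next_hops[key_hash(key) % len(next_hops)]
```

The useful property is determinism: every request carrying the same key takes the same path, regardless of which client sent it.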
Programmable switches:
- can only support simple operations (read, write, add, subtract, shift on counters)
- packet buffer size is on the order of a few tens of MB, mostly used to store incoming packet traffic, leaving little space for caching
- Can meet F1 and F2, but hard to satisfy F3 and P1, P2 in terms of payload-related operations
Network accelerators to satisfy rest of the requirements
- Traffic manager can serve packet data faster than kernel bypass techniques
  - Kernel bypass: eliminates the overheads of in-kernel network stacks by moving protocol processing to user space
    E.g., dedicate the NIC to an application, or keep the NIC kernel-managed while letting applications map NIC queues into their address space
- Multi-core processors can saturate 40-100 Gbps bandwidth easily
- Support multi-GB of memory, which can be used for caching

Able to
- Cache data on both IncBox units and end-servers
- Keep the cache coherent using a directory-based cache coherence protocol
- Handle scenarios related to multipath routing and failures
- Provide basic compute primitives
Packet format: ID, magic field, command, hash, application payload
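
The packet format above could be serialized as a fixed header followed by the payload. The field order follows the list in these notes, but the field widths (4-byte ID, 2-byte magic, 1-byte command, 4-byte hash), the `MAGIC` value, and all names are assumptions chosen for illustration.

```python
import struct
from typing import NamedTuple

# Network byte order, no padding: ID (4 B), magic (2 B), command (1 B), hash (4 B).
HEADER = struct.Struct("!IHBI")
MAGIC = 0x1CB0  # hypothetical value marking a packet as INC traffic


class IncPacket(NamedTuple):
    request_id: int
    magic: int
    command: int
    key_hash: int
    payload: bytes


def encode(pkt: IncPacket) -> bytes:
    header = HEADER.pack(pkt.request_id, pkt.magic, pkt.command, pkt.key_hash)
    return header + pkt.payload


def decode(data: bytes) -> IncPacket:
    if len(data) < HEADER.size:
        raise ValueError("packet shorter than header")
    request_id, magic, command, key_hash = HEADER.unpack_from(data)
    return IncPacket(request_id, magic, command, key_hash, data[HEADER.size:])


def is_inc(data: bytes) -> bool:
    # A device can check the magic field to tell INC packets from other traffic.
    return len(data) >= HEADER.size and decode(data).magic == MAGIC
```

Keeping the header fixed-size means a device can read the magic field, command, and hash at known offsets without touching the application payload.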
Hash table based data cache
- On both network accelerators and endhost servers
  - network accelerator: fixed size lock-free
  - endhost servers: extensible hash table, lock-free
  - Cache coherence protocol: keep data consistent without incurring high overhead
    Hierarchical directory-based cache coherence protocol
    Take advantage of the structured network topology by using a hierarchical distributed directory mechanism
    Decouple system interface and program interface to provide flexible programmability
    Support sequential consistency for high performance SET/GET/DEL requests
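
To make the directory idea concrete, here is a minimal sketch of a flat, single-level directory with write-invalidate. It is not the paper's hierarchical protocol: the names `HomeServer` and `Cache` are hypothetical, there is only one directory level, and concurrency, failures, and multipath routing are ignored.

```python
from typing import Dict, Optional, Set


class Cache:
    """Stands in for an in-network cache on one IncBox."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[str, str] = {}


class HomeServer:
    """Holds the authoritative data and a directory of which caches have copies."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}
        self.sharers: Dict[str, Set[str]] = {}  # directory: key -> cache names
        self.caches: Dict[str, Cache] = {}

    def attach(self, cache: Cache) -> None:
        self.caches[cache.name] = cache

    def get(self, key: str, via: Cache) -> Optional[str]:
        if key in via.data:  # hit in the in-network cache
            return via.data[key]
        value = self.store.get(key)
        if value is not None:  # fill the cache and record the sharer
            via.data[key] = value
            self.sharers.setdefault(key, set()).add(via.name)
        return value

    def set(self, key: str, value: str) -> None:
        self._invalidate(key)  # drop stale copies before the new value is visible
        self.store[key] = value

    def delete(self, key: str) -> None:
        self._invalidate(key)
        self.store.pop(key, None)

    def _invalidate(self, key: str) -> None:
        for name in self.sharers.pop(key, set()):
            self.caches[name].data.pop(key, None)
```

In a hierarchical variant, each level would presumably track only which of its children hold a copy, so invalidations follow the tree instead of fanning out from a single directory.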

