Offloading distributed applications onto SmartNICs using iPipe

https://homes.cs.washington.edu/~arvind/papers/ipipe.pdf

  • Accelerate general purpose distributed applications?

  • DMA: reading from and writing to host memory

  • Host communication

    • Use DMA

  • Multi-core processors, low power

    • Extent of compute is limited (to offload)

  • Network applications

    • RDMA, DPDK

  • Packets arrive at the TX/RX ports

  • Traffic manager

    • Abstractions between the ports and actual ports

  • NIC cores handle all traffic on both the send and receive paths

    • Might not have computation to do

  • Tight integration of computing and communication

  • Computation requires additional overhead

  • Shows how many cores are needed to fully utilize the available bandwidth

    • Max bandwidth = 10 Gbps

  • 256B packets: the available cores are not sufficient to process that many packets

  • Larger packets: the link bandwidth is saturated quickly
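A rough back-of-the-envelope sketch of why packet size drives the number of cores needed to reach line rate. The 20-byte wire overhead (preamble plus inter-frame gap) is standard Ethernet framing; the per-packet processing cost of 2 µs is a made-up illustrative number, not a measurement from the paper, and the function names are hypothetical.

```python
import math

# Preamble (8 B) + inter-frame gap (12 B) occupy the wire for every frame.
# Assumption: packet_bytes already includes Ethernet headers and FCS.
ETHERNET_OVERHEAD_BYTES = 20


def packets_per_second(link_gbps, packet_bytes):
    """Packet rate at which a link of link_gbps is fully utilized."""
    bits_on_wire = (packet_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_on_wire


def cores_needed(link_gbps, packet_bytes, per_packet_us):
    """Cores required to keep up with line rate.

    per_packet_us is a hypothetical per-packet cost on one NIC core
    (forwarding plus any offloaded computation), in microseconds.
    """
    pps = packets_per_second(link_gbps, packet_bytes)
    per_core_pps = 1e6 / per_packet_us
    return math.ceil(pps / per_core_pps)


if __name__ == "__main__":
    for size in (64, 256, 512, 1024, 1500):
        print(size, cores_needed(10, size, per_packet_us=2.0))
```

With a fixed per-packet cost, small packets mean a much higher packet rate at the same bandwidth, so more cores are needed; with large packets a couple of cores already keep up with the link.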

Take-away

  • Quantifies the default forwarding tax of SmartNICs

  • Depends on the packet size and the workload

Workload?

  • No processing

  • Actor: logic whose workload has a lot of variance. That type of actor is moved to the DRR cores, while actors with a more uniform distribution stay on the FCFS cores.

    • Tail latency? How do they measure that?

      • Over a window, so it might not be absolute, just relative.
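A minimal sketch of the idea in the last few bullets: track each actor's latency over a sliding window and use the windowed tail to decide between the FCFS and DRR core groups. This is not the paper's algorithm. The class and function names, the window size, the percentile, and the threshold are all assumptions made for illustration.

```python
from collections import deque


class LatencyWindow:
    """Keeps only the most recent `size` latency samples for one actor."""

    def __init__(self, size=1000):
        self.samples = deque(maxlen=size)  # old samples fall out automatically

    def record(self, latency_us):
        self.samples.append(latency_us)

    def percentile(self, p):
        """Return the p-th percentile (integer p) of the current window."""
        if not self.samples:
            return 0.0
        ordered = sorted(self.samples)
        # Integer arithmetic avoids floating-point rounding in the index.
        index = min(len(ordered) - 1, (p * len(ordered)) // 100)
        return ordered[index]


def choose_core_group(window, tail_threshold_us, p=99):
    """Hypothetical placement rule: a high windowed tail goes to DRR."""
    return "DRR" if window.percentile(p) > tail_threshold_us else "FCFS"


if __name__ == "__main__":
    uniform = LatencyWindow()
    bursty = LatencyWindow()
    for i in range(100):
        uniform.record(10)
        bursty.record(500 if i % 20 == 0 else 10)  # occasional slow request
    print(choose_core_group(uniform, tail_threshold_us=100))
    print(choose_core_group(bursty, tail_threshold_us=100))
```

Because the window only holds recent samples, the reported tail describes recent behavior rather than the whole run, which is why the measurement is relative rather than absolute.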
