# Dagger: Efficient and Fast RPCs in Cloud Microservices in Near-Memory Reconfigurable NICs

### Presentation

* Trends in cloud computing
  * Monoliths: tightly coupled application logic in a single statically or dynamically linked library
  * Shift towards microservices
    * Loosely coupled application logic split into many small, independent applications
  * Shift towards serverless
    * Fine application granularity
    * Fine lifetime granularity
* Cloud applications today are interactive
  * Frequent interaction with large sets of users
  * Strict performance requirements, expressed as SLOs
    * Low tail latency under high load
    * Performance predictability
* Focus: improving the communication stack in microservices
  * Microservices communicate over RPCs
  * RPC requests in microservices are small, and their sizes vary across tiers
  * Takeaways
    * Per-request communication overheads are large
    * Cannot tune communication stacks for small messages only
    * Need an adaptive stack
  * RPC stacks run on the same CPUs as highly concurrent applications
    * CPUs are already under high pressure from applications
    * Intensive traffic of small messages
* Dagger: a HW/SW co-designed end-host RPC stack
  * Design principles
    * **Hardware offload**
      * Existing techniques to improve the efficiency of cloud networking
        * Kernel bypass: IX, eRPC, mTCP, and many others
          * Removes per-packet kernel overheads and tightly couples networking stacks with applications, but still runs everything in SW
        * RDMA systems
          * Offload networking stacks to hardware
          * But:
            * only provide low-level abstractions; the RPC layer still runs in SW
            * require specialized adapters
      * Dagger's approach: a hardware NIC for the end-host communication stack, from the L1 (PHY) layer all the way up to the application (RPC) layer
        * Completely frees the CPU from any work related to data exchange
    * **Reconfigurability**
      * Networking protocols, load balancers, threading, data representation, and data manipulation all vary across applications; the HW should be reconfigurable as well
      * Dagger is based on an FPGA!
        * Configurable transport: UDP, TCP, mTCP, Homa, Tonic
        * Configurable load balancer / flow controller: static, round-robin, random, application-specific
        * Configurable host-NIC interface: PCIe doorbells, PCIe MMIOs, coherency-based
        * Configurable threading model: connection/thread/queue/flow mapping, number of NIC flows / queues
    * **Tight coupling**
      * Dagger is based on a cache-coherent FPGA tightly coupled with the host CPU
        * Inspired by soNUMA and a series of RDMA studies
        * The FPGA acts as a NUMA node
          * No DMAs are required to exchange data between NUMA nodes
          * No explicit MMIO requests
          * Minimal software overhead
          * NUMA interconnects have lower latency than PCIe
      * Existing SmartNICs are based on PCIe, which introduces overheads
        * Doorbell scheme
          * Multiple PCIe roundtrips
          * Expensive and CPU-inefficient rings based on MMIOs
        * Existing optimizations: combined descriptors and packets, packet writes with MMIOs, doorbell batching, ... (these reduce but fail to eliminate the overheads)
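
To make the doorbell point concrete, here is a minimal toy model in Python. It is only a sketch under loud assumptions: `ToyDoorbellRing`, `post`, `flush`, and `doorbells_for` are hypothetical names invented for illustration, the model only counts doorbell (MMIO) writes, and it does not reflect real PCIe timing, any specific NIC, or Dagger's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDoorbellRing:
    """Toy TX descriptor ring with a doorbell register (hypothetical model).

    It counts events only; no PCIe latency or DMA behavior is simulated.
    """
    batch_size: int = 1                          # descriptors per doorbell write
    ring: list = field(default_factory=list)     # descriptors visible to the NIC
    pending: list = field(default_factory=list)  # posted, doorbell not yet rung
    doorbell_writes: int = 0                     # MMIO writes issued by the CPU

    def post(self, payload: bytes) -> None:
        """Queue a descriptor; ring the doorbell once the batch is full."""
        self.pending.append(payload)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Publish all pending descriptors with a single doorbell write."""
        if not self.pending:
            return
        self.ring.extend(self.pending)
        self.pending.clear()
        self.doorbell_writes += 1


def doorbells_for(num_rpcs: int, batch_size: int) -> int:
    """Return how many doorbell writes are needed to send num_rpcs messages."""
    nic = ToyDoorbellRing(batch_size=batch_size)
    for i in range(num_rpcs):
        nic.post(f"rpc-{i}".encode())
    nic.flush()  # push out any partially filled batch
    return nic.doorbell_writes
```

In this model, `doorbells_for(8, 1)` issues one doorbell write per RPC, while `doorbells_for(8, 4)` issues one per batch of four. Batching lowers the count but never removes it, and the first descriptor of a batch stays invisible to the NIC until the batch fills or is flushed, which matters when RPCs are small and latency-sensitive.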

![](/files/HU1t9IenXN43syLE0PoK)
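
The configurable load balancer / flow controller listed under Reconfigurability (static, round-robin, random, application-specific) can be pictured as a swappable steering policy. The Python sketch below is an illustration only: Dagger implements this in FPGA hardware, and every name here (`ToyRpcNic`, `steer`, `set_policy`, the `make_*` helpers, the `user_id` field) is a hypothetical stand-in, not part of Dagger's API.

```python
import itertools
import random
import zlib
from typing import Callable

# A policy maps (request, number of NIC flows) to a flow index.
Policy = Callable[[dict, int], int]


def make_static(flow: int) -> Policy:
    """Always steer to the same flow."""
    return lambda request, num_flows: flow % num_flows


def make_round_robin() -> Policy:
    """Cycle through the flows in order."""
    counter = itertools.count()
    return lambda request, num_flows: next(counter) % num_flows


def make_random(seed=None) -> Policy:
    """Pick a flow uniformly at random."""
    rng = random.Random(seed)
    return lambda request, num_flows: rng.randrange(num_flows)


def make_key_hash(key_field: str) -> Policy:
    """Application-specific example: the same key always maps to the same flow."""
    def policy(request: dict, num_flows: int) -> int:
        key = str(request[key_field]).encode()
        return zlib.crc32(key) % num_flows
    return policy


class ToyRpcNic:
    """Software stand-in for a NIC whose steering policy can be swapped."""

    def __init__(self, num_flows: int, policy: Policy) -> None:
        self.num_flows = num_flows
        self.policy = policy

    def set_policy(self, policy: Policy) -> None:
        """Reconfigure the load balancer without touching the rest of the stack."""
        self.policy = policy

    def steer(self, request: dict) -> int:
        """Return the flow index the request is dispatched to."""
        return self.policy(request, self.num_flows)
```

A possible usage, again purely illustrative: start with `ToyRpcNic(4, make_round_robin())` for a stateless tier, then call `set_policy(make_key_hash("user_id"))` for a tier that benefits from key affinity. The point of the sketch is that the policy is a separate, replaceable component, which is the property the notes attribute to an FPGA-based design.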

