# Xenic: SmartNIC-accelerated distributed transactions

### Presentation

* Distributed transactions in the datacenter
  * Target: distributed ACID transactions for a replicated, in-memory database
  * Common approach: optimistic concurrency control + replication
  * Viability depends on efficient remote operations --> hardware acceleration
* Recent work applies RDMA
  * One-sided read/write primitives are high-performance, but restrict the design
    * They constrain data structures and add protocol overheads
    * ![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MVORxAomcgtzVVUqmws%2Fuploads%2F0GZlzafhaXhcIQoUfbXJ%2Fimage.png?alt=media\&token=cb162401-8e83-4a25-846f-cc68170aca48)
    * ![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MVORxAomcgtzVVUqmws%2Fuploads%2FmpFS3BmsZGvYp2b6LR0j%2Fimage.png?alt=media\&token=c30a050d-370f-4fe1-a8e0-f17a83d64a68)
  * Two-sided RPCs are costly
    * They add latency overhead and processing costs
  * FaRM: one-sided RDMA
  * FaSST: two-sided RPCs
  * DrTM+H: uses both
  * Ongoing debate over how to apply RDMA: every choice involves trade-offs
* On-path SmartNICs: another option for hardware acceleration
  * Programmable remote operations, without host processing
  * Cost-effective compute: \~30% of NIC die area, 25 W line-rate processing
* SmartNIC opportunities
  * Flexible CPU-bypass remote operations
  * Latency savings via stateful NIC operations, efficient PCIe DMA
  * Efficient NIC-to-NIC communication
  * But
    * Software packet pipeline --> latency overhead
    * Limited NIC resources
* Xenic
  * Distributed transactions accelerated with on-path SmartNICs
  * Key ideas
    * Co-designed data store, spread across NIC + host DRAM
      * Minimize lookup overhead, utilizing the NIC's on-board memory
    * SmartNIC function shipping
      * Offload transaction logic to avoid PCIe crossings
    * Multi-hop OCC protocols
      * Reduce communication with optimized message patterns
    * Stateful, asynchronous SmartNIC operation framework
      * Exploit the SmartNIC's hardware interfaces
* Xenic: Robinhood Data Store
  * Host DRAM contains all objects; SmartNIC caches objects and lookup hints
  * Critical path accesses: NIC memory hit or DMA read, DMA log write
    * Lookup hints limit DMA cost for cache misses
      * Cache miss: bounded DMA read
      * Cache hit: NIC DRAM
    * OCC + pinning ensure NIC/host consistency
    * ![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MVORxAomcgtzVVUqmws%2Fuploads%2FHtFx4IrK20TT1s6aZ4sJ%2Fimage.png?alt=media\&token=078fb346-014a-4f32-9e52-2de76bcc566c)
* Xenic: SmartNIC function shipping
  * Provides SmartNIC cores as a function shipping target
  * Shipping execution can reduce overhead, depending on application-level computation and state requirements
  * Saves coordinator PCIe crossings
* Xenic: multi-hop OCC protocols
  * Ships execution to remote SmartNICs
  * Multi-hop NIC-to-NIC communication increases network efficiency
    * ![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MVORxAomcgtzVVUqmws%2Fuploads%2FJlWw1KqU8uf6qWtcASLm%2Fimage.png?alt=media\&token=39d1005f-5133-4aba-bb26-da01dc1a812c)
* Evaluation
  * Robinhood + NIC lookup hints effectively reduce cost
  * SmartNIC increases DMA lookup efficiency, even for cache misses
  * FaRM and DrTM+H incur an end-to-end bandwidth/latency cost
  * Better latency & throughput than RPC, RDMA, hybrid designs
  * Also measured in the paper
    * Cumulative core savings
    * Full TPC-C, Retwis, Smallbank
* Summary: high-performance, CPU-efficient distributed transactions
  * Leveraging on-path SmartNICs:
    * Avoids RDMA compromises
    * Provides a new, remote access-optimized data store
    * Selectively offloads transaction logic
    * Applies multi-hop communication patterns
    * Delivers >2x throughput over 100 Gbps RDMA, latency savings relative to RPCs
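
The Robinhood data store bullets above say that lookup hints bound the DMA cost of a cache miss. The notes do not spell out the mechanism, so the following is only a minimal sketch of textbook Robin Hood hashing (open addressing with linear probing), not Xenic's actual data store. The class name `RobinHoodTable` and the field `max_disp` are hypothetical; `max_disp` stands in for a lookup hint by capping how many consecutive slots a lookup ever has to read.

```python
from typing import Any, List, Optional, Tuple

# Illustrative only: a generic Robin Hood hash table, not Xenic's implementation.
Slot = Optional[Tuple[Any, Any, int]]  # (key, value, displacement from home slot)


class RobinHoodTable:
    def __init__(self, capacity: int = 16) -> None:
        self.capacity = capacity
        self.slots: List[Slot] = [None] * capacity
        self.size = 0
        self.max_disp = 0  # hypothetical "hint": largest displacement of any entry

    def _home(self, key: Any) -> int:
        return hash(key) % self.capacity

    def _find(self, key: Any) -> Optional[int]:
        """Return the slot index holding key, reading at most max_disp + 1 slots."""
        pos = self._home(key)
        for dist in range(self.max_disp + 1):
            resident = self.slots[pos]
            # An empty slot, or a resident closer to its home than we are to ours,
            # means the key cannot be stored any further along the probe sequence.
            if resident is None or resident[2] < dist:
                return None
            if resident[0] == key:
                return pos
            pos = (pos + 1) % self.capacity
        return None

    def get(self, key: Any) -> Any:
        idx = self._find(key)
        return None if idx is None else self.slots[idx][1]

    def put(self, key: Any, value: Any) -> None:
        idx = self._find(key)
        if idx is not None:  # overwrite an existing key in place
            k, _, d = self.slots[idx]
            self.slots[idx] = (k, value, d)
            return
        if self.size == self.capacity:
            raise RuntimeError("table full")
        entry = (key, value, 0)
        pos = self._home(key)
        while True:
            resident = self.slots[pos]
            if resident is None:
                self.slots[pos] = entry
                self.max_disp = max(self.max_disp, entry[2])
                self.size += 1
                return
            if resident[2] < entry[2]:
                # Robin Hood step: the entry that is further from home takes the
                # slot, and the displaced resident continues probing instead.
                self.slots[pos] = entry
                self.max_disp = max(self.max_disp, entry[2])
                entry = resident
            entry = (entry[0], entry[1], entry[2] + 1)
            pos = (pos + 1) % self.capacity


table = RobinHoodTable(capacity=16)
table.put(1, "b")
table.put(0, "a")
table.put(16, "c")  # collides with key 0 and displaces key 1 by one slot
```

In this sketch, swapping on insertion evens out displacements across entries, so a reader that knows `max_disp` can fetch a fixed-size window of slots in one access instead of chasing an unbounded probe chain. How Xenic actually encodes and caches its hints on the NIC is not covered by these notes.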

