# Offloading Distributed Applications onto SmartNICs using iPipe

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCdD01hZQb5ZwmKy-W%2F-MXCdhr_LP4TzKO6t9PQ%2Fimage.png?alt=media\&token=4c13938e-6d7f-40b4-bf7d-64a9033a6cff)

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCdD01hZQb5ZwmKy-W%2F-MXCdkbfb-8gg0UZ-3DQ%2Fimage.png?alt=media\&token=8a69ed93-ba78-433f-aed7-05b168223210)

* Can SmartNICs accelerate general-purpose distributed applications?

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCdD01hZQb5ZwmKy-W%2F-MXCeFXh7ZBRRbiVCIkk%2Fimage.png?alt=media\&token=d5f72951-21d0-4515-ae8a-4fb3b510b576)

* DMA: reading from and writing to host memory
* Host communication
  * Uses DMA

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCdD01hZQb5ZwmKy-W%2F-MXCeeJFBrVNMznwXZQn%2Fimage.png?alt=media\&token=1eaaa051-496b-4c20-9438-0cb5b0831580)

* Multi-core, low-power processors
  * The amount of computation that can be offloaded is limited
* Network applications
  * RDMA, DPDK

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCdD01hZQb5ZwmKy-W%2F-MXCf33b0cGC6C34XnTx%2Fimage.png?alt=media\&token=2e53190a-a2d7-4fc6-89f6-46a63fa1377e)

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCdD01hZQb5ZwmKy-W%2F-MXCf5dxY0A_PNVAZhyd%2Fimage.png?alt=media\&token=4fed14c9-1ef6-489e-90ba-ebc5a2e72c85)

* Packets arrive at the TX/RX ports
* Traffic manager
  * Abstractions between the ports and actual ports
* NIC cores handle all traffic on both the send and receive paths
  * This holds even when a packet needs no computation
* Tight integration of computing and communication

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCdD01hZQb5ZwmKy-W%2F-MXCg76C5OzA9-clW0gM%2Fimage.png?alt=media\&token=696a10e1-2dc2-4671-a54c-313f7ea1cfc3)

* Computation requires additional overhead

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCdD01hZQb5ZwmKy-W%2F-MXCgBfpgX2aA3viOYEr%2Fimage.png?alt=media\&token=c1c64257-6e5e-4f35-ae73-d63551b275fd)

* Shows how many cores are needed to fully utilize the available bandwidth
  * Max bandwidth = 10 Gbps
* 256B packets: the cores are not sufficient to process that many packets
* Larger packets: the link saturates quickly, with fewer cores
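The relationship between packet size and core count can be made concrete with a little arithmetic. The sketch below is not from the paper: it assumes standard Ethernet framing overhead (20 bytes of preamble and inter-frame gap per frame) and a hypothetical fixed per-packet processing cost, `per_packet_ns`, chosen only for illustration.

```python
import math

LINK_BPS = 10e9            # 10 Gbps link, as in the notes above
WIRE_OVERHEAD_BYTES = 20   # preamble + SFD (8 B) and inter-frame gap (12 B)


def packets_per_second(frame_bytes, link_bps=LINK_BPS):
    """Line-rate packet arrival rate for a given Ethernet frame size."""
    return link_bps / (8 * (frame_bytes + WIRE_OVERHEAD_BYTES))


def cores_needed(frame_bytes, per_packet_ns, link_bps=LINK_BPS):
    """Cores required to keep up with line rate.

    per_packet_ns is a hypothetical per-packet forwarding cost on one core;
    it is an assumption for illustration, not a measured value.
    """
    pps = packets_per_second(frame_bytes, link_bps)
    return math.ceil(pps * per_packet_ns * 1e-9)


if __name__ == "__main__":
    for size in (64, 256, 1024, 1500):
        mpps = packets_per_second(size) / 1e6
        print(f"{size:>5} B: {mpps:6.2f} Mpps, "
              f"{cores_needed(size, 1000)} core(s) at 1000 ns/packet")
```

Under these assumptions, small packets arrive far more often than large ones, so the same per-packet cost translates into many more cores for small packets and very few for large ones.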

Take-away

* Quantifies the default forwarding tax of SmartNICs
* The tax depends on the packet size of the workload

Workload?

* No processing

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCdD01hZQb5ZwmKy-W%2F-MXCirpDtHvh9Y87zT45%2Fimage.png?alt=media\&token=000bde66-f411-4a1d-9baf-c72339d202e0)

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCdD01hZQb5ZwmKy-W%2F-MXCiv6_ymj8qs1rBAhu%2Fimage.png?alt=media\&token=cb1b8a84-1fee-430d-a774-7f4984042297)

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCdD01hZQb5ZwmKy-W%2F-MXCj8IqP_X6bCTjaWS8%2Fimage.png?alt=media\&token=50da7208-622c-4661-938f-4f496057bf7a)

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCdD01hZQb5ZwmKy-W%2F-MXCjHVAuOZd6DYwKXTw%2Fimage.png?alt=media\&token=a14589f8-62a8-42a0-823f-26e140cb377b)

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCdD01hZQb5ZwmKy-W%2F-MXCjZLicuazkiVZ5I6Z%2Fimage.png?alt=media\&token=2e18362d-e7ae-4f22-9f2d-089ead8a9511)

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCdD01hZQb5ZwmKy-W%2F-MXCjnQNeGwpAcREmjbk%2Fimage.png?alt=media\&token=35a3ad0f-a0ed-4a11-8233-14f27f19dc51)

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCdD01hZQb5ZwmKy-W%2F-MXCk4x6sBpM1vOQaxvP%2Fimage.png?alt=media\&token=cfd3e65c-8547-484f-bbc7-66ee16ed8bb9)

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCk7OTk50DRWj24U5I%2F-MXClt5osInP3bDubnZ4%2Fimage.png?alt=media\&token=5701aa77-2c4b-4ee3-b074-969418e86c87)

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCpNlvcd7vvPLnyiK4%2F-MXCp_XeFldnpw55qIHy%2Fimage.png?alt=media\&token=f25d5b5a-8b67-4933-ad7f-3527c43d580c)

* Actor placement logic: an actor whose workload has a lot of variance is moved to the DRR cores, while an actor with a more uniform distribution stays on the FCFS cores.
  * Tail latency: how do they measure it?
    * Over a window, so the value might not be absolute, just relative.
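The placement idea above can be sketched as follows. This is a minimal illustration, not iPipe's actual scheduler: the class `ActorStats`, the function `choose_group`, the window size, and the `dispersion_limit` threshold are all hypothetical names and values. It only shows how a windowed tail-latency estimate could decide whether an actor belongs on FCFS or DRR cores.

```python
from collections import deque


class ActorStats:
    """Sliding window of recent per-request latencies for one actor."""

    def __init__(self, window=100):
        # Only the most recent `window` samples are kept, so the tail
        # estimate is relative to recent behavior, not an absolute figure.
        self.samples = deque(maxlen=window)

    def record(self, latency_us):
        self.samples.append(latency_us)

    def mean(self):
        return sum(self.samples) / len(self.samples)

    def tail(self, q=0.99):
        ordered = sorted(self.samples)
        idx = min(len(ordered) - 1, int(q * len(ordered)))
        return ordered[idx]


def choose_group(stats, dispersion_limit=3.0):
    """Return 'DRR' for high-variance actors, 'FCFS' otherwise.

    dispersion_limit is a hypothetical threshold on tail/mean latency.
    """
    if not stats.samples:
        return "FCFS"
    return "DRR" if stats.tail() / stats.mean() > dispersion_limit else "FCFS"


if __name__ == "__main__":
    uniform, bursty = ActorStats(), ActorStats()
    for _ in range(100):
        uniform.record(10)
    for i in range(100):
        bursty.record(500 if i % 20 == 0 else 10)
    print(choose_group(uniform), choose_group(bursty))
```

Because the window slides, an actor that stops producing outliers eventually drifts back to the FCFS group, which matches the note that the measurement is relative to the window.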

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCpNlvcd7vvPLnyiK4%2F-MXCq1TFO_kJkLFoX_o0%2Fimage.png?alt=media\&token=58c3a107-4c0c-4fbb-ae2b-0059883504d9)

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCk7OTk50DRWj24U5I%2F-MXCn3GYJ8nu1tkwr-1h%2Fimage.png?alt=media\&token=802fc0ae-9a18-4951-9e8d-4051834eaafe)

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MXCpNlvcd7vvPLnyiK4%2F-MXCqC71ei-WIVfhG5KA%2Fimage.png?alt=media\&token=92b1d24c-8375-4c32-89a5-f58a975184c0)
