# Fault-tolerant and transactional stateful serverless workflows

### Presentation&#x20;

* Serverless&#x20;
  * AWS, Google cloud function, Microsoft Azure&#x20;
  * Database: DynamoDB, ...
  * API gateway for communication&#x20;
* Motivation: Workers can fail&#x20;
  * Receive error / timeout, should I retry?
    * Not sure whether crash or not before write&#x20;
  * Write idempotent functions&#x20;
    * Safely retry if previous invocation fails&#x20;
    * Stateless function always idempotent&#x20;
* Contribution: make stateful serverless functions idempotent automatically&#x20;

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MksZwJSi_k_zOCbaczE%2F-MksagJVgUmndi3TvgNL%2Fimage.png?alt=media\&token=f9d3cf69-4dc4-4670-98bc-597704a11151)

* Write: log atomically
  * If have done write (check the log), skip it
  * If crash between writing and logging, do the writing the second time&#x20;
  * Solution: collocate write log with the data&#x20;

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MksZwJSi_k_zOCbaczE%2F-MksbIC-168r6zjXG082%2Fimage.png?alt=media\&token=fccc4462-adfd-4fee-b050-3d5f99d39b3e)

* Challenge&#x20;

1. Limitation of databases&#x20;

* Can only hold limited data, need to do aggressive garbage collect
* Solution: spread the log for a given key across multiple rows&#x20;
* Solution: use scan and projection to download a skeleton version of Linked DAAL (lock-free data structure)&#x20;
  * 256 bits per row&#x20;
  * Quickly traverse to the tail&#x20;

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MksZwJSi_k_zOCbaczE%2F-MksblqYqXpbBqD3BN_s%2Fimage.png?alt=media\&token=35f46f06-433c-4f63-81d5-55bf85c403aa)

2\. Federated setup&#x20;

* Invocation with exactly-once semantics&#x20;
* Federated setup with serverless&#x20;
  * Each lambda: own garbage collector&#x20;
  * It's possible that before Lambda 1 gets resetted, the GC of lambda2 is triggered and clean up all the logs&#x20;
  * Later, write in Lambda2 will be repeated again&#x20;

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MksZwJSi_k_zOCbaczE%2F-MkscM_MY6SI1Zw1yyzY%2Fimage.png?alt=media\&token=5eb2546c-095b-4fca-84fe-05b0882224c5)

* Solution: call back&#x20;
  * Complication: API gateway&#x20;
  * Move the result from the callee side to caller side&#x20;

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MksZwJSi_k_zOCbaczE%2F-MkscmsStU9S5vHA5FoA%2Fimage.png?alt=media\&token=a32281dc-d36f-4132-8f05-8bf0a4ea234b)

3\. Transactions across multiple lambdas&#x20;

Evaluation&#x20;

1. costs of Beldi's API operations
   1. Operations: read, write, condWrite, invoke&#x20;
   2. 2-4x more expensive than the baseline&#x20;
2. How does it perform in real-world application&#x20;
   1. DeathStarBench (ASPLOS 19)
      1. Movie review service, travel reservation, social media site&#x20;
   2. < 400 req/s: 2x higher&#x20;
   3. < 700 req/s: 3.3x higher than baseline&#x20;
3. What is the effect of GC&#x20;

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MksZwJSi_k_zOCbaczE%2F-MksdivWMwBGgOFSdrz7%2Fimage.png?alt=media\&token=fcf8ace9-1c5b-4b3d-a5ca-b0c1102de469)

### Paper

#### Introduction&#x20;

* Stateful serverless functions (SSFs) requires reasoning about the consistency and isolation semantics in the presence of concurrent requests and dealing with component failures&#x20;
  * Unique idiosyncrasies that make existing approaches a poor fit&#x20;
* Peculiarities&#x20;
  * Request routing is stateless&#x20;
    * State machine replication are hard to implement because follow-up message might be routed by the infrastructure to a different SSF instance from the one that processed a prior knowledge&#x20;
  * SSFs can be independent and have sovereignty over their own data&#x20;
    * Different organizations develop and deploy different SSFs&#x20;
  * SSF workflows (directed graphs of SSFs) can be complex and include cycles to express recursion and loops over SSFs&#x20;
    * Transactions must observe consistent state to avoid undefined behavior&#x20;
* Beldi: library and runtime system for building workflows of SSFs&#x20;
  * State: kept in low-latency db (e.x. DynamoDB)&#x20;
  * SSFs: clients of scalable fault-tolerent storage services rather than stateful services themselves&#x20;
  * Goal: guarantee exactly-once semantics to workflows, offer synchronization primitives to prevent concurrent clients from unsafely handling state&#x20;
  * Extend Olive (OSDI 16')
    * operations beyond storage accesses (i.e. SSFs invocation to each other)
    * data structure for unifying storage of application state and logs&#x20;
    * protocols that operate efficiently on this protocol&#x20;

#### Backgrounds and Goals&#x20;

* Serverless functions: workflows&#x20;
  * DAGs&#x20;
    * Functions can be multi-threaded or perform async invocation&#x20;
  * Step function: how to stich together different functions and their inputs and outputs, take care of scheduling and data movement, and user get an identifier to invoke it
  * Driver function: single function specified by developer that invokes other functions&#x20;
* Stateful serverless functions (SSFs)&#x20;
  * Work-around to persist data: store it in fault-tolerant low-latency databases&#x20;
  * Poorly with the way that existing platform handles failures&#x20;
  * Current recommendation: write SSFs that are idempotent to ensure safety&#x20;
* Requirements&#x20;
  * Exactly-once semantics&#x20;
  * SSF data sovereignty
  * SSF reusability&#x20;
  * Workflow transactions&#x20;
  * Deployable today&#x20;

#### Design&#x20;

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-Mksdt3MEqbUAjMjRQM2%2F-Mkso4xaCtYMzaD4z2KR%2Fimage.png?alt=media\&token=cb920894-d8d6-4d02-9ed8-2dc689afc930)

(1) library: exposes API for invocations, db read / write, and transaction&#x20;

(2) set of db tables that store the SSF's state and logs for reads, writes and invocations&#x20;

(3) intent collector: serverless function that restarts any instance of the corresponding SSF that have stalled or crashed&#x20;

(4) garbage collector: serverless function that keeps the logs from growing unboundedly&#x20;

#### Executing and Logging Operations in Beldi&#x20;

* Linked DAAL
  * allows logs to exist on multiple rows (or atomicity scopes), with new rows being added as needed&#x20;
* Read
  * Read and log need not happen atomically&#x20;
* Write&#x20;
  * Update and logging must be done atomically&#x20;
  * Need to handle the case where SSFs are accessing and appending to the linked DAAL concurrently&#x20;

#### Garbage Collection

* Ensures that logs are pruned and the linked DAAL remains shallow with a GC that deletes old rows and log entries without blocking SSFs that are concurrently accessing the list&#x20;

#### Locks and Transactions&#x20;

* Locks with intent&#x20;
* Transactions: based on a variant of 2PL, with wait-die deadlock prevention and two-phase commit&#x20;
