> For the complete documentation index, see [llms.txt](https://sliu583.gitbook.io/blog/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://sliu583.gitbook.io/blog/specific-work/seminar-and-talk/fall-21-reading-list/fault-tolerant-and-transactional-stateful-serverless-workflows.md).

# Fault-tolerant and transactional stateful serverless workflows

### Presentation&#x20;

* Serverless&#x20;
  * AWS, Google cloud function, Microsoft Azure&#x20;
  * Database: DynamoDB, ...
  * API gateway for communication&#x20;
* Motivation: Workers can fail&#x20;
  * Receive error / timeout, should I retry?
    * Not sure whether crash or not before write&#x20;
  * Write idempotent functions&#x20;
    * Safely retry if previous invocation fails&#x20;
    * Stateless function always idempotent&#x20;
* Contribution: make stateful serverless functions idempotent automatically&#x20;

![](/files/-MksagJVgUmndi3TvgNL)

* Write: log atomically
  * If have done write (check the log), skip it
  * If crash between writing and logging, do the writing the second time&#x20;
  * Solution: collocate write log with the data&#x20;

![](/files/-MksbIC-168r6zjXG082)

* Challenge&#x20;

1. Limitation of databases&#x20;

* Can only hold limited data, need to do aggressive garbage collect
* Solution: spread the log for a given key across multiple rows&#x20;
* Solution: use scan and projection to download a skeleton version of Linked DAAL (lock-free data structure)&#x20;
  * 256 bits per row&#x20;
  * Quickly traverse to the tail&#x20;

![](/files/-MksblqYqXpbBqD3BN_s)

2\. Federated setup&#x20;

* Invocation with exactly-once semantics&#x20;
* Federated setup with serverless&#x20;
  * Each lambda: own garbage collector&#x20;
  * It's possible that before Lambda 1 gets resetted, the GC of lambda2 is triggered and clean up all the logs&#x20;
  * Later, write in Lambda2 will be repeated again&#x20;

![](/files/-MkscM_MY6SI1Zw1yyzY)

* Solution: call back&#x20;
  * Complication: API gateway&#x20;
  * Move the result from the callee side to caller side&#x20;

![](/files/-MkscmsStU9S5vHA5FoA)

3\. Transactions across multiple lambdas&#x20;

Evaluation&#x20;

1. costs of Beldi's API operations
   1. Operations: read, write, condWrite, invoke&#x20;
   2. 2-4x more expensive than the baseline&#x20;
2. How does it perform in real-world application&#x20;
   1. DeathStarBench (ASPLOS 19)
      1. Movie review service, travel reservation, social media site&#x20;
   2. < 400 req/s: 2x higher&#x20;
   3. < 700 req/s: 3.3x higher than baseline&#x20;
3. What is the effect of GC&#x20;

![](/files/-MksdivWMwBGgOFSdrz7)

### Paper

#### Introduction&#x20;

* Stateful serverless functions (SSFs) requires reasoning about the consistency and isolation semantics in the presence of concurrent requests and dealing with component failures&#x20;
  * Unique idiosyncrasies that make existing approaches a poor fit&#x20;
* Peculiarities&#x20;
  * Request routing is stateless&#x20;
    * State machine replication are hard to implement because follow-up message might be routed by the infrastructure to a different SSF instance from the one that processed a prior knowledge&#x20;
  * SSFs can be independent and have sovereignty over their own data&#x20;
    * Different organizations develop and deploy different SSFs&#x20;
  * SSF workflows (directed graphs of SSFs) can be complex and include cycles to express recursion and loops over SSFs&#x20;
    * Transactions must observe consistent state to avoid undefined behavior&#x20;
* Beldi: library and runtime system for building workflows of SSFs&#x20;
  * State: kept in low-latency db (e.x. DynamoDB)&#x20;
  * SSFs: clients of scalable fault-tolerent storage services rather than stateful services themselves&#x20;
  * Goal: guarantee exactly-once semantics to workflows, offer synchronization primitives to prevent concurrent clients from unsafely handling state&#x20;
  * Extend Olive (OSDI 16')
    * operations beyond storage accesses (i.e. SSFs invocation to each other)
    * data structure for unifying storage of application state and logs&#x20;
    * protocols that operate efficiently on this protocol&#x20;

#### Backgrounds and Goals&#x20;

* Serverless functions: workflows&#x20;
  * DAGs&#x20;
    * Functions can be multi-threaded or perform async invocation&#x20;
  * Step function: how to stich together different functions and their inputs and outputs, take care of scheduling and data movement, and user get an identifier to invoke it
  * Driver function: single function specified by developer that invokes other functions&#x20;
* Stateful serverless functions (SSFs)&#x20;
  * Work-around to persist data: store it in fault-tolerant low-latency databases&#x20;
  * Poorly with the way that existing platform handles failures&#x20;
  * Current recommendation: write SSFs that are idempotent to ensure safety&#x20;
* Requirements&#x20;
  * Exactly-once semantics&#x20;
  * SSF data sovereignty
  * SSF reusability&#x20;
  * Workflow transactions&#x20;
  * Deployable today&#x20;

#### Design&#x20;

![](/files/-Mkso4xaCtYMzaD4z2KR)

(1) library: exposes API for invocations, db read / write, and transaction&#x20;

(2) set of db tables that store the SSF's state and logs for reads, writes and invocations&#x20;

(3) intent collector: serverless function that restarts any instance of the corresponding SSF that have stalled or crashed&#x20;

(4) garbage collector: serverless function that keeps the logs from growing unboundedly&#x20;

#### Executing and Logging Operations in Beldi&#x20;

* Linked DAAL
  * allows logs to exist on multiple rows (or atomicity scopes), with new rows being added as needed&#x20;
* Read
  * Read and log need not happen atomically&#x20;
* Write&#x20;
  * Update and logging must be done atomically&#x20;
  * Need to handle the case where SSFs are accessing and appending to the linked DAAL concurrently&#x20;

#### Garbage Collection

* Ensures that logs are pruned and the linked DAAL remains shallow with a GC that deletes old rows and log entries without blocking SSFs that are concurrently accessing the list&#x20;

#### Locks and Transactions&#x20;

* Locks with intent&#x20;
* Transactions: based on a variant of 2PL, with wait-die deadlock prevention and two-phase commit&#x20;
