
# Snicket: Query-Driven Distributed Tracing

* The rise of microservices
  * Growing complexity and scaling needs push applications toward microservices
  * Advantages
    * Flexibility with languages
    * Development velocity
  * Challenges
    * Distribution
* Trace: a directed tree representing the calls made as a result of one user request
  * ![](/files/82yEz8HD2EaBLAfmeKr7)
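To make the "directed tree" definition concrete, here is a minimal sketch of a trace as a tree of calls. The `Span` class, its fields, and the service names are hypothetical and chosen only for illustration; they are not taken from Snicket or from any tracing library.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """One node of the trace tree: a single call to a service (hypothetical type)."""
    service: str
    latency_ms: float
    children: list["Span"] = field(default_factory=list)


def services_in_trace(root: Span) -> list[str]:
    """Return service names in depth-first (pre-order) call order."""
    names = [root.service]
    for child in root.children:
        names.extend(services_in_trace(child))
    return names


# Hypothetical request: frontend calls cart and checkout; checkout calls payment.
trace = Span("frontend", 42.0, [
    Span("cart", 5.0),
    Span("checkout", 30.0, [Span("payment", 12.0)]),
])
call_order = services_in_trace(trace)
```

Each edge points from caller to callee, so the root is the service that received the user request.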
* Head-based sampling
  * ![](/files/9jWsL0ZveZUaecywSuDP)
  * Unusual data is not collected
* Tail-based sampling
  * ![](/files/EacjAKsiVmd4JatpypKr)
  * May lose information about how common a behavior is
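The trade-off between the two sampling styles can be sketched in a few lines. This is an illustrative toy under stated assumptions, not how any particular tracing system implements sampling: the function names, the 1% rate, the 100 ms threshold, and the synthetic latencies are all hypothetical.

```python
import random


def head_sample(traces, rate, rng):
    """Head-based: the keep/drop decision is made when the request starts,
    so it cannot depend on how the request turns out."""
    return [t for t in traces if rng.random() < rate]


def tail_sample(traces, is_interesting):
    """Tail-based: the decision is made after the trace completes,
    so it can inspect the finished trace."""
    return [t for t in traces if is_interesting(t)]


rng = random.Random(0)

# Hypothetical workload: 1000 request latencies (ms), 5 of them slow outliers.
latencies = [10.0] * 995 + [500.0] * 5
rng.shuffle(latencies)

# At a 1% rate, head-based sampling will most likely miss all 5 outliers.
head = head_sample(latencies, 0.01, rng)

# Tail-based sampling keeps exactly the outliers, but the kept set alone
# no longer shows that they were 5 out of 1000 requests.
tail = tail_sample(latencies, lambda ms: ms > 100.0)
```

Head-based sampling preserves proportions but rarely sees the unusual traces; tail-based sampling sees them but discards the denominator needed to say how common they are.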
* Problems with current approaches
  * Information is persisted at the granularity of whole traces
    * Most queries over the data are interested in only a subset of each trace
  * Queries run on incomplete data
    * Uniform sampling misses unusual traces
    * Trace data is biased by the sampling strategy
* Key idea: collect only the information necessary for the query
* **What is Snicket?**
  * Query-driven
  * Allows for complex queries that include graph-based reasoning
  * Tightly ties collection and computation: collect only what you need
* **Why is Snicket possible now?**
  * The emergence of service proxies, analogous to application-level switches (e.g., Envoy, Linkerd)
  * The rise of proxy extensions that allow computation close to the source
    * Allows network programmability higher up the stack
* **Snicket Design Overview**
  * ![](/files/jtvzf71PIpOuYffb0YoU)
  * The Snicket Query Language
    * Structural filter (Match): match on an isomorphic subtree
    * Attribute filter (Where): match on attributes of nodes or traces
    * Map: create new developer-defined attributes (e.g., latency)
    * Return: return an attribute value
    * Aggregate: aggregate the returned values (e.g., avg)
* Query example
  * ![](/files/6rIszKXrbhS1RJ5DfPsJ)
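The five query operations can be mimicked in plain Python to show how they compose. This is a rough sketch and not Snicket's syntax or implementation: `Node`, `matches`, and `run_query` are hypothetical names, the structural filter is a simplified greedy match by service name anchored at the trace root (a stand-in for real subtree-isomorphism matching), and the traces are made up.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Node:
    """A call in a trace, or a node of a query pattern (hypothetical type)."""
    service: str
    latency_ms: float = 0.0
    children: list["Node"] = field(default_factory=list)


def matches(pattern: Node, node: Node) -> bool:
    """Simplified structural filter: the pattern must embed at `node`,
    comparing service names only and assigning children greedily."""
    if pattern.service != node.service:
        return False
    unused = list(node.children)
    for p_child in pattern.children:
        hit = next((c for c in unused if matches(p_child, c)), None)
        if hit is None:
            return False
        unused.remove(hit)
    return True


def run_query(traces, pattern, where, mapper):
    """Match + Where select traces, Map + Return extract a value, Aggregate averages."""
    selected = [t for t in traces if matches(pattern, t) and where(t)]
    values = [mapper(t) for t in selected]
    return mean(values) if values else None


traces = [
    Node("frontend", 40.0, [Node("checkout", 30.0)]),
    Node("frontend", 100.0, [Node("checkout", 80.0), Node("cart", 5.0)]),
    Node("frontend", 10.0, [Node("cart", 5.0)]),       # fails Match: no checkout call
    Node("frontend", 15.0, [Node("checkout", 10.0)]),  # fails Where: latency too low
]

# "Average end-to-end latency of requests where frontend calls checkout
# and the request took more than 20 ms."
pattern = Node("frontend", children=[Node("checkout")])
result = run_query(
    traces,
    pattern,
    where=lambda t: t.latency_ms > 20.0,
    mapper=lambda t: t.latency_ms,
)
```

Only the first two traces pass both filters, so the aggregate is the mean of their root latencies. In this toy the whole trace is in memory; the point of the query-driven design is that only the attributes the query needs would have to be collected.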
* Evaluation: expressiveness, interactivity, and latency
  * Online Boutique: 10 microservices
  * 7 nodes of type `e2-highmem-4`
  * Horizontal and vertical autoscaling enabled: 11->12 nodes
  * Locust as load generator
  * \~15 ms difference
