Snicket: Query-Driven Distributed Tracing

https://dl.acm.org/doi/10.1145/3484266.3487393

  • The rise of microservices

    • Growing complexity and scaling needs push applications toward microservices

    • Advantages

      • Flexibility with languages

      • Development velocity

    • Challenges

      • Distribution

  • Trace: a directed tree representing the calls made as a result of one user request (sketched below)

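A trace like this can be modeled as a small tree of spans. A minimal, illustrative sketch in Python (the `Span` class, service names, and attribute keys are assumptions, not from the paper):

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Span:
    """One node in the trace tree: a single service call."""
    service: str
    attrs: Dict[str, float] = field(default_factory=dict)
    children: List["Span"] = field(default_factory=list)


# A toy trace: one user request enters at the frontend and fans out.
trace = Span("frontend", {"duration_ms": 42.0}, [
    Span("cart", {"duration_ms": 10.0}),
    Span("product-catalog", {"duration_ms": 25.0}, [
        Span("currency", {"duration_ms": 5.0}),
    ]),
])
```
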
  • Head-Based Sampling

    • The sampling decision is made at the start of a request, so unusual data is often not collected

  • Tail-based sampling

    • Biases collection toward unusual traces, so information about how common a behavior is may be lost (both strategies sketched below)

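A minimal sketch contrasting the two sampling strategies (the sampling rate, the error/latency heuristic, and the flat span representation are assumptions for illustration only):

```python
import random


def head_based_sample(rate: float = 0.01) -> bool:
    """Decide at the start of a request, before any spans exist, so rare or
    anomalous traces are dropped at the same rate as ordinary ones."""
    return random.random() < rate


def tail_based_sample(spans, latency_threshold_ms: float = 500.0) -> bool:
    """Decide after the whole trace is assembled; keeps 'interesting' traces,
    but the stored set no longer reflects how common each behavior is."""
    has_error = any(s.get("error", False) for s in spans)
    is_slow = max(s["duration_ms"] for s in spans) > latency_threshold_ms
    return has_error or is_slow


# Example trace as a flat list of span records.
spans = [
    {"service": "frontend", "duration_ms": 620.0},
    {"service": "cart", "duration_ms": 15.0, "error": True},
]
print(head_based_sample(), tail_based_sample(spans))
```
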
  • Problems with current approaches

    • Persist information at the granularity of whole traces

      • Most queries are interested in only a subset of each trace

    • Queries run on incomplete data

      • Uniform sampling misses unusual traces

      • Trace data is biased by the sampling strategy

  • Key idea: collect only the information necessary for the query

  • What is Snicket?

    • Query-driven

    • Allows for complex queries that include graph-based reasoning

    • Tightly ties collection and computation: collect only what you need

  • Why is Snicket possible now?

    • The emergence of service proxies, analogous to application-level switches (e.g., Envoy, Linkerd)

    • Rise of proxy extension mechanisms that allow computation close to the source

      • Allows network programmability higher up the stack

  • Snicket Design Overview

    • The Snicket Query Language

      • Structural filter: match traces containing an isomorphic subtree (Match)

      • Attribute filter: match on attributes of nodes or traces (Where)

      • Map: create new developer-defined attributes (e.g., latency)

      • Return: return an attribute's value

      • Aggregate: aggregate returned values (e.g., avg)

  • Query example (illustrative sketch below)

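These notes do not reproduce the paper's example query, so below is a hypothetical Python sketch of what the five primitives compute over in-memory traces. The trace shape, attribute names, and the frontend -> product-catalog pattern are invented for illustration and do not use Snicket's actual syntax:

```python
from statistics import mean

# Each trace is a nested dict: {"service": ..., "attrs": {...}, "children": [...]}
traces = [
    {"service": "frontend", "attrs": {"duration_ms": 40.0, "status": 200}, "children": [
        {"service": "product-catalog", "attrs": {"duration_ms": 22.0}, "children": []},
    ]},
    {"service": "frontend", "attrs": {"duration_ms": 95.0, "status": 500}, "children": [
        {"service": "product-catalog", "attrs": {"duration_ms": 80.0}, "children": []},
    ]},
]


def has_edge(node, parent, child):
    """Match: structural filter -- does the trace contain a parent -> child call?"""
    if node["service"] == parent and any(c["service"] == child for c in node["children"]):
        return True
    return any(has_edge(c, parent, child) for c in node["children"])


def catalog_duration(node):
    """Map: a developer-defined attribute -- the product-catalog call's duration."""
    if node["service"] == "product-catalog":
        return node["attrs"]["duration_ms"]
    for c in node["children"]:
        d = catalog_duration(c)
        if d is not None:
            return d
    return None


# Match: keep traces where frontend calls product-catalog.
matched = [t for t in traces if has_edge(t, "frontend", "product-catalog")]

# Where: attribute filter on the root span.
filtered = [t for t in matched if t["attrs"]["status"] == 200]

# Map + Return + Aggregate: compute the new attribute, return it, and average it.
print(mean(catalog_duration(t) for t in filtered))  # -> 22.0
```
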
  • Evaluation: expressiveness, interactivity, and latency

    • Online Boutique: 10 microservices

    • 7 GCP e2-highmem-4 nodes

    • Horizontal and vertical autoscaling enabled: 11->12 nodes

    • Locust as load generator

    • ~15 ms latency difference with Snicket enabled vs. without
