# AdaptSize: Orchestrating the Hot Object Memory Cache in a Content Delivery Network

### Presentation&#x20;

* CDN Caching Architecture&#x20;
  * Users, content providers, CDN is a layer in between (caching servers which store the popular objects)&#x20;
  * Two caching levels&#x20;
    * Disk Cache (DC): slower &#x20;
    * Hot Object Cache (HOC): in memory, faster&#x20;
      * HOC performance metric: object hit ratio = # of reqs served by HOC / tot # reqs&#x20;
      * Goal: maximize OHR&#x20;
* Prior approaches to cache management&#x20;
  * Frequent decisions required&#x20;
    * What to admit&#x20;
      * Admit all the requests into the cache&#x20;
    * What to evict: most of the prior work&#x20;
      * LRU&#x20;
      * Mixtures of LRU / LFU&#x20;
      * Concurrent LRU (boost caching system throughput)&#x20;
* We are missing a key issue&#x20;
  * Not all objects are the same (object size distribution)&#x20;
    * 9 orders of magnitude of variability&#x20;
    * Should we admit every object?&#x20;
      * HOC is small. Favor what? How to do it well?&#x20;
      * Academic works (equal sizes)&#x20;
* Size-aware admission&#x20;
  * Fixed size threshold: admit if size < Threshold c&#x20;
    * How to pick c? Pick c to maximize OHR&#x20;
      * The c is different at different times (morning, noon, evening)
      * The best threshold changes with the traffic mix&#x20;
  * Probabilistic admission
    * High admission probability for small objects, low admission probability for large objects&#x20;
    * But many curves, which curve makes big difference (adapt c)&#x20;
* The AdaptSize Caching System: focus on admission to the cache, and continuously adapts the parameter of size-aware admission&#x20;
  * Adapt with traffic, adapt with time&#x20;
  * Take traffic measurements --> calculate the best c --> enforce admission control&#x20;
  * Red, green, and blue curve in the evening&#x20;
    * Delta interval, fnid the best c within each delta interval&#x20;
    * Traditional approach&#x20;
      * Hill climbing: local optima on OHR-vs-c curve&#x20;
      * AdaptSize approach: Markov model to predict the curve, enables speedy global optimization&#x20;
  * How AdaptSize gets the OHR-vs-c curve
    * Markov chain&#x20;
      * Track IN/OUT for each object&#x20;
        * IN: hit, OUT: miss&#x20;
    * Algorithm&#x20;
      * For every delta interval and for every value of c
        * Use Markov chain to solve for OHR(c)
        * Find c to maximize OHR&#x20;
      * Why hasn't this been done? too slow with exponential state&#x20;
      * New technique: approximation with linear state space&#x20;
  * Implementing AdaptSize&#x20;
    * Incorporated into Varnish
      * Highly concurrent HOC system, 40+ Gbits / s&#x20;
      * Take traffic measurements --> calculate the best c --> enforce admission control&#x20;
    * Challenges&#x20;
      * 1\) Concurrent write conflicts: 40% requests concentrate on 1% object&#x20;
      * 2\) Locks too slow&#x20;
    * AdaptSize: producer/consumer + ring buffer, lock-free implementation&#x20;
  * AdaptSize: admission is very simple
    * Given c, and the object size&#x20;
    * Admit with P(c, size): probability&#x20;
    * Enables lock free & low overhead implementation&#x20;
* Testbed
  * Clients: replay Akamai requests trace&#x20;
  * HOC systems: 1.2 GB, 16 threads&#x20;
  * Evaluate&#x20;
    * unmodified Varnish: admit everything, evict based on concurrent LRU&#x20;
    * NGINX cache: frequency filter to admit, LRU to evict &#x20;
    * AdaptSize&#x20;
  * Also evaluate robustness of AdaptSize
    * Artificial traffic mix changes&#x20;
    * Compare&#x20;
      * Size-aware OPT: offline parameter tuning
      * AdaptSize: our Markovian tuning model&#x20;
      * HillClimb: local-search using shadow queues&#x20;

Conclusion:&#x20;

* Goal: maximize OHR of the Hot Object Cache
* Approach: size-based admission control
* Key insight: need to adapt parameter c&#x20;
* AdaptSize: adapts c via a Markov chain
* Result: 48-92% higher OHRs
* Paper: throughput, disk utilization, byte hit ratio, request latency&#x20;


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://sliu583.gitbook.io/blog/specific-work/wisr-group/cache/index/adaptsize-orchestrating-the-hot-object-memory-cache-in-a-content-delivery-network.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
