# AdaptSize: Orchestrating the Hot Object Memory Cache in a Content Delivery Network

### Presentation&#x20;

* CDN Caching Architecture&#x20;
  * Users, content providers, CDN is a layer in between (caching servers which store the popular objects)&#x20;
  * Two caching levels&#x20;
    * Disk Cache (DC): slower &#x20;
    * Hot Object Cache (HOC): in memory, faster&#x20;
      * HOC performance metric: object hit ratio = # of reqs served by HOC / tot # reqs&#x20;
      * Goal: maximize OHR&#x20;
* Prior approaches to cache management&#x20;
  * Frequent decisions required&#x20;
    * What to admit&#x20;
      * Admit all the requests into the cache&#x20;
    * What to evict: most of the prior work&#x20;
      * LRU&#x20;
      * Mixtures of LRU / LFU&#x20;
      * Concurrent LRU (boost caching system throughput)&#x20;
* We are missing a key issue&#x20;
  * Not all objects are the same (object size distribution)&#x20;
    * 9 orders of magnitude of variability&#x20;
    * Should we admit every object?&#x20;
      * HOC is small. Favor what? How to do it well?&#x20;
      * Academic works (equal sizes)&#x20;
* Size-aware admission&#x20;
  * Fixed size threshold: admit if size < Threshold c&#x20;
    * How to pick c? Pick c to maximize OHR&#x20;
      * The c is different at different times (morning, noon, evening)
      * The best threshold changes with the traffic mix&#x20;
  * Probabilistic admission
    * High admission probability for small objects, low admission probability for large objects&#x20;
    * But many curves, which curve makes big difference (adapt c)&#x20;
* The AdaptSize Caching System: focus on admission to the cache, and continuously adapts the parameter of size-aware admission&#x20;
  * Adapt with traffic, adapt with time&#x20;
  * Take traffic measurements --> calculate the best c --> enforce admission control&#x20;
  * Red, green, and blue curve in the evening&#x20;
    * Delta interval, fnid the best c within each delta interval&#x20;
    * Traditional approach&#x20;
      * Hill climbing: local optima on OHR-vs-c curve&#x20;
      * AdaptSize approach: Markov model to predict the curve, enables speedy global optimization&#x20;
  * How AdaptSize gets the OHR-vs-c curve
    * Markov chain&#x20;
      * Track IN/OUT for each object&#x20;
        * IN: hit, OUT: miss&#x20;
    * Algorithm&#x20;
      * For every delta interval and for every value of c
        * Use Markov chain to solve for OHR(c)
        * Find c to maximize OHR&#x20;
      * Why hasn't this been done? too slow with exponential state&#x20;
      * New technique: approximation with linear state space&#x20;
  * Implementing AdaptSize&#x20;
    * Incorporated into Varnish
      * Highly concurrent HOC system, 40+ Gbits / s&#x20;
      * Take traffic measurements --> calculate the best c --> enforce admission control&#x20;
    * Challenges&#x20;
      * 1\) Concurrent write conflicts: 40% requests concentrate on 1% object&#x20;
      * 2\) Locks too slow&#x20;
    * AdaptSize: producer/consumer + ring buffer, lock-free implementation&#x20;
  * AdaptSize: admission is very simple
    * Given c, and the object size&#x20;
    * Admit with P(c, size): probability&#x20;
    * Enables lock free & low overhead implementation&#x20;
* Testbed
  * Clients: replay Akamai requests trace&#x20;
  * HOC systems: 1.2 GB, 16 threads&#x20;
  * Evaluate&#x20;
    * unmodified Varnish: admit everything, evict based on concurrent LRU&#x20;
    * NGINX cache: frequency filter to admit, LRU to evict &#x20;
    * AdaptSize&#x20;
  * Also evaluate robustness of AdaptSize
    * Artificial traffic mix changes&#x20;
    * Compare&#x20;
      * Size-aware OPT: offline parameter tuning
      * AdaptSize: our Markovian tuning model&#x20;
      * HillClimb: local-search using shadow queues&#x20;

Conclusion:&#x20;

* Goal: maximize OHR of the Hot Object Cache
* Approach: size-based admission control
* Key insight: need to adapt parameter c&#x20;
* AdaptSize: adapts c via a Markov chain
* Result: 48-92% higher OHRs
* Paper: throughput, disk utilization, byte hit ratio, request latency&#x20;
