LHD: Improving Cache Hit Rate by Maximizing Hit Density
https://www.usenix.org/conference/nsdi18/presentation/beckmann
Presentation
Key-value cache is 100x faster than database
Crucial in modern web applications: web server --> cache (sits in a separate cluster, in DRAM, e.g., MemCache), much faster than the underlying database
At a 98% cache hit rate
+1% hit rate --> 35% speedup
Even small hit rate improvements cause significant speedup
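A quick back-of-the-envelope check (latencies assumed for illustration, not from the talk: hit = 0.1 ms, miss = 5 ms):

```latex
T_{98} = 0.98 \cdot 0.1 + 0.02 \cdot 5 = 0.198\,\text{ms}, \qquad
T_{99} = 0.99 \cdot 0.1 + 0.01 \cdot 5 = 0.149\,\text{ms}, \qquad
T_{98}/T_{99} \approx 1.33
```

Going from 98% to 99% halves the miss rate, so one extra point of hit rate gives a ~33% speedup under these assumed latencies, the same ballpark as the quoted 35%.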
Choosing the right eviction policy is hard
Key-value caches have unique challenges
Variable object sizes
Variable workloads
Prior policies are heuristics that combine recency and frequency
No theoretical foundation
Require hand-tuning --> fragile to workload changes
No policy works for all workloads
Prior systems simulate many cache policy configurations to find the right one per workload
Goal: Auto-tuning eviction policy across workloads
The "big picture" of key-value caching
Goal: maximizing cache hit rate
Constraint: limited cache space
Uncertainty: in practice, we don't know what will be accessed again
Difficulty: variable sizes
Where does cache space go?
Hit density (HD)
Hit density combines hit probability and expected cost
Q: I don't understand: if the cost already incorporates the size, why isn't it the final objective we want to optimize?
hit density = object's hit probability / (object size × object's expected lifetime)
Least hit density (LHD) policy: evict the object with the smallest hit density
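A minimal sketch of the rule in Python, assuming per-object estimates already exist (all names here are illustrative, not from the paper):

```python
# candidates: object id -> (hit_probability, size_bytes, expected_lifetime)

def hit_density(hit_prob, size_bytes, expected_lifetime):
    # Expected hits contributed per byte of cache space per unit of time
    return hit_prob / (size_bytes * expected_lifetime)

def choose_victim(candidates):
    # LHD: evict the object whose estimated hit density is smallest
    return min(candidates, key=lambda oid: hit_density(*candidates[oid]))
```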
But how do we predict these quantities?
Estimating hit density (HD)
Age - # accesses since object was last requested
Random variables
H - hit age: P[H = 100] = probability an object hits after 100 accesses (i.e., on the 101st access)
L - lifetime: P[L = 100] = probability an object's lifetime ends (by hit or eviction) after 100 accesses
Easy to estimate HD from these quantities
Hit probability
Expected lifetime
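Concretely, for a newly admitted object of size S (my rendering of these definitions, so treat the exact notation as an assumption):

```latex
P[\text{hit}] = \sum_{a} P[H = a], \qquad
\mathbb{E}[L] = \sum_{a} a \cdot P[L = a], \qquad
\text{HD} = \frac{P[\text{hit}]}{S \cdot \mathbb{E}[L]}
```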
Example: estimating HD from object age
Estimate HD using conditional probability
Monitor distribution of H & L online
By definition, an object of age a wasn't requested at any age <= a
Ignore all events at ages <= a (condition the distributions on age > a)
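A hedged sketch of the age-conditioned estimate (histogram names are assumptions; the talk only says the distributions are monitored online):

```python
# p_hit[x] ~ P[H = x], p_life[x] ~ P[L = x], learned online as histograms

def hit_density_at_age(age, size_bytes, p_hit, p_life, max_age):
    # An object of age `age` by definition had no hit/eviction at ages <= age,
    # so sum only over x > age; the P[age > a] normalizer cancels between
    # numerator and denominator.
    expected_hits = sum(p_hit[x] for x in range(age + 1, max_age))
    expected_remaining_time = sum((x - age) * p_life[x]
                                  for x in range(age + 1, max_age))
    if expected_remaining_time == 0:
        return 0.0  # no events observed past this age; rank as evictable
    return expected_hits / (size_bytes * expected_remaining_time)
```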
LHD by example
Users repeatedly request common objects plus some user-specific objects
Best hand-tuned policy for this app: cache common media + as much user-specific as fits
Probability of referencing object again
Common objects modeled as a scan, user-specific objects modeled as a Zipf distribution
Protect objects up to (slightly past) the peak; after the peak, the policy behaves like LRU
Improving LHD using additional object features
Conditional probability lets us easily add information
Condition H & L upon additional informative object features, e.g.,
Which app requested this object?
How long has this object taken to hit in the past?
Features inform decisions --> LHD learns the "right" policy
No hard-coded heuristics
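One way to picture this (sketch; class and method names are mine, not the paper's): conditioning just means keeping one histogram pair per feature value.

```python
from collections import defaultdict

class ConditionedStats:
    def __init__(self, max_age):
        # Per-feature-value histograms of hit ages and lifetimes
        self.hit_ages = defaultdict(lambda: [0] * max_age)
        self.lifetimes = defaultdict(lambda: [0] * max_age)

    def record_hit(self, feature, age):
        self.hit_ages[feature][age] += 1
        self.lifetimes[feature][age] += 1   # a hit ends the object's lifetime

    def record_eviction(self, feature, age):
        self.lifetimes[feature][age] += 1   # an eviction also ends a lifetime
```

Hit density is then estimated from the histograms of the object's own feature class, so per-app behavior is learned rather than hard-coded.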
LHD gets more hits across many traces, and needs much less space
RANKCACHE: translate theory to practice
The problem
Prior complex policies require complex data structures
Synchronization --> poor scalability --> unacceptable request throughput
Policies like GDSF require O(log N) heaps
Even O(1) LRU sometimes too slow because of synchronization
Many key-value systems approximate LRU with CLOCK / FIFO
Can LHD achieve similar request throughput to production system?
RankCache makes LHD fast
Track information approximately
Precompute HD as a table indexed by age, app id, etc.
Randomly sample objects to find victim
Similar to Redis, Memshare
Tolerate rare races in eviction policy
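A sketch of how these pieces might fit together (illustrative Python, not RankCache's actual code; the table layout and names are assumptions):

```python
import random

def pick_victim(objects, hd_table, sample_size=64):
    # objects: id -> (age_bucket, app_id, size_bytes)
    # hd_table: (age_bucket, app_id) -> precomputed, size-independent density
    sample = random.sample(list(objects), min(sample_size, len(objects)))
    def density(oid):
        age_bucket, app_id, size_bytes = objects[oid]
        # O(1) table lookup; divide by size at eviction time, since the
        # shared table cannot bake in each object's individual size
        return hd_table[(age_bucket, app_id)] / size_bytes
    return min(sample, key=density)
```

Sampling sidesteps both the O(log N) heap and global synchronization: each eviction touches only a handful of randomly chosen entries.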
Related work
Using conditional probabilities for eviction policies in CPU caches
EVA
Fixed object sizes
Different ranking function
Prior replacement policies
Key-value: Hyperbolic, simulations, AdaptSize, Cliffhanger
Non key-value: ARC, SLRU, LRU-K
Heuristic based
Require tuning or simulation
Future directions
Dynamic latency / bandwidth optimization
Smoothly and dynamically switch between optimizing object hit ratio and byte hit ratio
Optimizing end-to-end response latency
App touches multiple objects per request
One such object evicted --> others should be evicted too
Modeling cost, e.g., to maximize write endurance in flash / NVM
Predict which objects are worth writing to 2nd tier storage from memory