Redesigning Storage Systems for Future Workloads Hardware and Performance Requirements
https://www.youtube.com/watch?v=UFxS2fepBLk
Redesigning Storage Systems
New: applications, hardware, performance requirements
Applications
Large web services collect billions of impressions for analytics
Past workloads: read mostly, rarely updated
Future workloads: example of IoTs
Mixed workload
Read / write ratio
Reads: large volumes of sensor data, for control and analytics
Writes: new sensor data, continuously recorded
We produce huge amounts of data, growing exponentially
Need a way to serve this data from persistent storage
Nature of the data is quite diverse: unstructured data
KVs (key-value stores): a good fit for big, unstructured data
API: put(), get(), and scan()
Examples: RocksDB, Redis, Cassandra
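The put()/get()/scan() API above can be sketched with a toy in-memory store; this is illustrative only (real KVs like RocksDB persist to disk via LSM trees), with a dict plus a sorted key list standing in for the real data structures:

```python
from bisect import insort, bisect_left

class KVStore:
    """Toy in-memory store exposing the put/get/scan API from the notes."""

    def __init__(self):
        self._data = {}
        self._keys = []     # kept sorted so scan() can do range queries

    def put(self, key, value):
        if key not in self._data:
            insort(self._keys, key)
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def scan(self, start, end):
        """Return all (key, value) pairs with start <= key < end."""
        i = bisect_left(self._keys, start)
        out = []
        while i < len(self._keys) and self._keys[i] < end:
            out.append((self._keys[i], self._data[self._keys[i]]))
            i += 1
        return out
```

scan() is what distinguishes KVs from plain hash maps: it needs keys in sorted order, which is why LSM trees and skip lists show up in real implementations.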
Hardware
Landscape: RAM, HDD, Tape (archive)
Now: NVM and SSD sit between the RAM and HDD layers; emerging media: glass, DNA, ...
Focus: SSD
New SSDs are much faster
Random accesses almost as fast as sequential accesses
Performance requirements
Throughput
reads & writes / second
higher is better
should be steady
Latency
time it takes to read or write
lower is better
tail latency (e.g., 99th percentile)
High tail latency in KVs --> unpredictable performance; difficult to provide QoS guarantees
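Tail latency is a percentile of the latency distribution, not an average, which is why spikes that a mean hides still break QoS. A minimal nearest-rank sketch (the `percentile` helper is hypothetical, not from the talk):

```python
import math

def percentile(latencies, p):
    """Nearest-rank percentile: smallest sample >= p% of all samples."""
    s = sorted(latencies)
    k = max(0, math.ceil(p / 100 * len(s)) - 1)
    return s[k]

# 98 fast requests at 1 ms and 2 slow ones at 100 ms:
lat = [1] * 98 + [100] * 2
mean = sum(lat) / len(lat)      # ~3 ms: the mean hides the spikes
p99 = percentile(lat, 99)       # 100 ms: the tail exposes them
```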
Contributions
Applications: cluster metadata management. Mixed write-intensive workload.
TRIAD [ATC '17]: decrease maintenance I/O to increase client throughput.
HyperLogLog-driven compaction
WAL repurposing
Hot/Cold key separation
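TRIAD uses HyperLogLog to cheaply estimate distinct keys and key overlap between LSM files, so compactions that would reclaim little space can be deferred. A minimal generic HLL sketch (the data structure itself, not TRIAD's implementation):

```python
import hashlib

class HyperLogLog:
    """Minimal HyperLogLog cardinality estimator (illustrative sketch)."""

    def __init__(self, b=10):
        self.b = b                  # 2^b registers
        self.m = 1 << b
        self.registers = [0] * self.m

    def add(self, key):
        # Derive a 64-bit hash of the key.
        h = int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")
        idx = h >> (64 - self.b)    # top b bits pick a register
        rest = h & ((1 << (64 - self.b)) - 1)
        # rank = position of the leftmost 1-bit in the remaining bits
        rank = (64 - self.b) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def estimate(self):
        """Harmonic-mean estimate of the number of distinct keys added."""
        alpha = 0.7213 / (1 + 1.079 / self.m)
        return alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
```

The sketch uses constant memory (here 1024 registers) regardless of how many keys are added, which is what makes per-file cardinality tracking affordable during compaction decisions.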
Hardware: new servers with ample memory sizes (100s GB)
FloDB [EuroSys '17]: scale KVs with memory size
New 2-level data structure
Mostly O(1) insert
Concurrent reads and updates
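The 2-level idea can be sketched as a small unsorted table that absorbs O(1) puts and drains into a sorted level that serves scans. Hypothetical simplification: a dict and a sorted list stand in for FloDB's concurrent hash table and skip list, and the drain runs inline instead of on a background thread:

```python
from bisect import insort, bisect_left

class TwoLevelMap:
    """Sketch of a FloDB-style two-level in-memory component."""

    def __init__(self, drain_threshold=1024):
        self.fast = {}            # level 1: unsorted, O(1) inserts
        self.sorted_keys = []     # level 2: sorted keys for range scans
        self.sorted_vals = {}
        self.threshold = drain_threshold

    def put(self, key, value):
        self.fast[key] = value    # mostly O(1), as in the notes
        if len(self.fast) >= self.threshold:
            self.drain()

    def get(self, key):
        if key in self.fast:      # newest data wins
            return self.fast[key]
        return self.sorted_vals.get(key)

    def drain(self):
        # Move the unsorted level into the sorted one (a background
        # thread in FloDB; inline here for simplicity).
        for key, value in self.fast.items():
            if key not in self.sorted_vals:
                insort(self.sorted_keys, key)
            self.sorted_vals[key] = value
        self.fast.clear()

    def scan(self, start, end):
        self.drain()              # scans need a fully sorted view
        i = bisect_left(self.sorted_keys, start)
        out = []
        while i < len(self.sorted_keys) and self.sorted_keys[i] < end:
            k = self.sorted_keys[i]
            out.append((k, self.sorted_vals[k]))
            i += 1
        return out
```

The split is what lets inserts stay O(1) even as the memory component grows to hundreds of GB: sorting cost is paid off the critical path.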
Performance requirements
Need low, steady tail latency. No latency spikes.
SILK [ATC '19]: I/O scheduling for KVs
Opportunistic I/O bandwidth allocation for KV maintenance ops
Prioritize client work
No tail latency spikes
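Opportunistic allocation can be sketched as a priority queue over a per-tick bandwidth budget: client work runs first, and maintenance ops only consume what is left over. Simplified and single-threaded; the names, priority ordering, and cost model here are hypothetical, not SILK's actual design:

```python
import heapq

# Assumed priority order for illustration: client requests first,
# then flushes, then compactions.
CLIENT, FLUSH, COMPACTION = 0, 1, 2

class IOScheduler:
    """Sketch of SILK-style opportunistic I/O bandwidth allocation."""

    def __init__(self, bandwidth_per_tick):
        self.queue = []
        self.bw = bandwidth_per_tick

    def submit(self, priority, cost, label):
        heapq.heappush(self.queue, (priority, label, cost))

    def tick(self):
        """Spend one tick's bandwidth budget, highest priority first."""
        budget = self.bw
        completed = []
        while self.queue and budget > 0:
            priority, label, cost = heapq.heappop(self.queue)
            if cost <= budget:
                budget -= cost
                completed.append(label)
            else:
                # Not enough budget: do partial work, requeue the rest.
                heapq.heappush(self.queue, (priority, label, cost - budget))
                budget = 0
        return completed
```

With a budget of 8 per tick, a client get of cost 5 completes immediately while a cost-10 compaction finishes only as spare bandwidth accumulates, so client latency never spikes behind maintenance I/O.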
KVell [SOSP '19]: fast KV for NVMe SSDs
RocksDB: low, unstable throughput; CPU-bound on fast SSDs
Can't saturate the I/O bandwidth the device provides
Existing KVs internals