Redesigning Storage Systems for Future Workloads, Hardware, and Performance Requirements

https://www.youtube.com/watch?v=UFxS2fepBLk

Redesigning Storage Systems

  • New: applications, hardware, performance requirements

  • Applications

    • Billions of impressions collected by most web services for analytics

    • Past workloads: read mostly, rarely updated

    • Future workloads: e.g., IoT

      • Mixed workload

      • Read / write ratio

      • Reads: large volumes of sensor data, for control and analytics

      • Writes: new sensor data, continuously recorded

    • We produce huge amounts of data, growing exponentially

    • Need a way to serve this data from persistent storage

    • Nature of the data is quite diverse: unstructured

      • KVs: map a key to big, unstructured data

      • API: put(), get(), and scan() (a toy interface is sketched after this list)

      • Examples: RocksDB, Redis, Cassandra
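
Since the notes only name the API, here is a minimal, hypothetical sketch of the put()/get()/scan() interface as a toy in-memory store in Python. It only makes the API concrete; it is not how RocksDB, Redis, or Cassandra are implemented.

```python
from bisect import bisect_left, insort

class MiniKV:
    """Toy in-memory KV store exposing put/get/scan (illustrative only)."""

    def __init__(self):
        self._map = {}    # key -> value (point lookups)
        self._keys = []   # sorted keys (range scans)

    def put(self, key, value):
        if key not in self._map:
            insort(self._keys, key)   # keep keys sorted for scan()
        self._map[key] = value

    def get(self, key):
        return self._map.get(key)

    def scan(self, start, end):
        """Yield (key, value) pairs with start <= key < end, in key order."""
        i = bisect_left(self._keys, start)
        while i < len(self._keys) and self._keys[i] < end:
            yield self._keys[i], self._map[self._keys[i]]
            i += 1

db = MiniKV()
db.put(b"sensor/1", b"22.5C")
db.put(b"sensor/2", b"21.9C")
print(list(db.scan(b"sensor/", b"sensor0")))  # both entries, in order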

  • Hardware

    • Landscape: RAM, HDD, Tape (archive)

    • Now: NVM and SSD layers between RAM and HDD; plus emerging media (glass, DNA, ...)

    • Focus: SSD

      • New SSDs are much faster

      • Random accesses almost as fast as sequential accesses
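
A rough way to sanity-check this on your own machine is to time 4 KiB reads at sequential vs. random offsets. A hedged sketch follows; the file path is a placeholder, and without O_DIRECT the OS page cache can mask device behavior, so use a file much larger than RAM.

```python
import os
import random
import time

PATH = "/tmp/bigfile"   # placeholder: point at a large existing file
BLOCK = 4096            # 4 KiB reads

fd = os.open(PATH, os.O_RDONLY)
blocks = os.fstat(fd).st_size // BLOCK
assert blocks > 0, "test file is too small"
n = min(100_000, blocks)

def reads_per_sec(offsets):
    start = time.perf_counter()
    for off in offsets:
        os.pread(fd, BLOCK, off)
    return len(offsets) / (time.perf_counter() - start)

seq = [i * BLOCK for i in range(n)]                         # sequential offsets
rnd = [random.randrange(blocks) * BLOCK for _ in range(n)]  # random offsets

print(f"sequential: {reads_per_sec(seq):,.0f} reads/s")
print(f"random:     {reads_per_sec(rnd):,.0f} reads/s")
os.close(fd)
```

On a modern NVMe SSD the two numbers should be much closer than they would be on an HDD.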

  • Performance requirements

    • Throughput

      • reads & writes / second

      • higher is better

      • should be steady

    • Latency

      • time it takes to read or write

      • lower is better

      • tail latency (e.g., 99th percentile)

        • High tail latency in KVs --> unpredictable performance. Difficult to provide QoS guarantees
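
To make the tail concrete, the sketch below uses made-up latency samples and computes the 99th percentile by the nearest-rank method; the mean can look healthy while p99 is an order of magnitude worse.

```python
import random

def p99(latencies):
    """99th-percentile latency via the nearest-rank method."""
    ordered = sorted(latencies)
    rank = max(0, int(len(ordered) * 0.99) - 1)
    return ordered[rank]

# Hypothetical workload: 9,900 fast requests plus a 1% slow tail.
samples = [abs(random.gauss(1.0, 0.1)) for _ in range(9_900)]
samples += [random.uniform(10.0, 50.0) for _ in range(100)]

mean = sum(samples) / len(samples)
print(f"mean = {mean:.2f} ms, p99 = {p99(samples):.2f} ms")
```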

Contributions

  • Applications: cluster metadata management. Mixed write-intensive workload.

    • TRIAD [ATC '17]: decrease maintenance I/O to increase client throughput.

      • HyperLogLog-driven compaction (see the sketch after this list)

      • WAL repurposing

      • Hot/Cold key separation
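
For context on the first technique: HyperLogLog (HLL) is a probabilistic sketch that estimates the number of distinct keys in a few KB of memory, which TRIAD uses to inform compaction decisions. Below is a minimal, self-contained HLL in Python; the usual small/large-range corrections and TRIAD's actual compaction policy are omitted.

```python
import hashlib

class HyperLogLog:
    """Minimal HyperLogLog cardinality sketch (b=10 -> 1024 registers)."""

    def __init__(self, b=10):
        self.b = b
        self.m = 1 << b
        self.registers = [0] * self.m
        self.alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant, m >= 128

    def add(self, key: bytes):
        h = int.from_bytes(hashlib.blake2b(key, digest_size=8).digest(), "big")
        idx = h >> (64 - self.b)                     # top b bits pick a register
        rest = h & ((1 << (64 - self.b)) - 1)        # remaining 64-b bits
        rho = (64 - self.b) - rest.bit_length() + 1  # leading zeros + 1
        self.registers[idx] = max(self.registers[idx], rho)

    def estimate(self) -> float:
        z = sum(2.0 ** -r for r in self.registers)
        return self.alpha * self.m * self.m / z

hll = HyperLogLog()
for i in range(100_000):
    hll.add(f"key-{i}".encode())
print(f"estimated distinct keys: {hll.estimate():,.0f} (true: 100,000)")
```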

  • Hardware: new servers with ample memory sizes (100s of GB)

    • FloDB: Scale KVs with memory size

      • New 2-level data structure

      • Mostly O(1) insert

      • Concurrent reads and updates
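
A toy sketch of the two-level idea, under the assumption that writes land in a hash-table buffer in O(1) and are drained into a sorted level that supports scans. Real FloDB pairs a concurrent hash table with a skip list and fine-grained synchronization; this coarse, lock-based Python stand-in only illustrates the shape.

```python
import threading
from bisect import bisect_left, insort_left

class TwoLevelMemTable:
    """Toy two-level memory component: O(1) buffered puts, sorted scans."""

    def __init__(self, drain_threshold=1024):
        self._buffer = {}        # top level: absorbs puts in O(1)
        self._sorted_keys = []   # bottom level: sorted, scannable
        self._sorted_map = {}
        self._lock = threading.Lock()
        self._threshold = drain_threshold

    def put(self, key, value):
        with self._lock:
            self._buffer[key] = value
            if len(self._buffer) >= self._threshold:
                self._drain()    # real FloDB drains in the background

    def get(self, key):
        with self._lock:         # newest data lives in the top level
            if key in self._buffer:
                return self._buffer[key]
            return self._sorted_map.get(key)

    def _drain(self):
        # Move buffered entries into the sorted level (lock already held).
        for key, value in self._buffer.items():
            if key not in self._sorted_map:
                insort_left(self._sorted_keys, key)
            self._sorted_map[key] = value
        self._buffer.clear()

    def scan(self, start, end):
        with self._lock:
            self._drain()        # scans need a fully sorted view
            i = bisect_left(self._sorted_keys, start)
            out = []
            while i < len(self._sorted_keys) and self._sorted_keys[i] < end:
                k = self._sorted_keys[i]
                out.append((k, self._sorted_map[k]))
                i += 1
            return out
```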

  • Performance requirements

    • Need low, steady tail latency. No latency spikes.

    • SILK: I/O scheduling for KVs

      • Opportunistic I/O bandwidth allocation for KV maintenance ops

      • Prioritize client work

      • No tail latency spikes
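
A minimal sketch of the scheduling idea, assuming "prioritize client work" means strict priority: maintenance I/O (flushes, compactions) only gets bandwidth when no client request is waiting. SILK itself is more refined, e.g. it dynamically rate-limits compactions and favors flushes and lower-level compactions; this toy omits all of that.

```python
import queue
import threading
import time

class OpportunisticIOScheduler:
    """Toy scheduler: client ops run first; maintenance fills idle time."""

    def __init__(self):
        self.client_q = queue.Queue()
        self.maint_q = queue.Queue()
        self._stop = threading.Event()

    def submit_client(self, op):
        self.client_q.put(op)

    def submit_maintenance(self, op):
        self.maint_q.put(op)

    def run(self):
        while not self._stop.is_set():
            try:
                op = self.client_q.get_nowait()   # strict client priority
            except queue.Empty:
                try:                              # idle: do maintenance work
                    op = self.maint_q.get(timeout=0.01)
                except queue.Empty:
                    continue
            op()  # perform the (client or maintenance) I/O

    def stop(self):
        self._stop.set()

sched = OpportunisticIOScheduler()
threading.Thread(target=sched.run, daemon=True).start()
sched.submit_maintenance(lambda: print("compaction step"))
sched.submit_client(lambda: print("client get()"))
time.sleep(0.1)
sched.stop()
```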

  • KVell: fast KV for NVMe SSDs (SOSP '19)

    • RocksDB: low, unstable throughput; CPU-bound on fast SSDs

      • Can't saturate the device's I/O bandwidth

    • Existing KVs internals
