Data Management

DMon: Efficient Detection and Correction of Data Locality Problems Using Selective Profiling

CLP: Efficient and Scalable Search on Compressed Text Logs

Polyjuice: High-Performance Transactions via Learned Concurrency Control

Retrofitting High Availability Mechanism to Tame Hybrid Transaction/Analytical Processing



DMon

  • A single search query (~700ms) runs on 8 processor cores

    • Millions of cores

    • High management and energy costs

  • CPU performance

    • Only ~32% of cycles do useful work

    • The rest: fetching, decoding, bad speculation, execution units unavailable, memory stalls (poor data locality)

Existing techniques

  • Compiler Optimizations

    • Improve data locality via program transformation

    • Con: rely on static heuristics, can sometimes hurt performance

  • Dynamic profilers

    • Accurate execution information, identify and resolve poor locality

    • Con: mostly manual repair, high profiling overhead


DMon's approach:

  • Selective profiling

  • Apply specific compiler optimizations based on the profiling results
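The two bullets above can be sketched as a tiny decision loop: keep cheap always-on counters, and enable costly precise profiling only for code regions whose memory-stall share crosses a cutoff. All names and the 30% threshold here are illustrative assumptions, not DMon's actual interface.

```python
# Hypothetical sketch of selective profiling (not DMon's real API).
STALL_THRESHOLD = 0.30  # assumed cutoff: 30% of cycles stalled on memory

def select_regions(counters):
    """counters: {region: {"cycles": int, "mem_stall_cycles": int}}
    Return the regions worth profiling in detail."""
    selected = []
    for region, c in counters.items():
        if c["mem_stall_cycles"] / c["cycles"] > STALL_THRESHOLD:
            selected.append(region)  # only these pay the profiling cost
    return selected

counters = {
    "hash_lookup": {"cycles": 1000, "mem_stall_cycles": 450},
    "parse_input": {"cycles": 1000, "mem_stall_cycles": 50},
}
print(select_regions(counters))  # ['hash_lookup']
```

Only the selected regions are then handed to the targeted compiler optimizations, which keeps the overall profiling overhead low.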


Evaluation:

  • Efficiency

  • Effectiveness

  • Real-world case studies

Are there data locality problems that the optimizations cannot solve?

  • Tail latency problems

    • Can't identify and repair all of them


CLP

Workload / problems:

  • Logs --> search tools --> compress (e.g., gzip)

    • Problem

      • Petabytes of logs

      • Uncompressed logs consume a lot of resources

    • Current solution

      • Build indexes: adds storage overhead

        • Total storage usage stays the same order of magnitude

      • Can only retain indexed logs for a few weeks

      • Compress (2.27% of the original size)

        • Unsearchable once compressed

        • Requires decompression

Method key:

  • Lossless log compression: better than general-purpose compressors

  • Can search compressed logs

    • Without decompression

      • Only decompress matching results

    • With good performance


  • Domain-specific algorithm to extract the overlapping parts

    • De-duplicate

    • Log Type Dictionary

    • Variable Dictionary

    • Encoded Message
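The three structures above can be sketched as a minimal dictionary encoder: each message is split into a log type (its template) and its variable values, and both are de-duplicated through dictionaries so repeated structure is stored only once. The regex heuristic and placeholder byte are illustrative assumptions, not CLP's actual parsing rules.

```python
import re

log_types = {}   # template string -> log type id
variables = {}   # variable string -> variable id

def intern(table, key):
    """Return a stable id for key, assigning one on first sight."""
    if key not in table:
        table[key] = len(table)
    return table[key]

def encode(message):
    # Treat tokens starting with a digit as variables; real CLP uses
    # richer heuristics (timestamps, IDs, floats, dictionary variables).
    vars_found = re.findall(r"\b\d[\w.]*\b", message)
    template = re.sub(r"\b\d[\w.]*\b", "\x11", message)  # placeholder byte
    type_id = intern(log_types, template)
    var_ids = [intern(variables, v) for v in vars_found]
    return (type_id, var_ids)  # the compact encoded message

m1 = encode("task 1048 assigned to container 7")
m2 = encode("task 2093 assigned to container 7")
# Both messages share one log type; only the variable ids differ.
```

Searching can then match a query against the small dictionaries first and decompress only the messages whose type and variable ids match.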


  • Search tools: ripgrep, Elasticsearch

  • Archive tools: lzma, ...

How does CLP handle changes in log structure, and how does it keep the dictionaries from blowing up?

  • Compress logs into multiple archives

    • Different log structures go into different sets of archives

  • Does searching compressed logs hurt performance?

    • Search time is negligible compared to the time to read the data

  • Does CLP need an accurate schema?

    • Yes

    • Schemas are based on the variables and tend to be quite general (generic schemas)

      • Generic schemas aren't always the best for search performance



Polyjuice

  • CC (concurrency control): controls how concurrent executions interleave

    • More interleaving --> better performance

  • Currently, no single CC algorithm is best for every workload (high- vs. low-contention)

Current solutions:

  • Coarse-grained approach to combine few known CC algorithms

  • Con

    • Cumbersome: requires manual partitioning of the workload

    • Suboptimal: uses a single CC within each partition


  • Reinforcement learning (RL) formulation

    • State

      • Differentiate states that require different CC actions

      • Two parts:

        • Type of transaction

        • Data accessed

    • Action

      • Should be able to encode most existing CC algorithms

      • Exert control on the interleaving

      • Solution

        • Whether to wait, and how long?

          • OCC: no wait

          • 2PL: wait until dependent transactions commit

          • IC3, RP, DRP: wait until ... finish execution up to some point

        • Which versions of data to read?

          • Latest committed version

          • Latest published dirty version

        • Whether to make a dirty write visible?

          • Keep itself in the private buffer

          • Publish the dirty write

        • Whether to validate now, prior to commit?

          • Whether or not to validate after this access

          • Based on Silo's protocol
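The four knobs above can be pictured as a per-access action record: the learned policy maps each (transaction type, access) to one record, and fixing the knobs to constant values recovers classic algorithms as special cases. This is an illustrative sketch, not Polyjuice's actual encoding.

```python
from dataclasses import dataclass

@dataclass
class AccessAction:
    wait_for: int          # how many dependent txns to wait on (0 = no wait)
    read_dirty: bool       # read latest published dirty version vs. committed
    publish_write: bool    # expose the dirty write vs. keep it in a private buffer
    validate_now: bool     # validate right after this access (Silo-style) vs. defer

# OCC as a degenerate policy: never wait, never read or expose dirty data.
occ_like = AccessAction(wait_for=0, read_dirty=False,
                        publish_write=False, validate_now=False)

# A 2PL-flavored point: wait for conflicting transactions to commit first.
twopl_like = AccessAction(wait_for=1, read_dirty=False,
                          publish_write=True, validate_now=False)

# The policy table, keyed by (txn type, access index), is the object the
# learning algorithm searches over; transaction types are hypothetical.
policy = {("payment", 0): occ_like, ("new_order", 2): twopl_like}
```

Because every knob applies per access rather than per workload, one learned policy can mix OCC-like and 2PL-like behavior inside a single transaction.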



Retrofitting HA for HTAP

  • Typical workloads

    • OLTP: short-running transactions

    • OLAP: long-running analytical queries

  • New type of workload: HTAP


  • Performance

    • Minimizes performance degradation

    • Millions of transactions per second

  • Freshness

    • Delay before committed writes become visible to analytical queries

Data analysis: TP + ETL + AP

  • HTAP alternatives

  • Alternative 1: Dual system

    • Good performance

    • Large time delay

  • Alternative 2: single layout

    • Short time delay

    • Huge performance degradation

  • Alternative 3: dual layout

    • Lightweight sync.


  • High availability for fast in-memory OLTP

    • Replication-based

    • Synchronous log shipping

  • Challenge

    • Log cleaning

      • Must be consistent and parallel across logs

    • Epoch-based design

      • Gossip-style

        • Guarantee epoch assignment is consistent

    • Multi-version column store (MVCS)

      • Reads and writes prefer different locality

        • Column-wise vs. row-wise

      • Use block-based MVCS

        • Row-split

        • Col-merge

    • Tree-based index

      • Write-optimized tree index

      • Read-optimized index

      • Tree insert = in-place insert + balance (costly)

        • Balance in batch
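The row-split/col-merge idea above can be sketched as a two-stage store: fresh writes land in a small row-format block (good write locality), and a batch step merges full blocks into column format (good scan locality for analytical reads). The block size and names are assumptions for illustration.

```python
BLOCK_CAP = 4  # assumed number of rows per block before a merge triggers

row_block = []   # recent rows, stored row-wise
col_store = {}   # column name -> list of values (columnar layout)

def insert(row):
    """Writes go to the row-format block first."""
    row_block.append(row)
    if len(row_block) >= BLOCK_CAP:
        col_merge()

def col_merge():
    """Batch-convert the full row block into columnar layout."""
    for row in row_block:
        for col, val in row.items():
            col_store.setdefault(col, []).append(val)
    row_block.clear()

for i in range(4):
    insert({"id": i, "amount": i * 10})

print(col_store["amount"])  # [0, 10, 20, 30]
```

Batching the conversion (like batching tree rebalances) amortizes the cost of maintaining the read-optimized layout instead of paying it on every write.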
