Data Management
DMon: Efficient Detection and Correction of Data Locality Problems Using Selective Profiling
CLP: Efficient and Scalable Search on Compressed Text Logs
Polyjuice: High-Performance Transactions via Learned Concurrency Control
Retrofitting High Availability Mechanism to Tame Hybrid Transaction/Analytical Processing
DMon
Motivation
A single search query (~700 ms) uses 8 processor cores
Datacenters run millions of cores
High management and energy costs at that scale
CPU performance breakdown
Only ~32% of cycles do useful work
The rest: instruction fetch, decode, bad speculation, unavailable execution units, memory stalls (poor data locality)
Existing techniques
Compiler Optimizations
Improve data locality via program transformation
Con: rely on static heuristics, which can sometimes hurt performance
Dynamic profilers
Accurate execution information, identify and resolve poor locality
Con: mostly manual repair, high profiling overhead
Contribution
Selective profiling
Apply targeted compiler optimizations based on the profiling results (see the sketch below)
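A minimal sketch of the selective-profiling idea, assuming a cheap always-on monitor that only triggers expensive detailed profiling when a locality problem is suspected (all names and thresholds here are hypothetical, not DMon's actual interface):

```python
import random
import time

CHEAP_SAMPLE_PERIOD = 0.01    # hypothetical: cheap hardware-counter sampling interval
MISS_RATIO_THRESHOLD = 0.3    # hypothetical: "poor locality" trigger

def read_cache_miss_ratio():
    # Stand-in for reading a cheap hardware performance counter.
    return random.uniform(0.0, 1.0)

def detailed_profile():
    # Stand-in for the precise, higher-overhead profiling that
    # attributes misses to specific code/data and suggests a fix.
    return {"hot_object": "struct node", "suggested_opt": "prefetching"}

def monitor(duration_s=0.1):
    deadline = time.time() + duration_s
    while time.time() < deadline:
        # Only pay the detailed-profiling cost when the cheap
        # metric suggests a locality problem.
        if read_cache_miss_ratio() > MISS_RATIO_THRESHOLD:
            print("locality issue:", detailed_profile())
        time.sleep(CHEAP_SAMPLE_PERIOD)

monitor()
```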
Evaluation
Efficiency
Effectiveness
Real-world case studies
Q: Are there data locality problems that the optimizations cannot solve?
A: Tail latency problems, for example
DMon can't identify and repair all locality problems
CLP
Workload / problem:
Logs --> search tools --> compression (e.g., gzip)
Problem
Petabytes of logs
Uncompressed logs consume a lot of resources
Current solution
Build indexes: adds storage overhead
Total storage usage stays the same order of magnitude as the raw logs
Can only retain indexed logs for a few weeks
Compression shrinks logs to 2.27% of the original size
Unsearchable once compressed
Requires decompression
Key idea:
Lossless log compression with better ratios than general-purpose compressors
Can search compressed logs
Without decompression
Only decompress matching results
With good performance
Method (see the sketch after this list):
Domain-specific parsing to extract the repeated parts of log messages
De-duplicate them across messages
Log Type Dictionary
Variable Dictionary
Encoded Message
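A rough sketch of the dictionary-based encoding and dictionary-first search, under our own simplified variable syntax (an illustration of the idea, not CLP's actual format):

```python
import re

log_types = {}   # log type string -> id  (Log Type Dictionary)
variables = {}   # variable string -> id  (Variable Dictionary)
encoded = []     # (log_type_id, [variable ids]) per message

VAR_PATTERN = re.compile(r"0x[0-9a-f]+|\d+")  # assumed variable syntax

def intern(table, key):
    # Return the existing id or assign the next one.
    return table.setdefault(key, len(table))

def encode(message):
    vars_found = VAR_PATTERN.findall(message)
    log_type = VAR_PATTERN.sub("\x11", message)  # placeholder marks a variable
    encoded.append((intern(log_types, log_type),
                    [intern(variables, v) for v in vars_found]))

def search(substring):
    # Match against the small dictionary first; only messages whose
    # log type can match are ever touched (and would be decompressed).
    hits = {tid for t, tid in log_types.items() if substring in t}
    return [i for i, (tid, _) in enumerate(encoded) if tid in hits]

encode("task 1234 finished in 56 ms")
encode("task 789 finished in 12 ms")
print(search("finished"))   # -> [0, 1], without scanning the raw text
```

Because the query is matched against the small log-type dictionary first, most encoded messages never need to be decompressed.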
Comparison:
Search tools: ripgrep, Elasticsearch
Archive tools: lzma, ...
Q: How does CLP handle changes in log structure, and how does it keep the dictionaries from blowing up?
A: Logs are compressed into multiple archives
Each archive has its own set of dictionaries
Q: Do the dictionary lookups during search affect performance?
A: Lookup time is negligible compared to the time spent reading the data
Q: Does it need an accurate schema?
A: Yes; variables tend to follow quite general patterns, so a generic schema works
Q: Are there queries where its search performance isn't the best?
Polyjuice
Motivation:
Concurrency control (CC): controls how concurrent executions interleave
More interleaving --> better performance
No single CC algorithm is best across all workloads (high vs. low contention)
Current solutions:
Coarse-grained approaches combine a few known CC algorithms
Cons
Cumbersome: requires manual partitioning of the workload
Suboptimal: uses a single CC algorithm within each partition
Approach:
RL
State
Differentiate states that require different CC actions
Two parts
Type of transaction
Data accessed
Action
Should be able to encode most existing CC algorithms
Exert control over the interleaving
Action space (see the sketch after this list):
Whether to wait, and how long?
OCC: no wait
2PL: wait until dependent transactions commit
IC3, RP, DRP: wait until dependent transactions finish executing up to some point
Which versions of data to read?
Latest committed version
Latest published dirty version
Whether to make a dirty write visible?
Keep itself in the private buffer
Publish the dirty write
Whether to validate now, prior to commit?
Whether or not to validate after this access
Based on Silo's protocol
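An illustrative sketch of the state-to-action mapping, with our own field names (Polyjuice's real policy is a learned table, not this hand-written stub):

```python
from dataclasses import dataclass

@dataclass
class Action:
    wait_for: int        # 0 = no wait (OCC-like); k = wait until dependent
                         # txns reach access k (IC3-like)
    read_dirty: bool     # read latest published dirty version vs. committed
    publish_dirty: bool  # expose the dirty write vs. keep it in a private buffer
    validate_now: bool   # validate right after this access vs. only at commit

# 2PL-like behavior: wait until dependent txns commit, never publish dirty writes.

# State = (transaction type, index of the data access within it).
policy = {}

def action_for(txn_type, access_idx):
    state = (txn_type, access_idx)
    if state not in policy:
        # Placeholder defaults; the real table entries are learned.
        policy[state] = Action(wait_for=0, read_dirty=False,
                               publish_dirty=False, validate_now=False)
    return policy[state]

print(action_for("payment", 0))
```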
VEGITO
Workload:
Typical workloads
OLTP: short-lived transactions
OLAP: long-running analytical queries
New type of workload: HTAP
Requirement:
Performance
Minimizes performance degradation
Millions of transactions per second
Freshness
Low delay between a transaction's commit and its visibility to analytics
Traditional data analysis pipeline: TP + ETL + AP
HTAP alternatives
Alternative 1: Dual system
Good performance
Large time delay
Alternative 2: single-layout
Short time delay
Huge performance degradation
Alternative 3: dual-layout
Lightweight sync.
Idea: VEGITO
Retrofit the high-availability mechanism of fast in-memory OLTP: serve analytical queries on the backup replicas
Replication-based
Synchronous log shipping
Challenge
Log cleaning
Needs to be consistent and parallel across logs
Epoch-based design (see the sketch after this list)
Gossip-style
Guarantees that epoch assignment is consistent across machines
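A minimal sketch of the epoch-based consistent cut, assuming each worker ships logs tagged with epochs (state and names here are our simplification):

```python
# Hypothetical shipping state: newest fully shipped epoch per worker.
shipped_epoch = {"worker0": 5, "worker1": 4, "worker2": 5}

def stable_epoch():
    # Consistent cut: the largest epoch fully received from every worker.
    return min(shipped_epoch.values())

def visible(record_epoch):
    # Analytical queries only see whole, stable epochs.
    return record_epoch <= stable_epoch()

print(stable_epoch())   # -> 4: epoch 5 is not yet complete everywhere
print(visible(5))       # -> False
```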
Multi-version column store (MVCS)
Different locality for read & write
Column-wise vs. row-wise
Use a block-based MVCS (see the sketch after this list)
Row-split
Col-merge
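A toy sketch of a block-based multi-version column store, assuming a fixed block size and per-cell version chains tagged with epochs (our simplification, not VEGITO's layout):

```python
BLOCK_ROWS = 4  # assumed block size

class ColumnBlock:
    def __init__(self, n_cols):
        # versions[row][col] = list of (epoch, value), newest last
        self.versions = [[[] for _ in range(n_cols)]
                         for _ in range(BLOCK_ROWS)]

    def write(self, row, col, epoch, value):
        self.versions[row][col].append((epoch, value))

    def read(self, row, col, snapshot_epoch):
        # Newest value visible at the snapshot epoch, or None.
        for epoch, value in reversed(self.versions[row][col]):
            if epoch <= snapshot_epoch:
                return value
        return None

# "Row-split": a table is a list of such fixed-size blocks;
# "col-merge": small, frequently co-read columns share a block.
blk = ColumnBlock(n_cols=2)
blk.write(0, 0, epoch=1, value=10)
blk.write(0, 0, epoch=3, value=20)
print(blk.read(0, 0, snapshot_epoch=2))   # -> 10
```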
Tree-based index
Write-optimized tree index
Read-optimized index
Tree insert = in-place insert + rebalance (rebalancing is costly)
Rebalance in batch instead (see the sketch after this list)
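A toy illustration of batched rebalancing, using a sorted array plus an append buffer instead of a real tree (not VEGITO's index; it only shows how deferring balance work amortizes the cost):

```python
import bisect

class BatchedIndex:
    BATCH = 1024  # assumed batch threshold

    def __init__(self):
        self.sorted_keys = []   # read-optimized, kept sorted ("balanced")
        self.pending = []       # write-optimized append buffer

    def insert(self, key):
        self.pending.append(key)          # cheap in-place insert, no balancing
        if len(self.pending) >= self.BATCH:
            self.rebalance()

    def rebalance(self):
        # One batched pass amortizes the cost of many rebalances.
        self.sorted_keys = sorted(self.sorted_keys + self.pending)
        self.pending = []

    def contains(self, key):
        i = bisect.bisect_left(self.sorted_keys, key)
        found = i < len(self.sorted_keys) and self.sorted_keys[i] == key
        return found or key in self.pending

idx = BatchedIndex()
idx.insert(7)
idx.insert(3)
print(idx.contains(3))   # -> True (still in the pending buffer)
```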