MOSAIC: Processing a Trillion-Edge Graph on a Single Machine
https://dl.acm.org/doi/pdf/10.1145/3064176.3064191
Limit by disk size not memory
Out-of-core
Trillion-edge graph (8TB overall)
Large-scale graph processing is ubiquitous
Social network, genome analysis, graphs enable machine learning
Powerful, heterogeneous machines
Terabytes of RAM on multiple sockets
Powerful many-core coprocessors (Xeon Phi processor)
Specific? Or applicable to GPU as well?
Fast, large-capacity non-volatile memory
Goal: take advantages of heterogeneous machine to process tera-scale graphs
Graph Processing: Sample Application
Community Detection
Find Common Friends
Find Shortest Paths
Estimate Impact of Vertices (webpages, users, ...)
...
Design
Design space
Single Machine
Out-of-core: cheap, but potentially slow
In memory: fast, but limited graph size
Cluster
Out-of-core: large graphs, but expensive & slow
In memory: large graphs, fast, but expensive
Single machine, out-of-core: cost-effective
10? higher memory more expensive? What is the ratio? Any evaluation?
Validate this
Goal: good performance and large graphs
Goal: Run algorithms on large graphs on a single machine using coprocessors
Enabled by
Common, familiar API (vertex / edge-centric)
Encoding: Lossless compression
Cache locality
Processing on isolated subgraphs
Divided into small subgraphs
Do a massively parallel using cores and hyper-threads
Xeon: host CPU
8 GB of high bandwidth memory
Access pattern
Subgraphs: tiles
At most 2^16 vertices
Local identifier: 2 bytes + level of indirection
Cache locality
Inside subgraphs: sort by access order
Between subgraphs: overlap vertex sets
Produce the tiles
Evaluation
Re-run the preprocessing step
Generate incrementally?
No
Not supported multiple combines
Execution model (when they place the tiles, inter-core communication? Or independent?)
Conclusion
Same gain to put in the multi-core CPUs?
Use the principle
Spread computation
Hybrid systems?
Taking benefits on tiling
Along with offloading to GPUs
Think about:
Workloads, technologies, trends. Design something that benefits the problem they have.
Last updated
Was this helpful?