# MOSAIC: Processing a Trillion-Edge Graph on a Single Machine

* Limit by disk size not memory&#x20;

* Out-of-core&#x20;

* Trillion-edge graph (8TB overall)&#x20;

* Large-scale graph processing is ubiquitous&#x20;
  * Social network, genome analysis, graphs enable machine learning&#x20;

* Powerful, heterogeneous machines&#x20;
  * Terabytes of RAM on multiple sockets&#x20;
  * Powerful many-core coprocessors (Xeon Phi processor)&#x20;
    * Specific? Or applicable to GPU as well?&#x20;
  * Fast, large-capacity non-volatile memory&#x20;
  * Goal: take advantages of heterogeneous machine to process tera-scale graphs&#x20;

### Graph Processing: Sample Application&#x20;

* Community Detection
* Find Common Friends
* Find Shortest Paths
* Estimate Impact of Vertices (webpages, users, ...)&#x20;
* ...

### Design&#x20;

* Design space&#x20;
  * Single Machine&#x20;
    * Out-of-core: cheap, but potentially slow&#x20;
    * In memory: fast, but limited graph size&#x20;
  * Cluster&#x20;
    * Out-of-core: large graphs, but expensive & slow &#x20;
    * In memory: large graphs, fast, but expensive&#x20;
* Single machine, out-of-core: cost-effective
  * 10? higher memory more expensive? What is the ratio? Any evaluation?&#x20;
  * Validate this&#x20;
* Goal: good performance and large graphs&#x20;
* Goal: Run algorithms on large graphs on a single machine using coprocessors&#x20;
  * Enabled by&#x20;
    * Common, familiar API (vertex / edge-centric)
    * Encoding: Lossless compression&#x20;
    * Cache locality&#x20;
    * Processing on isolated subgraphs&#x20;
      * Divided into small subgraphs&#x20;
      * Do a massively parallel using cores and hyper-threads&#x20;

![](/files/-MbqyuHmMrVWhtIP6ehU)

* Xeon: host CPU&#x20;
  * 8 GB of high bandwidth memory&#x20;
* Access pattern&#x20;

![](/files/-Mbr-9DjG4CRZXdvMzxz)

* Subgraphs: tiles&#x20;
  * At most 2^16 vertices&#x20;
  * Local identifier: 2 bytes + level of indirection&#x20;
* Cache locality&#x20;
  * Inside subgraphs: sort by access order&#x20;
  * Between subgraphs: overlap vertex sets&#x20;
* Produce the tiles&#x20;

### Evaluation&#x20;

![](/files/-Mbr2GhBNsEragBDKw4I)

* Re-run the preprocessing step&#x20;
* Generate incrementally?&#x20;
  * No

![](/files/-Mbr3174Riup_Ew-CAzg)

* Not supported multiple combines&#x20;

![](/files/-Mbr3l1TFKg-1aChlhcT)

![](/files/-Mbr4mAMHsY4MB1n7ivD)

![](/files/-Mbr4xRGxGUncJZSOjKW)

* Execution model (when they place the tiles, inter-core communication? Or independent?)&#x20;

### Conclusion&#x20;

![](/files/-Mbr5jkhx97XmGOha3yF)

* Same gain to put in the multi-core CPUs?&#x20;
  * Use the principle&#x20;
  * Spread computation&#x20;
  * Hybrid systems?&#x20;
    * Taking benefits on tiling&#x20;
    * Along with offloading to GPUs&#x20;

Think about:

* Workloads, technologies, trends. Design something that benefits the problem they have.&#x20;


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://sliu583.gitbook.io/blog/specific-work/shivarams-group/group-papers/mosaic-processing-a-trillion-edge-graph-on-a-single-machine.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
