# GAIA: A System for Interactive Analysis on Distributed Graphs Using a High-Level Language

### Graph data are prevalent

### Traversal on Property Graphs

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MYAXF7zYTUkvEweTnoG%2F-MYAYh7BJRshkO-7_HAt%2Fimage.png?alt=media\&token=ba2ff5b2-7314-4970-a5dd-b6b3d80a8d82)

### Challenges of Large Graph Traversal at Alibaba

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MYAXF7zYTUkvEweTnoG%2F-MYAYslUWmKZ_QQRnbEC%2Fimage.png?alt=media\&token=9a172670-34fe-4255-bfb4-e9ed09d447a7)

### Current State of the Art

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MYAXF7zYTUkvEweTnoG%2F-MYAZGFq4qNOlvzejeM0%2Fimage.png?alt=media\&token=e4f8013a-5e7b-402f-b1f3-ef6c20b75085)

### Data-Parallel Execution of Gremlin

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MYAXF7zYTUkvEweTnoG%2F-MYA_F7yeqWCFhStch_L%2Fimage.png?alt=media\&token=bd5c9143-dadd-4b09-ae95-39f8d7f8d8d3)

### SCOPE Abstraction

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MYAXF7zYTUkvEweTnoG%2F-MYAZwl4HyNjfc3YmRWC%2Fimage.png?alt=media\&token=06e91c4e-3404-41df-acef-c4175b43d976)

### Compilation of Control-Flow Constructs

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MYAXF7zYTUkvEweTnoG%2F-MYA_91JO1ZJG51aX7q0%2Fimage.png?alt=media\&token=c9d63bca-ffa9-4ee6-bda1-8d0d67d5dda7)

### Dynamic Dependency Tracking

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MYAXF7zYTUkvEweTnoG%2F-MYA_NsW7YNSWMSArl-h%2Fimage.png?alt=media\&token=2d874789-cd68-4437-ac46-64c2d121d132)

### Distributed Execution and Optimizations

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MYAXF7zYTUkvEweTnoG%2F-MYA_YlUEeskd2WZe49P%2Fimage.png?alt=media\&token=675b6f29-d7f6-4075-8ece-a97cb74e05e0)

### Implementation and Evaluation

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MYAXF7zYTUkvEweTnoG%2F-MYAaCzcfMq80sntxjSi%2Fimage.png?alt=media\&token=e136a8f5-8563-4895-a492-856a948550af)

### Remarks

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MVORxAomcgtzVVUqmws%2F-MYAaF2GFvi6A0Vr_RIs%2F-MYAaSi9zfo1gA7ZBx7q%2Fimage.png?alt=media\&token=67e98aec-ee77-4515-bc50-fb555374ba66)

### Questions

* Leverage multiple storage layers
  * Now: in-memory store (immutable graph)
  * Production: enterprise feature
    * Dynamic graph for updates
* Graph consistency
  * Snapshot consistency is good enough
* Why pick this language?
  * Users are business experts, not developers
  * Gremlin: lets users work with the graph
    * Relatively easy for the target users
