Graph Embeddings and Neural Networks

Dorylus: Affordable, Scalable, and Accurate GNN Training with Distributed CPU Servers and Serverless Threads

GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs

Marius: Learning Massive Graph Embeddings on a Single Machine

P3: Distributed Deep Graph Learning at Scale


  • Graph / Tensor tasks

    • Graph tasks: memory intensive

    • Tensor tasks: compute intensive

  • Motivation: GPUs not a good fit

    • Compute good

    • But poor scalability (limited memory)

  • Solution: CPUs

    • Not well suited to compute

  • Solution: GPU + CPU

    • Not cost-effective

    • GPU idle waiting for CPU

  • Key insight: serverless fits our goals

    • Large # of parallel threads

    • Low-cost, flexible pricing model

    • Fine-grained: pay only for the compute resources used

    • Achieve high performance-per-dollar (value)

  • Challenge

    • Weak CPU, limited memory

      • Separate tasks

    • Limited network bandwidth

      • Pipeline

        • Waiting: pipeline not fully utilized

  • Serverless Optimizations

    • Task fusion

    • Tensor re-materialization

    • Tune number of Lambdas
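
The "separate tasks" and "pipeline" points above can be illustrated with a small sketch. It is not Dorylus code: a local thread pool stands in for the Lambda threads, and the toy graph, `WEIGHT`, `gather`, `apply_vertex`, and `pipelined_layer` are all hypothetical names chosen for illustration. The idea it tries to show is that the memory-bound graph task (gathering neighbor features) stays on the "graph server" thread, while the compute-bound tensor task is handed to workers, so gathering the next vertex interval overlaps with tensor work on the previous one.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy graph: vertex -> in-neighbors, plus 2-dimensional features.
NEIGHBORS = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
FEATURES = {0: [1.0, 0.0], 1: [0.0, 2.0], 2: [-1.0, 1.0], 3: [3.0, -4.0]}
WEIGHT = [[1.0, -1.0], [0.5, 2.0]]  # hypothetical 2x2 layer weight


def gather(interval):
    """Graph task: sum in-neighbor features for each vertex in the interval."""
    out = {}
    for v in interval:
        acc = [0.0, 0.0]
        for u in NEIGHBORS[v]:
            acc = [a + f for a, f in zip(acc, FEATURES[u])]
        out[v] = acc
    return out


def apply_vertex(gathered):
    """Tensor task: multiply each gathered vector by WEIGHT, then apply ReLU."""
    out = {}
    for v, h in gathered.items():
        z = [sum(h[i] * WEIGHT[i][j] for i in range(len(h)))
             for j in range(len(WEIGHT[0]))]
        out[v] = [max(0.0, x) for x in z]
    return out


def pipelined_layer(intervals, num_workers=2):
    """Gather interval k+1 on this thread while workers process interval k."""
    result = {}
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # submit() returns immediately, so the next gather() call overlaps
        # with the tensor work already handed to the pool.
        futures = [pool.submit(apply_vertex, gather(iv)) for iv in intervals]
        for fut in futures:
            result.update(fut.result())
    return result
```

Calling `pipelined_layer([[0, 1], [2, 3]])` should give the same embeddings as the unpipelined `apply_vertex(gather([0, 1, 2, 3]))`; only the scheduling differs. The number of workers plays the role of the "number of Lambdas" knob, under the assumption that each worker models one serverless thread.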


  • GNN

    • High classification accuracy

    • Better generality

    • Lower computation complexity

    • Easier parallelism

  • GNN: combine graph operations with tensor operations

  • Acceleration Solutions

    • Graph Processing Framework

    • Deep Learning Frameworks

  • Input Extraction (graph)

    • Node degree

    • Embedding Dimensionality

    • Graph community

  • Input extraction (GNN model information)

    • Order of aggregation and update

    • Aggregation type

  • 2D workload management
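
One reason the order of aggregation and update, together with the embedding dimensionality, can matter is sketched below. This is an illustrative toy, not GNNAdvisor code: it assumes sum aggregation and a purely linear update (no nonlinearity in between), in which case both orders give the same result, but the aggregation step touches fewer values when the update shrinks the dimension first. The graph, `X`, `W`, and the function names are hypothetical.

```python
# Hypothetical toy inputs: 3 nodes, input dimension 4, output dimension 2.
NEIGHBORS = [[1, 2], [0], [0, 1]]                  # in-neighbors per node
X = [[1, 2, 3, 4], [0, 1, 0, 1], [2, 0, 2, 0]]     # node embeddings
W = [[1, 0], [0, 1], [1, 1], [2, -1]]              # 4x2 update weight


def aggregate(neighbors, h):
    """Graph operation: sum neighbor embeddings; also count scalar additions."""
    dim = len(h[0])
    out, adds = [], 0
    for nbrs in neighbors:
        acc = [0] * dim
        for u in nbrs:
            for d in range(dim):
                acc[d] += h[u][d]
            adds += dim
        out.append(acc)
    return out, adds


def update(h, w):
    """Tensor operation: dense matrix multiply h @ w."""
    return [[sum(row[i] * w[i][j] for i in range(len(w)))
             for j in range(len(w[0]))] for row in h]


def aggregate_then_update():
    agg, adds = aggregate(NEIGHBORS, X)        # aggregation runs at dimension 4
    return update(agg, W), adds


def update_then_aggregate():
    return aggregate(NEIGHBORS, update(X, W))  # aggregation runs at dimension 2
```

With 5 edges in this toy graph, aggregating first costs 5 x 4 = 20 scalar additions while updating first costs 5 x 2 = 10, and the two outputs are identical under the linearity assumption. Models with a nonlinear step between the two phases do not have this freedom, which is why the order is treated as an input property of the model.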
