Support sparse computations in ML

Saman Amarasinghe (MIT)

History of Computing

  • The Gilded Era of Computing

    • Driven by Moore's Law

    • Performance was free

    • Software Engineering Ruled

      • High level languages, abstraction, layering

      • Innovation and complex software systems, but coupled with excess bloat

  • The Trustfund Era of Computing

    • Moore's Law is dead

    • How bad is the bloat?

  • The Resurgent Era of Computing

    • Why sparsity matters in machine learning

      • Existing ML problems are sparse

        • Replacing dense solvers with sparse ones can yield a huge benefit (see the sketch after this list)

      • Next-generation ML problems can be sparse

        • GNNs

      • Training ML models

        • RW
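
A minimal sketch (not from the talk; the size and density are hypothetical) of the point above about replacing dense solvers with sparse ones: the sparse representation only does work proportional to the number of nonzeros.

    import numpy as np
    import scipy.sparse as sp

    n = 2_000
    density = 0.001                       # 0.1% of entries are nonzero

    A_sparse = sp.random(n, n, density=density, format="csr", dtype=np.float64)
    A_dense = A_sparse.toarray()          # the same matrix, stored densely
    x = np.random.rand(n)

    y_dense = A_dense @ x                 # ~2*n*n flops, regardless of sparsity
    y_sparse = A_sparse @ x               # ~2*nnz flops, touching only nonzeros

    assert np.allclose(y_dense, y_sparse)
    print("dense flops :", 2 * n * n)          # 8,000,000
    print("sparse flops:", 2 * A_sparse.nnz)   # ~8,000, about 1000x less work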

Today

  • The world is built for dense computation

Hardware utilization

Dense

  • Peak Performance (GEMM)

    • 70-80% of peak on CPUs

    • 80-90% of peak on GPUs

  • Optimization

    • Prefetching, branch prediction, TLB, caches

Sparse

  • Peak performance

    • <10% of peak (see the rough measurement sketch below)
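
A rough measurement sketch of the gap above (not from the talk; sizes and density are assumed, and the achieved numbers are machine dependent): it compares the floating-point throughput reached by dense GEMM (BLAS through NumPy) with sparse matrix-vector multiply (SciPy CSR). Dividing each by the machine's theoretical peak gives utilization figures of roughly the kind quoted above.

    import time
    import numpy as np
    import scipy.sparse as sp

    def gflops(flops, seconds):
        return flops / seconds / 1e9

    # Dense GEMM: C = A @ B, about 2*n^3 flops, runs through an optimized BLAS.
    n = 2048
    A, B = np.random.rand(n, n), np.random.rand(n, n)
    t0 = time.perf_counter(); C = A @ B; t_gemm = time.perf_counter() - t0

    # Sparse SpMV: y = M @ x, about 2*nnz flops, dominated by irregular memory access.
    m = 100_000
    M = sp.random(m, m, density=1e-4, format="csr", dtype=np.float64)
    x = np.random.rand(m)
    t0 = time.perf_counter(); y = M @ x; t_spmv = time.perf_counter() - t0

    print(f"dense GEMM : {gflops(2 * n**3, t_gemm):8.1f} GFLOP/s")
    print(f"sparse SpMV: {gflops(2 * M.nnz, t_spmv):8.1f} GFLOP/s")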

Programming System

Dense

  • Abstractions that work across different algorithms (dense linear algebra, image processing, deep learning, ...)

    • BLAS, Halide, TensorFlow

    • Optimizing Compilers

Sparse

  • What abstractions?

Tensor

  • Too many tensor kernels for a fixed-function library

  • Many compound (more-than-binary) expressions must be computed in a single kernel rather than composed from binary ones

    • Sampled Dense-Dense Matrix Multiplication (SDDMM); see the sketch below
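
A sketch of why SDDMM, A(i,j) = B(i,j) * sum_k C(i,k) * D(k,j) with B sparse, wants a single fused kernel (shapes and density below are hypothetical): composed from binary library calls it materializes the full dense product C @ D, while a fused loop computes each dot product only where B has a nonzero.

    import numpy as np
    import scipy.sparse as sp

    m, n, k = 1000, 1200, 64
    B = sp.random(m, n, density=0.001, format="csr", dtype=np.float64)
    C, D = np.random.rand(m, k), np.random.rand(k, n)

    # Composed from binary kernels: O(m*n*k) work plus an m x n dense temporary.
    A_composed = B.toarray() * (C @ D)

    # Fused SDDMM kernel: O(nnz(B) * k) work, no dense temporary.
    A_fused = B.copy()
    for i in range(m):
        for p in range(B.indptr[i], B.indptr[i + 1]):   # nonzeros of row i
            j = B.indices[p]
            A_fused.data[p] = B.data[p] * (C[i, :] @ D[:, j])

    assert np.allclose(A_composed, A_fused.toarray())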

The Tensor Algebra Compiler (Taco)

  • Challenges

    • Irregular data structures

      • Hierarchical Storage Abstractions (see the first sketch after this list)

    • Sparse iteration space with limited O(1) access

      • Sparse iteration graphs

      • Code generation from iteration graph

    • Avoid wasted work and iterations

      • Co-iteration code generation (see the second sketch after this list)

    • Optimize Parallelism and Locality

      • Scheduling language
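
A Python sketch of the hierarchical storage abstraction and of the kind of loop nest Taco generates for y(i) = A(i,j) * x(j) when A's row dimension is stored dense and its column dimension compressed, i.e. CSR (the matrix here is made up, and Taco itself emits C code, not Python):

    import numpy as np

    # CSR seen as two per-dimension levels:
    #   row level:    dense, iterated as range(num_rows)
    #   column level: compressed, a pos array delimits each row's segment in the
    #                 crd (coordinate) and vals arrays.
    A = np.array([[0., 2., 0.],
                  [1., 0., 3.],
                  [0., 0., 4.]])
    num_rows, num_cols = A.shape
    pos, crd, vals = [0], [], []
    for i in range(num_rows):
        for j in range(num_cols):
            if A[i, j] != 0.0:
                crd.append(j)
                vals.append(A[i, j])
        pos.append(len(crd))

    x = np.array([1., 2., 3.])
    y = np.zeros(num_rows)

    # Generated-style kernel: the dense level becomes a plain counted loop, the
    # compressed level becomes a loop over its pos[i]..pos[i+1] segment.
    for i in range(num_rows):
        for p in range(pos[i], pos[i + 1]):
            j = crd[p]
            y[i] += vals[p] * x[j]

    assert np.allclose(y, A @ x)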
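
A second sketch, this time of co-iteration for z(i) = a(i) * b(i) with both operands sparse (the input vectors are made up): a zero in either operand zeroes the result, so the generated loop merges the two coordinate lists and visits only their intersection instead of scanning the full dimension, which is how wasted work is avoided.

    a_crd, a_val = [1, 4, 7, 9], [1.0, 2.0, 3.0, 4.0]   # sparse vector a
    b_crd, b_val = [0, 4, 9],    [5.0, 6.0, 7.0]        # sparse vector b

    z_crd, z_val = [], []
    pa, pb = 0, 0
    while pa < len(a_crd) and pb < len(b_crd):
        ia, ib = a_crd[pa], b_crd[pb]
        if ia == ib:                  # both operands nonzero: emit an output entry
            z_crd.append(ia)
            z_val.append(a_val[pa] * b_val[pb])
            pa += 1
            pb += 1
        elif ia < ib:                 # a lags: advance a only
            pa += 1
        else:                         # b lags: advance b only
            pb += 1

    print(z_crd, z_val)   # [4, 9] [12.0, 28.0]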
