# Automatically Discovering Machine Learning Optimizations

* ML computation is increasingly exponentially&#x20;
  * Amount of computation in largest ML training doubles every 3.4 months&#x20;
* Computation Graph of ML Model&#x20;
  * Nodes: tensor algebra operator (e.g., convolution, matrix mul)
  * Edge: tensor (high-dim array)
* ML systems are a key ingredient in ML&#x20;
  * Distributed heterogenous hardware architectures&#x20;
* Challenges of building ML systems&#x20;
  * Increasingly diverse models&#x20;
  * Increasingly heterogeneous hardware&#x20;
  * Summary&#x20;
    * Massively parallel: tensor algebra is parallelizable in many dimensions&#x20;
    * New ML operators: continuously introduced into ML systems&#x20;
    * Heterogenous hardware: different processor kinds and complex memory hierarchy&#x20;
  * Limitations&#x20;
    * Hard to manually design&#x20;
    * Suboptimal performance&#x20;
    * Correctness relies on manual proof&#x20;
* Catalyst: automated learning systems lab
  * Mission: automate the design and optimization of ML systems&#x20;
    * Statistical and mathematical properties of ML algorithms&#x20;
    * Domain knowledge of modern hardware platforms&#x20;
  * Graph optimizations: TASO, PET --> automated generation of graph optimizations&#x20;
  * Parallelization optimizations: FlexFlow, Dorylus (automated discovery of fast parallelization strategy)&#x20;
  * Data layout & placement: Lux, Roc
* Advantages of automated ML systems&#x20;
  * Better runtime performance: discovering novel optimizations hard to manually designed, 3-10x speedup over manual optimizations&#x20;
  * Less engineering effort: code for discovering optimizations is generally much less than manual implementation of these optimizations&#x20;
  * Stronger correctness guarantees: using formal verification techniques&#x20;
* TASO, PET&#x20;
  * Current rule-based graph optimizations&#x20;
  * Challenges of graph optimizations for ML: infeasible to manually design graph optimizations for all cases&#x20;
    * Operators, graph architectures, hardware backends&#x20;
  * TASO: tensor algebra super-optimizer&#x20;
    * Automated generation and verification of graph substitutions&#x20;
    * Workflow&#x20;
      * Operator specifications&#x20;
      * Graph subset generator&#x20;
        * Huge amount of graphs&#x20;
        * Explicitly considering all pairs does not scale --> compute output fingerprints, pairs of graphs with identical fingerprint are candidate substitutions&#x20;
      * Candidate substitution&#x20;
      * Graph substitution verifier&#x20;
      * Graph optimizer&#x20;
        * Based on individual operators' cost
        * Measure the cost of each operator on hardware&#x20;
        * Cost-based backtracking search&#x20;
  * PET: first tensor program optimizer with partially equivalent transformations&#x20;
    * Larger optimization space&#x20;
    * Better performance&#x20;
    * Correctness&#x20;


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://sliu583.gitbook.io/blog/specific-work/seminar-and-talk/fall-21-reading-list/automatically-discovering-machine-learning-optimizations.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
