# On the Use of ML for Blackbox System Performance Prediction

* Performance prediction is increasingly important
  * Optimization, capacity planning, SLO-aware scheduling
  * f(parameters) --> performance
* Challenges
  * Accurate: precise predictions
  * Simple / easy-to-use: no in-depth understanding of the system required
  * General: works across a spectrum of workloads and applications
* Can ML provide an accurate, general, and simple performance predictor?
* This paper: a systematic and broad study of ML-based performance prediction
* ML for system perf. prediction?
  * Start with the best-case scenario
  * The best-case (BC) test
    * Given a configuration's parameters, learn to predict its performance under the most favorable assumptions
  * ML assumptions
    * One-feature-at-a-time: e.g., vary P2, keeping P1, P3, ..., Pk fixed
    * Seen-config: test only on parameter values that also appear in the training set
  * System assumptions
    * No-contention: dedicated EC2 instances, isolated experiments
    * Identical-inputs: same input data for a given input dataset size
* Applications and models
  * ML models: LR, RF, SVM, NN
* Metrics and predictors
  * Accuracy metric: rMSRE (root mean squared relative error)
  * ML predictors --> best-of-model / BoM-err
    * rMSRE of the most accurate model
  * Oracle predictor --> O-err
    * Allow the oracle to peek at both the error function and the test data
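The two metrics above can be sketched in a few lines. A minimal sketch, assuming rMSRE is the root of the mean squared relative error (errors normalized by the true value, so the metric is scale-free across applications):

```python
import math

def rmsre(y_true, y_pred):
    """Root mean squared relative error: each error is divided by the
    true value before squaring and averaging."""
    rel = [((p - t) / t) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(rel) / len(rel))

def best_of_model_err(y_true, preds_by_model):
    """BoM-err: the rMSRE of whichever model is most accurate on this
    application (preds_by_model maps model name -> predictions)."""
    return min(rmsre(y_true, p) for p in preds_by_model.values())

# Two predictions each off by 10% of the true value -> rMSRE of 0.1.
print(rmsre([10.0, 10.0], [11.0, 9.0]))  # 0.1
```

The oracle's O-err is a lower bound in the same units: it picks, per test point, the prediction that minimizes this error function given the test data itself.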
* Best-case test results
  * High oracle error even under our best-case setup!
* Methodology
  * Identify the root cause of error for each application
  * Fix?
    * With system modifications
      * For all applications, oracle error is now well within 10%!
      * Best-of-model error likewise
  * Trade-off between predictability and other design goals!
    * E.g., disabling an optimization can improve prediction accuracy but degrade performance
  * These fixes require in-depth understanding of the application and reasoning about the trade-offs!
* Embrace variability: probabilistic predictions
  * Idea: predict a mixture distribution instead of a single value
  * Then, use the modes of the distribution as the top-k prediction values
  * ML: mixture density networks and probabilistic random forests
  * Significant decrease in BoM-err with top-3 (k=3) predictions!
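The top-k idea above can be sketched as follows. This is a simplified illustration, not the paper's mixture-density-network implementation: it assumes the learned mixture is summarized as (weight, mode) pairs, and scores a top-k prediction by its closest candidate:

```python
def top_k_predictions(components, k=3):
    """Given mixture components as (weight, mode) pairs — e.g. the
    component means of a mixture density network — return the modes of
    the k highest-weight components as the top-k predictions."""
    ranked = sorted(components, key=lambda wm: wm[0], reverse=True)
    return [mode for _, mode in ranked[:k]]

def best_of_k_rel_err(y_true, top_k):
    """Score a top-k prediction by its closest candidate: any of the k
    modes may correspond to the system's actual operating regime."""
    return min(abs(p - y_true) / y_true for p in top_k)

# Hypothetical bimodal runtime (e.g. fast path vs. slow path): a single
# point prediction would split the difference; top-2 captures both modes.
comps = [(0.6, 12.0), (0.3, 45.0), (0.1, 30.0)]
print(top_k_predictions(comps, k=2))  # [12.0, 45.0]
print(best_of_k_rel_err(44.0, top_k_predictions(comps, k=2)))
```

This matches the intuition behind the BoM-err drop: when performance is genuinely multi-modal, letting the predictor return several candidate modes sidesteps the irreducible error of a single-value prediction.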
* So far, best-case setup only
  * Can we go beyond it?
  * Prediction errors can remain high if the underlying performance trend is difficult to learn
* Conclusion
  * Taken "out of the box", many apps exhibit a surprisingly high degree of irreducible error
  * We can significantly improve accuracy if we accept a loss of simplicity and/or generality
    * Modify applications
    * Modify predictions
    * ... but these fixes don't work in all cases
  * Need a more nuanced methodology for applying ML to performance prediction
