On the Use of ML for Blackbox System Performance Prediction

https://www.usenix.org/conference/nsdi21/presentation/fu

  • Performance prediction is increasingly important

    • Optimization, capacity planning, SLO-aware scheduling

    • F(parameters) --> performance
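
    • A minimal sketch of this framing (not from the paper), assuming hypothetical configuration features and a scikit-learn regressor:

      # Blackbox framing: learn F(parameters) -> performance from measured runs.
      # Feature names and measurements here are illustrative placeholders.
      import numpy as np
      from sklearn.ensemble import RandomForestRegressor

      # Each row: (num_workers, memory_gb, input_size_gb); target: runtime in seconds.
      X_train = np.array([[4, 8, 10], [4, 8, 20], [8, 16, 10], [8, 16, 20]])
      y_train = np.array([120.0, 230.0, 70.0, 135.0])

      model = RandomForestRegressor(n_estimators=100, random_state=0)
      model.fit(X_train, y_train)

      # Predict performance for another configuration.
      print(model.predict(np.array([[8, 16, 15]])))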

  • Challenges

    • Accurate: precise predictions

    • Simple / easy-to-use: in-depth understanding of the systems not required

    • General: works across a spectrum of workloads and applications

  • Can ML provide an accurate, general, and simple performance predictor?

  • This paper: a systematic and broad study on performance prediction

  • ML for system perf. prediction?

    • Start with the best-case scenario

    • The best-case (BC) test

      • Given the configuration parameters, learn the parameter-to-performance mapping

    • ML assumptions

      • One-feature-at-a-time: e.g., vary P2, keeping P1, P3, ..., Pk fixed

      • Seen-config: test configurations also appear in the training set (see the dataset sketch below)

    • System assumptions

      • No-contention: dedicated EC2 instances, isolated experiments

      • Identical-inputs: same input data for a given input dataset size
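
    • A simplified sketch of generating such a best-case dataset (illustrative only; parameter names, values, and the measure() stub are hypothetical): each sweep varies one parameter, every configuration is run several times, and test configurations also appear in the training set:

      # Best-case dataset sketch: one-feature-at-a-time sweeps, seen configurations.
      import random

      defaults = {"num_workers": 4, "memory_gb": 8, "input_size_gb": 10}
      sweeps = {"num_workers": [2, 4, 8, 16], "input_size_gb": [5, 10, 20, 40]}

      def measure(config):
          # Placeholder for running the system on dedicated, isolated hardware
          # with identical inputs and timing the run.
          return 100.0 * config["input_size_gb"] / config["num_workers"] + random.gauss(0, 2)

      train, test = [], []
      for param, values in sweeps.items():                            # one-feature-at-a-time
          for v in values:
              config = dict(defaults, **{param: v})
              samples = [(config, measure(config)) for _ in range(5)]  # repeated runs
              train += samples[:4]                                    # seen-config: every test
              test += samples[4:]                                     # config is also in training

      print(len(train), len(test))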

  • Applications and models

    • ML models: LR, RF, SVM, kNN, NN

  • Metrics and predictors

    • Accuracy metric: rMSRE (root mean squared relative error; see the formula below)

    • ML predictors --> best-of-model / BoM-err

      • rMSRE of the most accurate model

    • Oracle predictor --> O-err

      • Allow the oracle to peek at both the error function and the test data
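
    • Reading rMSRE as root mean squared relative error, one formulation consistent with these notes (the paper gives the exact definition) over a test set T is:

      \mathrm{rMSRE} = \sqrt{\frac{1}{|T|} \sum_{(x_i, y_i) \in T} \left( \frac{f(x_i) - y_i}{y_i} \right)^{2}}

    • Under this reading, BoM-err is the rMSRE of whichever ML model does best, while O-err is the rMSRE of an oracle that knows the error function and the test measurements; for repeated measurements y_1, ..., y_n of a single configuration, the rMSRE-minimizing point prediction is (a short derivation, not quoted from the paper):

      p^{*} = \arg\min_{p} \sum_{i} \left( \frac{p - y_i}{y_i} \right)^{2} = \frac{\sum_i 1/y_i}{\sum_i 1/y_i^{2}}

    • Any remaining O-err therefore reflects run-to-run variability in the measurements themselves, not the choice of model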

  • Best case test results

    • High oracle error even under our best-case setup!

  • Methodology

    • Root-cause the high prediction error for each application

    • Fix?

      • With system modifications

        • For all applications, oracle error is now well within 10%!

        • Best-of-model error likewise

    • Trade-off between predictability and other design goals!

    • E.g., disabling an optimization can lead to higher prediction accuracy but degraded performance

    • These fixes require in-depth understanding of the application and reasoning about the trade-offs!

  • Embrace variability: probabilistic predictions

    • Idea: predicting a mixture distribution instead of a single value

    • Then, use the "modes" of each distribution as the "top-k" prediction values (see the sketch below)

    • ML: mixture density networks and probabilistic random forests

    • Significant decrease in BoM-err with top-3 (k=3) predictions!
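
    • A simplified stand-in for this idea (not the paper's MDN / probabilistic random forest code, and using synthetic data): fit a small Gaussian mixture to repeated measurements of one configuration, report the k component means with the largest weights as the top-k predictions, and score against the best of those k values:

      # Top-k probabilistic prediction sketch using a Gaussian mixture.
      import numpy as np
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(0)
      # Bimodal runtimes for one configuration, e.g., a cached vs. uncached path.
      runtimes = np.concatenate([rng.normal(100, 3, 50), rng.normal(160, 5, 50)])

      gmm = GaussianMixture(n_components=2, random_state=0).fit(runtimes.reshape(-1, 1))

      k = 2
      order = np.argsort(gmm.weights_)[::-1][:k]
      top_k = gmm.means_[order].ravel()          # mixture modes used as top-k predictions

      # Top-k relative error for a held-out run: take the best of the k candidates.
      actual = 158.0
      print(top_k, np.min(np.abs(top_k - actual) / actual))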

  • So far, best-case setup only

    • Go beyond?

    • Prediction errors can remain high if the underlying performance trend is difficult to learn

  • Conclusion

    • Taken "out of the box", many apps exhibit a surprisingly high degree of irreducible error

    • We can significantly improve accuracy if we accept some loss of simplicity and/or generality

      • Modify applications

      • Modify predictions

      • ... but they don't work in all cases

    • Need a more nuanced methodology for applying ML
