Learning in situ: a randomized experiment in video streaming

https://www.usenix.org/system/files/nsdi20-paper-yan.pdf

Presentation

  • Video streaming dominates internet traffic

  • Adaptive bitrate (ABR) algorithms try to optimize users' quality of experience (QoE)

    • Decides the quality level of each video chunk to send

    • Primary goals: higher video quality, fewer stalls

    • Prior work: BBA, MPC, CS2P, Pensieve, Oboe

  • What does it take to create a learned ABR algorithm that robustly performs well over the wild internet?

    • Confidence intervals in video streaming are bigger than expected

      • Puffer: a live streaming platform running a randomized experiment

      • Randomized experiment: each session is randomly assigned one of the ABR schemes being tested

      • Prior ABR work reported gains of 10-20%, based on experiments lasting hours or days between a few network nodes

      • About 2 years of data per scheme are needed to measure stall ratio with 20% precision

        • Want higher video quality: y axis

        • And fewer stalls: x axis

        • Better QoE: up and to the right

        • Most schemes are statistically indistinguishable (noise)

      • Reason: Internet is way more noisy and heavy-tailed than we thought

        • Only 4% of the 637,189 streams had any stalls

        • Distributions of throughputs and watch times are highly skewed
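As a rough illustration of why rare stalls demand so much data: even a naive binomial sample-size bound, which ignores the heavy tails that make the real requirement far larger, already needs thousands of streams to pin down a ~4% stall rate within ±20%. This is an illustrative sketch, not a calculation from the paper:

```python
import math

def samples_needed(p, rel_err, z=1.96):
    """Naive binomial sample size to estimate proportion p within
    +/- rel_err * p at ~95% confidence. Real streams are worse:
    heavy-tailed watch times and throughputs inflate the variance
    well beyond this bound."""
    e = rel_err * p
    return math.ceil(z * z * p * (1 - p) / (e * e))

# ~4% of streams stall; estimate that rate within +/-20% relative error
print(samples_needed(0.04, 0.20))  # -> 2305
```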

    • A simple (buffer-based) ABR algorithm performs better than expected

      • BBA [SIGCOMM '14]: simple buffer-based ABR algorithm

        • Buffer: holds pre-downloaded video chunks in a temporary cache on the client, so playback can continue while later chunks are still being fetched

      • Research baseline
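The buffer-based idea can be sketched as a simple threshold rule over the client's buffer level (a minimal sketch; the reservoir thresholds and bitrate ladder below are hypothetical, not BBA's published parameters):

```python
def bba_select(buffer_s, bitrates, low_s=5.0, high_s=15.0):
    """Pick a bitrate from the buffer level alone (no throughput estimate).

    Below the low reservoir, play it safe with the lowest bitrate;
    above the high mark, use the highest; in between, interpolate
    linearly across the bitrate ladder.
    """
    if buffer_s <= low_s:
        return bitrates[0]
    if buffer_s >= high_s:
        return bitrates[-1]
    frac = (buffer_s - low_s) / (high_s - low_s)
    return bitrates[int(frac * (len(bitrates) - 1))]

ladder = [300, 750, 1200, 2400, 4800]  # kbps, hypothetical ladder
print(bba_select(3.0, ladder))   # low buffer  -> 300
print(bba_select(10.0, ladder))  # mid buffer  -> 1200
print(bba_select(20.0, ladder))  # full buffer -> 4800
```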

      • Other work: MPC-HM [SIGCOMM '15]

        • Predicts throughput using the harmonic mean (HM) of past throughputs

          • assumes throughput can be modeled with HM

          • assumes transmission time = chunk size / predicted throughput

        • But observed throughput actually varies with chunk size, due to congestion control and varying bandwidth
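A minimal sketch of the MPC-HM-style prediction (the throughput history and chunk size below are hypothetical). The harmonic mean is dominated by the slowest samples, making it a conservative estimate, and predicted transmission time is chunk size divided by predicted throughput:

```python
def harmonic_mean(xs):
    """Harmonic mean of past throughput samples; a single slow
    sample pulls the estimate down sharply."""
    return len(xs) / sum(1.0 / x for x in xs)

past_tput = [4.0, 5.0, 1.0, 6.0]   # Mbps, hypothetical history
pred = harmonic_mean(past_tput)     # dragged down by the 1 Mbps sample
chunk_bits = 8.0                    # Mbit, hypothetical chunk size
est_time = chunk_bits / pred        # predicted transmission time (s)
print(round(est_time, 2))           # -> 3.23
```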

      • Other work: Pensieve [SIGCOMM '17]

        • Reinforcement learning

          • Requires network simulators as training environments

          • Assumes training in simulation generalizes to wild Internet?

      • Comparison

        • No prior algorithm performs better than BBA on both axes (quality and stalls) at once

      • Algorithms that make fewer assumptions are perhaps more general

    • Our way of outperforming existing schemes is learning in situ (i.e. in place on the actual deployment environment)

      • Fugu uses classical model predictive control

      • Fugu replaces the throughput predictor in MPC-HM with a transmission time predictor

        • NN-based: predict how long it takes for a client to receive a given chunk. "How long would each chunk take?"

          • Input:

            • Size and transmission times of past chunks

            • Size of a chunk to be transmitted (not a throughput predictor)

            • Low-level TCP statistics (min RTT, RTT, CWND, packets in flight, delivery rate)

          • Output:

            • probability distribution over transmission time (not a point estimate)

              • Useful for maximizing expected QoE

        • Training: supervised learning in situ (in place) on real data from deployment environment

          • Chunk-by-chunk series of each individual video stream

          • Chunk i: size, timestamp sent, timestamp acknowledged, TCP statistics right before sending

        • Learning in situ does not replay throughput traces or require network simulators

          • We don't know how to faithfully simulate the Internet
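Why a distribution beats a point estimate: with a per-chunk distribution over transmission times, the controller can weigh a small chance of a long stall against a quality gain. The sketch below is illustrative only, assuming a toy QoE of quality minus a linear stall penalty; it is not Fugu's actual objective or controller:

```python
def pick_chunk(buffer_s, candidates, stall_penalty=10.0):
    """Pick the chunk with the highest *expected* QoE, integrating over
    a distribution of transmission times instead of a point estimate.

    candidates: list of (quality, [(prob, transmit_time_s), ...]) pairs.
    A transmission longer than the current buffer causes a stall.
    """
    best = None
    for quality, dist in candidates:
        exp_stall = sum(p * max(0.0, t - buffer_s) for p, t in dist)
        score = quality - stall_penalty * exp_stall
        if best is None or score > best[0]:
            best = (score, quality)
    return best[1]

# Hypothetical: the high-quality chunk has a heavy-tailed transfer time.
cands = [
    (3.0, [(0.9, 1.0), (0.1, 8.0)]),  # high quality, 10% chance of 8 s
    (2.0, [(1.0, 0.5)]),              # safe choice
]
print(pick_chunk(2.0, cands))   # -> 2.0 (short buffer: risky chunk loses)
print(pick_chunk(10.0, cands))  # -> 3.0 (long buffer absorbs the tail)
```

With a short buffer the risky high-quality chunk loses in expectation; with a long buffer the tail risk is absorbed and it wins.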

  • Refinement: Pensieve (retrained with Puffer traces)

  • Learning in situ: learn directly from the real environment (no simulator needed)
