Learning in situ: a randomized experiment in video streaming
https://www.usenix.org/system/files/nsdi20-paper-yan.pdf
Video streaming dominates internet traffic
Adaptive bitrate (ABR) algorithms try to optimize users' quality of experience (QoE)
Decide the quality level of each video chunk to send
Primary goals: higher video quality, fewer stalls
Prior work: BBA, MPC, CS2P, Pensieve, Oboe
What does it take to create a learned ABR algorithm that robustly performs well over the wild internet?
Confidence intervals in video streaming are bigger than expected
Puffer: a live streaming platform running a randomized experiment
Randomized experiment: each session is randomly assigned one of the ABR schemes being tested
Prior work reported 10%-20% QoE benefits for new ABR algorithms, based on experiments lasting hours or days across a few network paths
At that effect size, about 2 years of data per scheme are needed to measure differences to 20% precision
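Why so much data? With heavy-tailed stall times, the sample mean converges slowly and its confidence interval stays wide. A minimal sketch (numbers are illustrative, not Puffer's data): a percentile bootstrap on a stall-time sample where most streams never stall.

```python
import random
import statistics

def bootstrap_ci(samples, stat=statistics.mean, n_resamples=2000, alpha=0.05):
    """Percentile-bootstrap confidence interval for a statistic."""
    estimates = []
    for _ in range(n_resamples):
        resample = random.choices(samples, k=len(samples))  # sample with replacement
        estimates.append(stat(resample))
    estimates.sort()
    lo = estimates[int((alpha / 2) * n_resamples)]
    hi = estimates[int((1 - alpha / 2) * n_resamples)]
    return lo, hi

# Heavy-tailed stall times: most streams never stall, a few stall a lot.
random.seed(0)
stalls = [0.0] * 960 + [random.expovariate(1 / 30) for _ in range(40)]
lo, hi = bootstrap_ci(stalls)
```

The resulting interval is wide relative to the mean, so small (10%-20%) differences between schemes stay inside the noise unless the sample is very large.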
Want higher video quality: y axis
And fewer stalls: x axis
Better QoE: up and to the right
Most schemes are statistically indistinguishable (differences are within the noise)
Reason: the Internet is noisier and more heavy-tailed than prior evaluations assumed
Only 4% of the 637,189 streams had any stalls
Distributions of throughputs and watch times are highly skewed
A simple (buffer-based) ABR algorithm performs better than expected
BBA [SIGCOMM '14]: simple buffer-based ABR algorithm
Buffer: the player pre-downloads video chunks into a temporary cache before and during playback, so the video keeps playing while later chunks are still loading
Research baseline
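The core BBA idea can be sketched in a few lines: choose the bitrate purely from current buffer occupancy, with no throughput prediction at all. This is a simplified sketch, not the paper's exact BBA; the reservoir/cushion parameters and the rate ladder below are illustrative.

```python
def bba_choose_bitrate(buffer_s, bitrates, reservoir_s=5.0, cushion_s=10.0):
    """Buffer-based rate selection (simplified BBA sketch).

    Below the reservoir, pick the lowest rate; above reservoir + cushion,
    the highest; in between, interpolate linearly over the rate ladder.
    """
    rates = sorted(bitrates)
    if buffer_s <= reservoir_s:
        return rates[0]
    if buffer_s >= reservoir_s + cushion_s:
        return rates[-1]
    frac = (buffer_s - reservoir_s) / cushion_s
    return rates[int(frac * (len(rates) - 1))]

ladder = [300, 750, 1850, 2850, 4300]  # kbps, illustrative ladder
```

A nearly empty buffer (say 2 s) yields the lowest rate; a full buffer (20 s) the highest. The appeal is that buffer occupancy is observed directly, so the scheme makes no assumptions about future throughput.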
Other work: MPC-HM [SIGCOMM '15]
Predicts throughput using the harmonic mean (HM) of past throughputs
assumes throughput can be modeled with HM
assumes transmission time = chunk size / predicted throughput?
But observed throughput actually varies with chunk size, due to congestion control and varying bandwidth
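The two MPC-HM assumptions above can be made concrete. A minimal sketch (not MPC-HM's actual implementation):

```python
def harmonic_mean_throughput(past_tputs):
    """MPC-HM-style predictor: harmonic mean of recent throughput samples,
    which down-weights transient spikes relative to the arithmetic mean."""
    return len(past_tputs) / sum(1.0 / t for t in past_tputs)

def predicted_transmission_time(chunk_bytes, past_tputs):
    # Assumption: time = size / predicted throughput (a single point estimate),
    # i.e. throughput is independent of chunk size -- which the paper shows
    # does not hold in practice.
    return chunk_bytes / harmonic_mean_throughput(past_tputs)
```

E.g. past throughputs of 1 and 4 units give a harmonic mean of 1.6, so a 16-unit chunk is predicted to take 10 time units; the real transfer may deviate because congestion control behaves differently for small and large chunks.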
Other work: Pensieve [SIGCOMM '17]
Reinforcement learning
Requires network simulators as training environments
Assumes training in simulation generalizes to wild Internet?
Comparison
No existing algorithm outperformed BBA on both axes (quality and stalls)
Algorithms that make fewer assumptions are perhaps more general
Our way of outperforming existing schemes is learning in situ (i.e., in place, in the actual deployment environment)
Fugu uses classical model predictive control
Fugu replaces the throughput predictor in MPC-HM with a transmission time predictor
NN-based: predict how long it takes for a client to receive a given chunk. "How long would each chunk take?"
Input:
Size and transmission times of past chunks
Size of the chunk to be transmitted (this input is what makes it a transmission-time predictor rather than a throughput predictor)
Low-level TCP statistics (min RTT, RTT, CWND, packets in flight, delivery rate)
Output:
probability distribution over transmission time (not a point estimate)
Useful for maximizing expected QoE
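A distribution (rather than a point estimate) lets the controller weigh the risk of stalling against video quality. A hedged sketch of the idea, using a hypothetical QoE objective and a discretized transmission-time distribution (the names, numbers, and penalty term are illustrative, not Fugu's actual formulation):

```python
def expected_qoe(tt_dist, quality, buffer_s, stall_penalty=100.0):
    """Expected QoE for sending one chunk, given a discretized
    transmission-time distribution [(time_s, prob), ...].

    Hypothetical objective: chunk quality minus a penalty proportional
    to the expected stall (transmission time beyond the current buffer).
    """
    exp_stall = sum(p * max(0.0, t - buffer_s) for t, p in tt_dist)
    return quality - stall_penalty * exp_stall

# Two candidate encodings of the same chunk:
dist_low  = [(0.5, 0.8), (1.5, 0.2)]   # small chunk: almost always arrives in time
dist_high = [(2.0, 0.8), (6.0, 0.2)]   # big chunk: 20% chance of overshooting the buffer
qoe_low = expected_qoe(dist_low, quality=1.0, buffer_s=3.0)
qoe_high = expected_qoe(dist_high, quality=3.0, buffer_s=3.0)
```

Here the higher-quality chunk loses despite its better picture, because the tail of its transmission-time distribution crosses the buffer level. A point estimate (e.g. the mean time of 2.8 s, under the buffer) would have hidden that risk.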
Training: supervised learning in situ (in place) on real data from deployment environment
Chunk-by-chunk series of each individual video stream
Chunk i: size, timestamp sent, timestamp acknowledged, TCP statistics right before sending
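The per-chunk training record described above can be sketched as a small data type. Field names are illustrative, not Puffer's actual schema; the label for supervised learning is the measured transmission time.

```python
from dataclasses import dataclass

@dataclass
class ChunkRecord:
    """One supervised training example, logged in situ per chunk sent."""
    size_bytes: int
    ts_sent: float            # timestamp the chunk was sent
    ts_acked: float           # timestamp the chunk was acknowledged
    min_rtt_us: int           # TCP statistics snapshot right before sending
    rtt_us: int
    cwnd_pkts: int
    pkts_in_flight: int
    delivery_rate_bps: float

    @property
    def transmission_time(self) -> float:
        # the supervised-learning label: how long the chunk actually took
        return self.ts_acked - self.ts_sent

rec = ChunkRecord(size_bytes=500_000, ts_sent=1.0, ts_acked=1.5,
                  min_rtt_us=20_000, rtt_us=35_000, cwnd_pkts=40,
                  pkts_in_flight=12, delivery_rate_bps=8_000_000.0)
```

Because the label is observed on the real path a few seconds after the features are logged, no simulator is needed to generate training data.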
Learning in situ does not replay throughput traces or require network simulators
We don't know how to faithfully simulate the Internet
Retrained baseline: Pensieve (retrained with Puffer traces)
Learning in situ: learn directly from the real environment, no simulation needed