# Workshop

### Learning at the Wireless Edge

* Vince Poor (Princeton University)
* Two aspects
  * Using ML to optimize communication networks
  * Learning on mobile devices (the focus of today's talk)
* Today's talk: focus on federated learning
  * Motivation
  * Federated learning over wireless channels (scheduling)
  * Privacy protection in federated learning (differential privacy)
  * Some research issues
* ML
  * Tremendous progress in recent years (more data, increased computational power)
  * Standard ML is implemented in a centralized manner (data center, cloud) with full access to the data
  * SOTA models: standard software tools, specialized hardware
* Wireless edge
  * Centralized ML is not suitable for many emerging applications
    * Self-driving cars, first responder networks, healthcare networks
  * What makes these applications different:
    * Data is born at the edge (phones, IoT devices)
    * Limited-capacity uplinks
    * Low latency & high reliability requirements
    * Data privacy / security
    * Scalability & locality
  * These constraints motivate moving learning closer to the network edge

![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MVORxAomcgtzVVUqmws%2Fuploads%2Fxnxfj4GlS3KITh4mB252%2Fimage.png?alt=media\&token=ceae8f9c-ff40-4749-a4be-790c39ccd27f)

* Federated learning over wireless channels (scheduling)
  * ![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MVORxAomcgtzVVUqmws%2Fuploads%2FdPMcYq2LXUKxqN7JrFuV%2Fimage.png?alt=media\&token=3d9f3722-65f8-4b4e-9e98-43af16520da6)
  * Wireless: communication to the AP goes through wireless channels
    * Shared and resource-constrained
      * Only a limited number of devices can be selected in each update round
      * Transmissions are unreliable due to interference
    * Questions
      * How should we schedule devices to update the trained weights?
      * How does interference affect training?
  * Scheduling mechanisms
    * Random scheduling: the aggregator selects N out of K users at random
    * Round robin: divide the K users into groups and poll each group in turn
    * Proportional fair: select the N users with the strongest SNRs
  * ![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MVORxAomcgtzVVUqmws%2Fuploads%2FuTSeZEQPnORCpzNQobHd%2Fimage.png?alt=media\&token=1f0af836-ad0f-4d64-8ecf-4899e0613d9e)
  * ![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MVORxAomcgtzVVUqmws%2Fuploads%2FEWvVIXLm3cqVsfYz6SAJ%2Fimage.png?alt=media\&token=eb3931ea-99f0-44c6-b775-079a9ad1ae01)
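The three scheduling policies above can be sketched as follows; the policy names and the `snrs` mapping are illustrative stand-ins, not the talk's notation:

```python
import random

def schedule(users, snrs, n, policy, round_idx=0):
    """Select n of K users for one federated update round.

    users: list of user ids; snrs: dict mapping user -> instantaneous SNR.
    """
    k = len(users)
    if policy == "random":
        # Random scheduling: the aggregator picks n users uniformly at random.
        return random.sample(users, n)
    if policy == "round_robin":
        # Round robin: partition users into ceil(K/n) groups and cycle through them.
        groups = [users[i:i + n] for i in range(0, k, n)]
        return groups[round_idx % len(groups)]
    if policy == "proportional_fair":
        # Proportional fair: pick the n users with the strongest SNRs.
        return sorted(users, key=lambda u: snrs[u], reverse=True)[:n]
    raise ValueError(f"unknown policy: {policy}")
```

Random scheduling is fair in expectation but channel-oblivious; proportional fair exploits good channels at the cost of starving weak ones, which is exactly the trade-off the convergence plots compare.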
* Design metric: age of information (AoI)
  * Age-based scheduling scheme for federated learning in mobile edge networks
  * An optimization algorithm runs in each iteration round
  * Wireless round robin
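A minimal sketch of the age-based idea, under the simplifying assumption that a device's age is the number of rounds since it was last scheduled (the actual scheme also accounts for channel conditions):

```python
def age_based_schedule(ages, n):
    """Pick the n devices whose updates are stalest (largest AoI).

    ages: dict mapping device -> rounds since last scheduled (mutated in place:
    scheduled devices reset to 0, all others age by one round).
    """
    ranked = sorted(ages, key=ages.get, reverse=True)
    chosen = ranked[:n]
    for d in ages:
        ages[d] = 0 if d in chosen else ages[d] + 1
    return chosen
```

With reliable channels and equal ages this degenerates to round robin, which matches the "wireless round robin" framing above.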
* Privacy in federated learning
  * "Privacy preserving": raw data remains on end-user devices
  * But end-user data can still be inferred from the parameter (or gradient) updates
  * Approach: use differential privacy to protect end-user data
    * Roughly: two otherwise-identical datasets, one containing a given record and one without it, cannot be distinguished (except with small probability) by a statistical query
  * Trade-off between privacy and accuracy
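A minimal sketch of one common recipe for differentially private updates (clip, then add Gaussian noise); the clipping norm and noise multiplier are illustrative, and calibrating `sigma` to a target (epsilon, delta) is omitted:

```python
import math
import random

def privatize_update(update, clip_norm, sigma):
    """Clip an update vector to clip_norm, then add Gaussian noise.

    Clipping bounds each device's contribution (its sensitivity); larger sigma
    gives stronger privacy but noisier aggregates -- the privacy/accuracy
    trade-off noted above.
    """
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [x * scale for x in update]
    return [x + random.gauss(0.0, sigma * clip_norm) for x in clipped]
```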
* Other issues
  * Model efficiency
    * Resources on end-user devices are limited (e.g., energy, storage, computational power)
    * Trade-offs between # of layers, # of neurons per layer, and accuracy
  * Communication efficiency
  * Limited data at the edge
    * Local data is sparse
    * Incorporating domain and physical knowledge
  * Security & privacy
    * Robustness to malicious end-user devices & adversarial training examples
    * Other approaches to end-user privacy

### Challenges in ML and the Way Forward

* Ariela Zeira, Intel Labs
* Challenges in DL
  * Compute efficiency
  * Memory overhead
  * Data efficiency
  * Online learning
  * Robustness
  * Knowledge representation
* Hyperdimensional computing
  * A new paradigm offering energy-efficient, noise-robust, and fast alternatives to standard ML
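A toy sketch of the hyperdimensional idea: symbols become high-dimensional random bipolar vectors, sets are formed by majority-vote bundling, and lookup is a noise-tolerant similarity test. The dimension and operations here are illustrative, not from the talk:

```python
import random

DIM = 10000  # high dimension makes random vectors quasi-orthogonal

def hv(rng):
    """Random bipolar hypervector representing one symbol."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]

def bundle(vectors):
    """Superpose by elementwise majority vote; the result stays similar to
    each input, so membership survives noise."""
    return [1 if sum(col) >= 0 else -1 for col in zip(*vectors)]

def similarity(a, b):
    """Normalized dot product in [-1, 1]; near 0 for unrelated vectors."""
    return sum(x * y for x, y in zip(a, b)) / DIM
```

All operations are elementwise over fixed-width vectors, which is why the paradigm maps well onto energy-efficient hardware.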

### Adventures in Learning-Based Rate Control

* Brighten Godfrey, UIUC
* TCP protocols: point solutions designed for specific environments, and far from optimal
* Why does the traditional congestion-control architecture struggle?
  * TCP Reno, CUBIC, FAST, Scalable, H-TCP
  * "Hardwired" control actions
    * Underlying conditions --> ideal control action
    * ![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MVORxAomcgtzVVUqmws%2Fuploads%2FZZRksvMnjGto34PL94Pu%2Fimage.png?alt=media\&token=3b2dc274-0f57-4d6a-92c5-9cced6c38671)
    * Not enough information about what's happening in the network
* What is the right rate to send?
  * The network is a black box, but we can send at some rate and see what happens
    * Collect observations (throughput, loss rate, latency) and summarize them in a utility function
* A change in perspective
  * Traditional perspective: simple network model + well-crafted rules --> predictable results
  * "Black-box" perspective: the world is complex; quantifying the goal and observing the effect of actions yields good decisions
  * A fit for learning!
    * Diverse, opaque environments
    * Only a trickle of information
    * Must infer good actions at millisecond timescales
* Software components
  * ![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MVORxAomcgtzVVUqmws%2Fuploads%2F2kJafGj0DRJ4cNKq1GgJ%2Fimage.png?alt=media\&token=146c5011-e4b8-49c2-86cd-03057c3a40cf)
  * Paper: PCC
  * Control algorithm: heuristic hill climbing
    * Measurement noise? Handled via randomized controlled trials
  * Where is the congestion control?
    * The equilibrium depends on the utility function
    * Selfish utility-maximizing decisions --> a non-cooperative game
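The hill-climbing idea can be sketched as a single decision step; `utility_of` is a hypothetical stand-in for sending at a candidate rate and summarizing the resulting throughput, loss, and latency into one utility number:

```python
def pcc_step(rate, utility_of, eps=0.05):
    """One PCC-style decision: probe rate * (1 +/- eps) in paired trials and
    move toward whichever direction yielded higher measured utility.

    In the real protocol the paired trials are randomized in time to control
    for network noise; here utility_of is treated as a black-box measurement.
    """
    up, down = rate * (1 + eps), rate * (1 - eps)
    return up if utility_of(up) > utility_of(down) else down
```

Repeating the step climbs toward a utility maximum without any model of the network, which is the "black-box perspective" above.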
* Promising performance
* Upgrade: PCC Vivace (NSDI 2018)
  * Leverages powerful tools from online learning theory
  * New utility function framework
    * Latency awareness
    * Strictly concave --> equilibrium guarantee
    * Weighted fairness among senders
  * New control algorithm
    * Gradient ascent --> better convergence speed / stability
    * Deals with measurement noise
  * Performance: great improvement in latency & responsiveness, but still suboptimal in highly dynamic networks (e.g., wireless)
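The latency-aware utility has roughly this shape: a concave power of the sending rate minus penalties on latency inflation and loss. The exponent and coefficients below are illustrative defaults, not the paper's tuning:

```python
def vivace_utility(rate, rtt_gradient, loss_rate, t=0.9, b=900.0, c=11.35):
    """Vivace-style utility: concave throughput reward minus penalties.

    rate: sending rate; rtt_gradient: d(RTT)/dt observed while sending at
    that rate (latency inflation); loss_rate: fraction of packets lost.
    Strict concavity in rate (t < 1) is what yields the equilibrium and
    fairness guarantees noted above.
    """
    return rate ** t - b * rate * rtt_gradient - c * rate * loss_rate
```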
* Deep RL for congestion control: Aurora (ICML 2019)

![Aurora agent architecture ](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MVORxAomcgtzVVUqmws%2Fuploads%2FqP62I9AuYEUlt9PgRKNn%2Fimage.png?alt=media\&token=533403b0-11d7-43da-bd15-ee085b5d456b)

* Scale-free input values to aid robustness
* History length: which lengths work well?
* Training
  * Simulated environment
    * An order of magnitude faster than emulation
    * Each episode draws link parameters from a range
  * Setting an appropriate discount factor
    * Maximize expected cumulative discounted return
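The objective, expected cumulative discounted return, reduces per episode to the computation below; the discount factor is the knob highlighted above, since rate decisions have delayed consequences:

```python
def discounted_return(rewards, gamma=0.99):
    """Cumulative discounted return G = sum_k gamma^k * r_k, computed backward.

    A gamma near 1 makes the agent value delayed effects of a rate choice
    (e.g., a queue built now inflating latency later); gamma = 0 is fully
    myopic. The 0.99 default is an illustrative choice.
    """
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```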
* Future
  * Multi-agent scenarios: training, competition
  * Online training: the challenge is to improve outcomes with limited additional training data
* New uses
  * Scavenger transport
    * SIGCOMM 2020: Proteus: Scavenger Transport and Beyond
    * Different applications: software updates, CDN warmup, cloud storage replication, online video, real-time streaming, search
      * Elastic vs. inelastic timing requirements
    * Scavenger design goals
      * Yielding: minimally impact primary flows
      * Performance: high utilization and low latency when only scavengers are present
      * Flexibility: dynamically switch modes, avoid a separate implementation
    * Utility functions
      * Primary, scavenger, hybrid
      * RTT deviation as a competition indicator
        * Definition: standard deviation of observed RTT samples
        * Intuition: an earlier signal of buffer-occupancy dynamics during flow competition
        * Proteus: yields more effectively
      * Cross-layer design
        * Dynamic threshold based on application requirements (buffer occupancy)
        * Hybrid mode: flows get the bandwidth when they need it
        * Improves QoE (rebuffering ratio)
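The competition indicator is simply the standard deviation of a window of RTT samples; a minimal sketch:

```python
import math

def rtt_deviation(rtt_samples):
    """Population standard deviation of observed RTT samples.

    Proteus's intuition: when a competing flow arrives, buffer occupancy
    starts fluctuating, so RTT variance rises before mean RTT or loss
    reacts -- an earlier yield signal for a scavenger flow.
    """
    n = len(rtt_samples)
    mean = sum(rtt_samples) / n
    return math.sqrt(sum((x - mean) ** 2 for x in rtt_samples) / n)
```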
* Rate control robustness (HotNets 2019)
  * Keeping up with rapid change
    * Recent acceleration of innovation in rate control
    * ![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MVORxAomcgtzVVUqmws%2Fuploads%2FaFvhUcKUcmPg6tfV79YD%2Fimage.png?alt=media\&token=6abbc7d6-5a31-44e3-9334-0db5eef9bae3)
    * Approach: ML as an adversary
      * ![](https://2097630930-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MVORxAomcgtzVVUqmws%2Fuploads%2FlgoLBOFNp71uK84OpDYb%2Fimage.png?alt=media\&token=879875f9-8fdd-4b10-bead-155ec74356db)
      * The adversary is rewarded when it finds environment parameters that make the algorithm under test perform poorly (i.e., it hunts for the hard cases)
        * This must be done carefully: reward suboptimal performance relative to the optimal, and add smoothing so the results are more interpretable
      * Reward = -1 \* protocol score + optimal score - smoothing penalty
      * Implemented for ABR video and congestion control
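The reward line above can be sketched directly; the L1 form of the smoothing penalty over consecutive environment parameters is an illustrative assumption:

```python
def adversary_reward(protocol_score, optimal_score, env_params,
                     prev_env_params, smooth_coeff=1.0):
    """Reward for the ML adversary searching for hard environments.

    High when the protocol under test falls short of the optimal score,
    minus a smoothing penalty (assumed here to be an L1 distance between
    consecutive environment parameter vectors) that discourages wild
    parameter jumps and keeps the found hard cases interpretable.
    """
    gap = optimal_score - protocol_score  # = -1 * protocol + optimal
    smoothing = smooth_coeff * sum(
        abs(a - b) for a, b in zip(env_params, prev_env_params))
    return gap - smoothing
```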
* Lessons learned
  * What worked
    * Modular architecture
      * New control algorithms
      * New utility functions open new issues
    * Learning-based control can improve performance over traditional protocols
      * Industry implementations exist
  * Open challenges & opportunities
    * Performance: fast decisions vs. careful decisions
      * Even one RTT can be a long time, especially in fluctuating wireless environments
    * Understanding protocol robustness
    * Opportunities in system design
      * Complexity in systems, environments, and application needs creates opportunity for inference
      * Restructuring systems with a learning mindset
