# Workshop

### Learning at the Wireless Edge

* Vince Poor (Princeton University)
* Two aspects
  * Using ML to optimize communication networks
  * Learning on mobile devices (the focus of today's talk)
* Today's talk: focus on federated learning
  * Motivation
  * Federated learning over wireless channels (scheduling)
  * Privacy protection in federated learning (differential privacy)
  * Some research issues
* ML
  * Tremendous progress in recent years (more data, increase in computational power)
  * Standard ML: implement in centralized manner (data center, cloud), full access to the data
  * SOTA models:
    * Standard software tools, specialized hardware
* Wireless edge
  * Centralized ML is not suitable for many emerging applications
    * Self-driving cars, first responder networks, healthcare networks
  * What makes these applications different:
    * Data is born at the edge (phone, IoT devices)
    * Limited-capacity uplinks
    * Low latency & high reliability
    * Data privacy / security
    * Scalability & locality
  * This motivates moving learning closer to the network edge

![](/files/4qD1V0DAEdigMa01CGms)

* Federated learning over wireless channels (scheduling)
  * ![](/files/q3BkGq3wRBK3dfHD3Lyg)
  * Wireless: communication to the access point (AP) has to go through wireless channels
    * Shared, resource-constrained
      * Only a limited number of devices can be selected in each update round
      * Transmissions are not reliable due to interference
    * Questions
      * How should we schedule devices to update the trained weights?
      * How does the interference affect the training?
  * Scheduling mechanisms
    * Random scheduling: the aggregator selects N out of K users at random
    * Round robin: divide the users into groups and serve the groups in turn
    * Proportional fair: select the users with the strongest SNRs
  * ![](/files/AKE3diBvDdJhUJYo9q0w)
  * ![](/files/HkehKwjhl46vcZnhE34w)
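The three scheduling mechanisms listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and normalizing each user's instantaneous SNR by its average SNR in the proportional-fair rule follows the usual proportional-fair convention rather than anything stated in the notes.

```python
import random
from typing import Dict, List


def random_scheduling(users: List[int], n: int, rng: random.Random) -> List[int]:
    """Pick n of the K users uniformly at random (without replacement)."""
    return rng.sample(users, n)


def round_robin_scheduling(users: List[int], n: int, round_idx: int) -> List[int]:
    """Split the users into consecutive groups of size n; serve one group per round."""
    groups = [users[i:i + n] for i in range(0, len(users), n)]
    return groups[round_idx % len(groups)]


def proportional_fair_scheduling(
    snr: Dict[int, float], avg_snr: Dict[int, float], n: int
) -> List[int]:
    """Pick the n users whose current SNR is strongest relative to their average.

    Assumption: the ratio snr / avg_snr is used as the ranking key.
    """
    ranked = sorted(snr, key=lambda u: snr[u] / avg_snr[u], reverse=True)
    return ranked[:n]
```

With six users and N = 2, round robin serves users `[0, 1]`, then `[2, 3]`, then `[4, 5]`, and wraps around in the fourth round.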
* Design metric: age of information (AoI)
  * Age-based scheduling scheme for federated learning in mobile edge networks
  * Optimization algorithm in each iteration round
  * Wireless round robin
* Privacy in federated learning
  * "Privacy preserving": data remains on end-user devices
  * But end-user data can be inferred from the parameter (or gradient) updates
  * Approach: use differential privacy to protect end-user data
    * Differential privacy: two datasets that are identical except that one contains an individual's private information and the other does not cannot be distinguished by a statistical query (with high probability)
  * Trade-off between privacy and accuracy
* Other issues
  * Model efficiency
    * Resources on end-user devices are limited (e.g., energy, storage, computational power)
    * Trade-offs between # of layers, # of neurons per layer, accuracy
  * Communication efficiency
  * Limited data at the edge
    * Local data is sparse
    * Incorporating domain and physical knowledge
  * Security & Privacy
    * Robustness to malicious end-user devices & adversarial training examples
    * Other approaches to end-user privacy
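To make the privacy and accuracy trade-off from the differential privacy notes above concrete, here is a minimal sketch of one common way to privatize a model update before it leaves the device: clip its L2 norm, then add Gaussian noise. The notes do not say which mechanism the talk used, so the mechanism, the function names, and the parameters `clip_norm` and `noise_multiplier` are all assumptions.

```python
import math
import random
from typing import List


def clip_update(update: List[float], clip_norm: float) -> List[float]:
    """Scale the update down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def privatize_update(
    update: List[float],
    clip_norm: float,
    noise_multiplier: float,
    rng: random.Random,
) -> List[float]:
    """Clip the update, then add zero-mean Gaussian noise to every coordinate.

    Assumption: the noise standard deviation is noise_multiplier * clip_norm.
    """
    clipped = clip_update(update, clip_norm)
    sigma = noise_multiplier * clip_norm
    return [x + rng.gauss(0.0, sigma) for x in clipped]
```

A larger `noise_multiplier` hides each user's contribution better but perturbs the aggregated model more, which is the trade-off noted above.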

### Challenges in ML and the way forward

* Ariela Zeira, Intel Labs
* Challenges in DL
  * Compute efficiency
  * Memory overhead
  * Data efficiency
  * Online learning
  * Robustness
  * Knowledge Representation
* Hyper Dimensional Computing
  * New paradigm for energy-efficient, noise-robust and fast alternatives to standard ML

### Adventures in Learning-Based Rate Control

* Brighten Godfrey, UIUC
* TCP protocols: point solutions designed for specific environments, and far from optimal
* Why does traditional CC architecture struggle?
  * TCP Reno, CUBIC, FAST, Scalable, HTCP
  * "Hardwired" control actions
    * Underlying conditions --> ideal control action
    * ![](/files/rUGUzJto5TdRnFyUAyrR)
    * Not enough information about what's happening in the network
* What is the right rate to send?
  * Network is a black box: but we can send at some rate and see what happens
    * Collect observations: throughput, loss rate, latency. We can summarize them in a utility function
* A change in perspective
  * Traditional perspective: simple network model + well-crafted rules --> predictable results
  * "Black-box" perspective: the world is complex. Quantifying the goal and observing the effect of actions yields good decisions
  * A fit for learning!
    * Diverse, opaque environments
    * Only a trickle of information
    * Infer good action at millisecond timescales
* Software components
  * ![](/files/D7HUcBDx6EgeawXIuWta)
  * Paper: PCC
  * Control algorithm: heuristic hill-climbing algorithm
    * Noise in measurement? Use randomized controlled trials
  * Where is the congestion control?
    * Equilibrium depends on utility function
    * Selfish utility-maximizing decision --> non-cooperative game
* Promising performance
* Upgrade: PCC Vivace (NSDI 2018)
  * Leveraging powerful tools from online learning theory
  * New utility function framework
    * Latency-awareness
    * Strictly concave --> equilibrium guarantee
    * Weighted fairness among senders
  * New control algorithm
    * Gradient-ascent --> convergence speed / stability
    * Deals with measurement noise
  * Performance: great improvement in latency & responsiveness, but still suboptimal in extremely dynamic networks (e.g., wireless)
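The black-box loop described above (send at some rate, observe throughput, loss, and latency, score the outcome with a utility function, then move the rate along the utility gradient) can be sketched as follows. Everything here is an illustrative assumption: the toy link model in `observe`, the utility shape and its constants `t`, `b`, `c`, and the step-size and change-bound parameters are not taken from the notes.

```python
from typing import Tuple


def observe(rate: float, capacity: float) -> Tuple[float, float]:
    """Toy black-box link: return (loss_rate, rtt_gradient) for a sending rate.

    Assumption: traffic above capacity is dropped and there is no queueing,
    so the RTT gradient is always zero in this toy model.
    """
    loss = max(0.0, 1.0 - capacity / rate)
    return loss, 0.0


def utility(rate: float, loss: float, rtt_gradient: float,
            t: float = 0.9, b: float = 900.0, c: float = 11.35) -> float:
    """Concave reward for throughput, minus latency and loss penalties."""
    return rate ** t - b * rate * rtt_gradient - c * rate * loss


def gradient_ascent_step(rate: float, capacity: float, eps: float = 0.05,
                         step: float = 5.0, max_change: float = 0.05) -> float:
    """Probe slightly above and below the current rate, then follow the gradient."""
    hi, lo = rate * (1 + eps), rate * (1 - eps)
    u_hi = utility(hi, *observe(hi, capacity))
    u_lo = utility(lo, *observe(lo, capacity))
    grad = (u_hi - u_lo) / (hi - lo)  # finite-difference estimate
    bound = max_change * rate         # limit the change per step for stability
    delta = max(-bound, min(bound, step * grad))
    return rate + delta


def run(start_rate: float, capacity: float, rounds: int) -> float:
    """Repeat the probe-and-update step and return the final sending rate."""
    rate = start_rate
    for _ in range(rounds):
        rate = gradient_ascent_step(rate, capacity)
    return rate
```

Starting from a low rate, `run(10.0, 100.0, 200)` climbs toward the link capacity and then hovers just below it, because any probe above capacity incurs the loss penalty.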
* Deep RL on congestion control (ICML 2019)

![Aurora agent architecture](/files/sDy2lRLvKR5ZKXYM5FaG)

* Scale-free values to aid robustness
* History length: what lengths work well
* Training
  * Simulated environment
    * Order of magnitude faster than emulation
    * Each episode chooses link parameters from a range
  * Setting an appropriate discount factor
    * Maximize expected cumulative discounted return
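The training objective above, the cumulative discounted return, is the sum of per-step rewards with the reward at step k weighted by gamma^k. A minimal sketch of that computation for one episode follows; the function name is hypothetical and the reward values are made up for illustration.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Return r_0 + gamma * r_1 + gamma^2 * r_2 + ... for one episode."""
    total = 0.0
    # Walk backwards so each step is a single multiply-and-add.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


print(discounted_return([1.0, 1.0, 1.0], 0.5))  # 1 + 0.5 + 0.25 = 1.75
```

A discount factor close to 0 makes the agent weigh mostly the immediate reward, while a value close to 1 makes it weigh long-term outcomes, which is why the choice matters for rate control.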
* Future
  * Multi-agent scenarios: training, competition
  * Online training: challenge is to improve outcomes with limited additional training data
* New uses
  * Scavenger transport
    * SIGCOMM 2020: Proteus: Scavenger Transport and Beyond
    * Different applications: software updates, CDN warmup, cloud storage replication, online video, real-time streaming, search
      * Elastic timing, inelastic timing
    * Scavenger design goals
      * Yielding: minimally impact primary flows
      * Performance: high utilization, low latency when only scavengers exist
      * Flexibility: dynamically switch, avoid separate implementation
    * Utility functions
      * Primary, Scavenger, Hybrid
      * RTT deviation as a competition indicator
        * Definition: standard deviation of observed RTT samples
        * Intuition: earlier signal of dynamics of buffer occupancy during flow competition
        * Proteus: yields more effectively
      * Cross-layer design
        * Dynamic threshold based on the application requirement (buffer occupancy)
        * Hybrid mode: lets applications get bandwidth when they need it
        * Improving QoE (rebuffer ratio)
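The RTT deviation signal defined above is simply the standard deviation of recent RTT samples. The sketch below computes it and shows one way a scavenger utility might penalize it. The choice of the population standard deviation, the penalty form `primary_utility - d * rate * rtt_dev`, and the coefficient `d` are assumptions for illustration, not details from the notes.

```python
import statistics
from typing import Sequence


def rtt_deviation(rtt_samples_ms: Sequence[float]) -> float:
    """Standard deviation of the observed RTT samples (population form)."""
    return statistics.pstdev(rtt_samples_ms)


def scavenger_utility(primary_utility: float, rate: float,
                      rtt_dev: float, d: float) -> float:
    """Hypothetical scavenger utility: primary utility minus an RTT-deviation penalty."""
    return primary_utility - d * rate * rtt_dev


steady = rtt_deviation([50.0, 50.0, 50.0, 50.0])     # no competition: deviation is 0
contended = rtt_deviation([40.0, 60.0, 40.0, 60.0])  # fluctuating RTTs: deviation is 10
```

Because the penalty grows with both the sending rate and the deviation, a flow using such a utility backs off when RTTs start to fluctuate, which matches the yielding goal listed above.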
* Rate control robustness (HotNets 2019)
  * Keeping up with rapid change
    * Recent acceleration of innovation in rate control
    * ![](/files/8658lTiixQneztNYQmm1)
    * Approach: ML as an adversary
      * ![](/files/c8IFC4pz7twQhtsPjmFB)
      * The adversary is rewarded when it finds environment parameters that cause the algorithm under test to perform poorly (it looks for the hard cases)
        * Do this carefully: reward the adversary for suboptimal performance of the algorithm, and apply smoothing so that the results are more useful to inspect
      * Reward = -1 \* protocol score + optimal score - smoothing penalty
      * Implement: ABR video, CC
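The adversary's reward from the formula above can be written directly in code. The reward function mirrors the formula term by term. The smoothing penalty shown here (a weighted absolute change of one environment parameter between consecutive steps) is a hypothetical choice, since the notes do not define it.

```python
def smoothing_penalty(prev_param: float, cur_param: float, weight: float) -> float:
    """Hypothetical penalty: discourage abrupt changes in an environment parameter."""
    return weight * abs(cur_param - prev_param)


def adversary_reward(protocol_score: float, optimal_score: float,
                     penalty: float) -> float:
    """Reward = -1 * protocol score + optimal score - smoothing penalty."""
    return -1.0 * protocol_score + optimal_score - penalty


# The adversary earns more when the protocol falls further below the optimum.
penalty = smoothing_penalty(prev_param=10.0, cur_param=14.0, weight=0.5)
reward = adversary_reward(protocol_score=3.0, optimal_score=10.0, penalty=penalty)
```

Subtracting the protocol score from the optimal score means the adversary gains nothing from environments that are hard for every algorithm, only from those where the algorithm under test falls short of what was achievable.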
* Lessons learned
  * What worked
    * Modular architecture
      * New control algorithms
      * New utility functions open new issues
    * Learning-based control can improve performance over traditional protocols
      * Industry implementations
  * Open challenges & opportunities
    * Performance: fast decisions vs. careful decisions
      * Even one RTT can be a long time, especially in fluctuating wireless environments
    * Understanding protocol robustness
    * Existing opportunities in system design
      * Complexity in systems, environments, and application needs leads to opportunity for inference
      * Restructure systems for a learning mindset

