Data center TCP (DCTCP)

https://dl.acm.org/doi/10.1145/1851182.1851192

Problems being solved

Motivation

  • Cloud data centers, especially those hosting soft real-time applications, generate a mix of workloads:

    • Small predictable latency

    • Large sustained throughput

  • In this environment, state-of-the-art TCP falls short

Problems solved / improved

  • Higher throughput using less buffer space

  • High burst tolerance and low latency for short flows

  • Handles 10x the current background traffic, without impacting foreground traffic

Metrics of success

Partition/Aggregate workflow pattern:

  • Low latency for short flows

  • High burst tolerance

Continuous updates to the applications' internal data structures:

  • High utilization / throughput for long flows

Key innovations

  • Measure and analyze production traffic from data centers whose networks are built from commodity switches

    • Impairments that hurt performance are identified and linked to properties of the traffic and the switches

  • DCTCP, a protocol that addresses these impairments to meet the needs of these applications

    • Goal: keep switch buffer occupancies persistently low while maintaining high throughput for long flows

    • Use Explicit Congestion Notification (ECN)

    • Combine ECN with a novel control scheme at the sources

      • Extract multi-bit feedback on the extent of congestion from the single-bit stream of ECN marks

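A minimal sketch of that source-side control scheme, in Python (the update rules follow the paper's equations; the class scaffolding and parameter defaults are illustrative, not the authors' code): the sender tracks alpha, an estimate of the fraction of marked packets, and cuts its window in proportion to alpha instead of halving it.

```python
# Sketch of the DCTCP sender reaction (update rules per the paper; the
# scaffolding here is illustrative, not the authors' implementation).

class DCTCPSender:
    def __init__(self, cwnd=10.0, g=1 / 16):
        self.cwnd = cwnd    # congestion window, in packets
        self.g = g          # EWMA gain for the alpha estimate
        self.alpha = 0.0    # estimated fraction of marked packets

    def on_window_acked(self, acked: int, marked: int) -> None:
        """Called once per window of data: `acked` packets were ACKed,
        `marked` of them carried the ECN-Echo flag."""
        frac = marked / acked if acked else 0.0
        # alpha <- (1 - g) * alpha + g * F  -- the multi-bit congestion
        # estimate extracted from single-bit ECN marks
        self.alpha = (1 - self.g) * self.alpha + self.g * frac
        if marked:
            # cwnd <- cwnd * (1 - alpha / 2): a gentle cut when marks are
            # rare, a TCP-like halving when every packet is marked
            self.cwnd *= 1 - self.alpha / 2
        else:
            self.cwnd += 1  # standard additive increase
```

When every packet in a window is marked, alpha approaches 1 and the cut converges to TCP's halving; under light marking the window barely shrinks, which is what keeps long-flow throughput high while queues stay short.
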
Communications in data centers

Partition / Aggregate (Query)

  1. Motivates why latency is a critical metric

    1. Delay sensitive

  2. all-up SLA

    1. Lagging instances of partition/aggregate can thus add up to threaten the all-up SLAs for queries

    2. When a node misses its deadline, the computation continues without that response, lowering the quality of the result.

    3. Many applications find it difficult to meet these deadlines using state-of-the-art TCP, so developers often resort to complex, ad-hoc solutions

  3. Missed deadline: lower quality result

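A hypothetical sketch of this pattern (the function names, worker count, and 10 ms deadline are made up for illustration): the aggregator fans the query out, waits only until the deadline, and returns whatever has arrived, so a lagging worker lowers result quality instead of stalling the query.

```python
# Hypothetical illustration of partition/aggregate under an all-up deadline:
# late worker responses are dropped, lowering result quality rather than
# blocking the query.
from concurrent.futures import ThreadPoolExecutor, wait

def query_worker(worker_id: int, query: str) -> str:
    # In practice this is a network call to a worker node.
    return f"partial-result-{worker_id}"

def aggregate(query: str, n_workers: int = 40, deadline_s: float = 0.010):
    pool = ThreadPoolExecutor(max_workers=n_workers)
    futures = [pool.submit(query_worker, i, query) for i in range(n_workers)]
    done, _ = wait(futures, timeout=deadline_s)
    pool.shutdown(wait=False)          # do not wait for stragglers
    return [f.result() for f in done]  # only answers that made the deadline
```
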
Short messages (50KB-1MB) (Coordination, Control State)

  • Delay sensitive

Large flows (1MB-50MB) (Data update)

  • Throughput sensitive

Performance impairments

  • Shallow packet buffers cause three performance impairments

    • Incast

      • If many flows converge on the same interface of a switch over a short period of time, the packets may exhaust either the switch memory or the maximum permitted buffer for that interface, resulting in packet losses.

      • This can occur even if the flow sizes are small

      • The Partition/Aggregate pattern causes incast by design: the request for data synchronizes the workers' responses, which converge on the queue of the switch port connected to the aggregator

  • Queue buildup

    • Long flows build up queues at shared ports, so short flows sharing a port suffer extra queuing delay even when no packets are lost

  • Buffer pressure

    • Long flows on one port consume the switch's shared buffer pool, leaving less headroom to absorb bursts arriving at other ports

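To make the incast arithmetic concrete, a hypothetical back-of-the-envelope check (all numbers below are made up, not measurements from the paper): synchronized responses overflow a port once they exceed the buffer plus what the link can drain within one RTT.

```python
# Hypothetical back-of-the-envelope incast check: do N synchronized
# responses overflow a shallow port buffer within one RTT?
def incast_overflow_bytes(n_workers: int, response_bytes: int,
                          buffer_bytes: int, link_bps: float,
                          rtt_s: float) -> float:
    arriving = n_workers * response_bytes  # bytes converging on the port
    drained = (link_bps / 8) * rtt_s       # bytes the link drains in one RTT
    return max(0.0, arriving - drained - buffer_bytes)

# 40 workers x 2 KB responses into a 1 Gbps port with ~30 KB of buffer:
# even tiny flows overflow a shallow buffer when they are synchronized.
print(incast_overflow_bytes(40, 2_000, 30_000, 1e9, 100e-6))  # ~37500.0
```
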
Requirements

Algorithm

  • The TCP/ECN control loop

    • Conventional TCP reacts to ECN at most once per window of data: a single mark triggers the same halving of cwnd as a loss, however mild the congestion (the marking rule that feeds this loop is sketched after this list)

  • DC vs. WAN

    • Round-trip times (RTTs) are far smaller than in the WAN

    • Applications simultaneously need extremely high bandwidths and very low latencies

    • Little statistical multiplexing

    • Network: largely homogeneous and under a single administrative control

      • Backward compatibility, incremental deployment and fairness to legacy protocols are not major concerns

  • Resulting rule of thumb: low variance in sending rates means small buffers suffice

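The marking rule referenced above: per the paper, the switch CE-marks an arriving packet whenever the instantaneous queue occupancy exceeds a threshold K, rather than dropping it. The toy queue model below is illustrative, not the authors' implementation.

```python
# Sketch of DCTCP's switch-side marking (the threshold rule is from the
# paper; this toy queue model is not). Packets are CE-marked, not dropped,
# whenever the instantaneous queue exceeds K.
from collections import deque

K = 20  # marking threshold in packets (tuned per link speed in the paper)

def enqueue(queue: deque, packet: dict, capacity: int) -> bool:
    if len(queue) >= capacity:
        return False         # buffer exhausted: the packet is dropped
    if len(queue) >= K:
        packet["ce"] = True  # instantaneous queue over K: mark, don't drop
    queue.append(packet)
    return True
```

Because marking starts well before the buffer fills, senders learn of congestion early and keep their rates steady, which is exactly why small buffers suffice when sending-rate variance is low.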