Enabling Hyperscale Web Services
http://today.wisc.edu/events/view/157998
Web services will not grow
Radical shift in hyperscale computing
Urgent need: redesign computer systems to enable futuristic web services
Now: target individual stack layers
Software stacks
Web service application --> fragmented into small units
OS & Software Stacks
Hardware
Custom hardware (e.g., GPU)
Smart NIC
Disconnected Research
Bridge the software and hardware work to enable future web services
How to redesign SW to be aware of the HW constraints?
How to re-architect HW for new SW paradigms post-Moore?
Characterization: understand service behavior, overheads, emerging trends
Novel solutions: self-navigate design space based on characterization insights
Real systems
OSDI '18
User request
Front-end microservice
Hashing microservice
Ads microservice
Caching microservice
Ranking microservice
End-to-end response latency
Interacting in an extremely complicated way
Microsecond-scale system overheads significantly affect microservices
Microsecond overheads: accessing OS/NW
Urgent need to understand SW threading interactions with OS/NW
Software Threading Dimensions -- taxonomy of threading models
Block vs. poll
In-line vs. dispatch
Synchronous vs. asynchronous
Latency tradeoffs across threading models
Poll better than block
In-line poll faces contention; dispatch poll with one poller is best
Support increase in load
Dispatch block is best at high load as it does not waste CPU cycles
No single threading model works best at all loads!
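The block vs. poll dimension can be illustrated with a minimal Python sketch. This is not code from the talk: the helper names `recv_block` and `recv_poll` are hypothetical, and an in-process queue stands in for the network socket a real microservice would wait on.

```python
import queue
import threading

def recv_block(q: queue.Queue):
    """Blocking receive: the thread sleeps until an item arrives.
    No CPU is burned while waiting, but the wake-up adds latency."""
    return q.get()

def recv_poll(q: queue.Queue):
    """Polling receive: the thread spins on a non-blocking check.
    It avoids the wake-up cost but occupies a core while idle."""
    while True:
        try:
            return q.get_nowait()
        except queue.Empty:
            continue  # busy-wait and check again

if __name__ == "__main__":
    for recv in (recv_block, recv_poll):
        q = queue.Queue()
        # Simulate a request arriving 10 ms later on another thread.
        threading.Timer(0.01, q.put, args=("request",)).start()
        print(recv.__name__, "->", recv(q))
```

Both functions return the same item; they differ only in what the waiting thread does in the meantime, which is where the latency vs. CPU trade-off in the notes comes from.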
Automatic Load Adaptation
Exploit trade-offs among threading models at run-time
μTune [OSDI '18]
Design challenges: synchronization, when to switch, interact with NW, thread hops, scale thread pools?
Abstracts threading design from service code to seamlessly manage threading
Piecewise linear model
Input?
Load, and other features
Heuristics: queuing patterns and other
Ecosystem changes?
Will need to retrain
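A minimal sketch of how a piecewise linear model could drive run-time switching, assuming an offline profile of latency vs. load per threading model. The profile values, the `PROFILE` table, and the `interpolate` and `pick_model` helpers are all hypothetical placeholders for illustration, not measurements or the system's actual implementation.

```python
from bisect import bisect_right

# Hypothetical offline profile: (load in queries/sec, latency in microseconds).
# The numbers are made-up placeholders, not measured results.
PROFILE = {
    "inline_poll":    [(100, 70.0), (1000, 110.0), (10000, 900.0)],
    "dispatch_poll":  [(100, 50.0), (1000, 90.0), (10000, 600.0)],
    "dispatch_block": [(100, 120.0), (1000, 130.0), (10000, 300.0)],
}

def interpolate(points, load):
    """Piecewise linear interpolation, clamped to the profiled load range."""
    xs = [x for x, _ in points]
    if load <= xs[0]:
        return points[0][1]
    if load >= xs[-1]:
        return points[-1][1]
    i = bisect_right(xs, load)  # index of the first profiled load above `load`
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (load - x0) / (x1 - x0)

def pick_model(load, profile=PROFILE):
    """Return the threading model with the lowest predicted latency."""
    return min(profile, key=lambda name: interpolate(profile[name], load))

if __name__ == "__main__":
    for load in (100, 1000, 5000, 10000):
        print(load, "->", pick_model(load))
```

If the ecosystem changes (new hardware, OS, or service code), the profile table is what would have to be regenerated, which corresponds to the retraining step noted above.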
Diverse accelerators will break the bank
Customized platforms are expensive
Hardware homogeneity
Avoids testing overhead
Web
Feed1, Feed2
Ads1, Ads2
Cache1, Cache2 (similar to key-value store)
Question
HW overheads?
Which SW overheads are worth building HW for?
Observation (HW)
Great diversity in overheads across microservices
New surprisingly dominant HW overheads
Code footprints (instruction cache misses, TLB misses)
Observation (SW)
Orchestration: I/O processing, compression, encryption
Orchestration overheads are significant & common across microservices
Great diversity in HW overheads across microservices
Microservice orchestration logic is a significant, common SW overhead
Tune coarse HW & OS knobs on commodity HW
SoftSKUs achieve performance efficiency on cheap commodity HW
Serves 2.7B users
Saved cost
Reduced footprint
Accelerating the "encryption" orchestration logic
HW vendors only improve the accelerators, ignoring the end-to-end overhead
SW interaction overhead
? Question
Accelerator (PCIE)
Offload and context-switch overheads
HW acceleration
Design, Test, Deploy
But performance is .. due to perf. bounds from software interactions with hardware
Analytical Model for HW Acceleration
Simplicity
Accelerometer: Analytical Model [ASPLOS '20]
HW accelerator
SW threading design
Metrics
Offload transfer via interface
e.g., synchronous offload
Queuing delay, offload transfer latency...
e.g., asynchronous offload
Accelerator cycles do not critically affect speedup
But context switch penalty
Other
Waits for offload ack?
Offload size?
Offload preparation time?
Many threads? Context switches?
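The questions above can be folded into a back-of-the-envelope speedup estimate. The sketch below is a simplified Amdahl-style model written for these notes under stated assumptions; it is not the published Accelerometer model, and every field name in `Offload` is hypothetical. It assumes a synchronous offload makes the host wait for transfer, queuing, and the accelerator itself, while an asynchronous offload hides accelerator time but pays a context switch per offload.

```python
from dataclasses import dataclass, replace

@dataclass
class Offload:
    total_cycles: float       # host cycles per unit of work, no acceleration
    kernel_fraction: float    # fraction of those cycles in the offloaded kernel
    n_offloads: int           # offloads per unit of work
    prep_cycles: float        # host cycles to prepare one offload
    transfer_cycles: float    # interface (e.g., PCIe) transfer latency per offload
    queue_cycles: float       # queuing delay per offload
    accel_factor: float       # how much faster the accelerator runs the kernel
    ctx_switch_cycles: float  # context switch penalty per offload (async only)

def sync_speedup(o: Offload) -> float:
    """Host blocks on each offload, so every overhead is on the critical path."""
    non_kernel = (1 - o.kernel_fraction) * o.total_cycles
    accel = o.kernel_fraction * o.total_cycles / o.accel_factor
    overhead = o.n_offloads * (o.prep_cycles + o.transfer_cycles + o.queue_cycles)
    return o.total_cycles / (non_kernel + accel + overhead)

def async_speedup(o: Offload) -> float:
    """Host keeps working during the offload; accelerator cycles are hidden,
    but each offload costs preparation plus a context switch."""
    non_kernel = (1 - o.kernel_fraction) * o.total_cycles
    overhead = o.n_offloads * (o.prep_cycles + o.ctx_switch_cycles)
    return o.total_cycles / (non_kernel + overhead)

if __name__ == "__main__":
    # Placeholder inputs chosen only to exercise the formulas.
    o = Offload(1000, 0.5, 1, 10, 20, 20, 10, 50)
    print("sync :", round(sync_speedup(o), 3))
    print("async:", round(async_speedup(o), 3))
    # A faster accelerator changes the sync estimate but not the async one.
    faster = replace(o, accel_factor=100)
    print("sync with faster accelerator:", round(sync_speedup(faster), 3))
```

Under these assumptions the asynchronous estimate does not depend on `accel_factor` at all, which mirrors the note that accelerator cycles do not critically affect speedup while the context switch penalty does.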
Accelerometer: estimates
TPU, GPU?
Type of accelerator we can build
Opportunities
Stage of the HW pipeline
Type of the accelerator
Richer model to consider paradigm
Rethink I/O interactions in light of emerging HW and SW paradigms
I/O path efficient event notification mitigating microsecond stalls
Rethink SW for new HW emerging technologies: non-volatile memory, accelerators
New app paradigms & domains
Serverless: locality issues
AR/VR, IoT: data mgmt. issues
ML for SW-HW design: design space exploration, resource management
scheduling, mgmt
Intersectionality, equity, fairness: first-order HW-SW design metrics
Systems can discriminate based on demographics or ethnicity
Is the implementation hard to verify?
Slice and dice in a way that still makes sense