Semeru: A memory-disaggregated managed runtime
All resources sit in separate resource pools
CPU servers: small memory
Memory servers: little processing power, mainly handle memory accesses
Usage today?
Storage servers: weak cores
Network: InfiniBand, high bandwidth
Resources are not fully utilized
Disaggregation helps workloads share resources better
Multiplex different users' workloads
Datacenters are moving towards this
But most cloud services are still pre-configured
They don't understand the actual workload
Managed languages:
Programs rely on a language runtime (which sits between the user program and the OS) to manage memory
Java: how are objects allocated in memory? The allocated memory is not contiguous (?); requires digging into the JVM
In C, the programmer manages memory allocation manually
The JVM takes care of tracing objects in memory
Tracing takes up CPU and bandwidth
Remote memory: two Spark applications (graph-like applications)
Cache ratio
50% or 75% goes remote
No swap: best case (baseline), all accesses are local
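The effect of the cache ratio can be illustrated with a small simulation. This is only a sketch under assumptions the notes do not confirm: "cache ratio" is taken to mean the fraction of pages that fit in local memory, the local cache is assumed to use LRU eviction, and `remote_fraction` is a hypothetical helper name.

```python
from collections import OrderedDict

def remote_fraction(accesses, total_pages, cache_ratio):
    """Fraction of page accesses that miss a local LRU cache and go remote.

    Assumption: cache_ratio is the share of total_pages that fits locally.
    """
    capacity = max(1, int(total_pages * cache_ratio))
    cache = OrderedDict()  # page id -> None, ordered least to most recently used
    misses = 0
    for page in accesses:
        if page in cache:
            cache.move_to_end(page)        # hit: mark as most recently used
        else:
            misses += 1                    # miss: page is fetched from remote memory
            cache[page] = None
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used page
    return misses / len(accesses)

# Cyclic scan over 8 pages, repeated 4 times.
trace = list(range(8)) * 4
print(remote_fraction(trace, total_pages=8, cache_ratio=1.0))  # only first-touch misses
print(remote_fraction(trace, total_pages=8, cache_ratio=0.5))  # scan thrashes the cache
```

With everything cached locally, only the first touch of each page goes remote. A scan larger than the cache is a worst case for LRU, which is one reason scan-heavy work such as tracing is expensive over remote memory.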
GC work
Not much compute (suits the memory servers' limited compute power)
Data is loaded locally on the memory server, which is cheaper
Runs concurrently with the program itself (does not compete for resources)
Accessing the dirty page again?
Java
GC: find all objects that will never be used again
E.g. a variable declared within a loop can be cleaned up after the loop
Tracing is not computationally heavy
Cheaper to keep all tracing on local memory in the memory server
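Tracing as described above amounts to a graph traversal from the roots. The following is a minimal sketch, not the JVM's actual implementation: the heap is modelled as a plain dictionary from object name to the names it references, and `trace_live` is a hypothetical helper.

```python
def trace_live(roots, refs):
    """Return the set of objects reachable from the roots (the live objects)."""
    live = set()
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if obj in live:
            continue                           # already visited
        live.add(obj)
        worklist.extend(refs.get(obj, ()))     # follow outgoing references
    return live

# Toy heap: object name -> names of objects it references.
heap = {"a": ["b"], "b": ["c"], "c": [], "tmp": ["c"], "x": ["y"], "y": ["x"]}
live = trace_live(roots=["a"], refs=heap)
garbage = set(heap) - live
print(sorted(live))     # ['a', 'b', 'c']
print(sorted(garbage))  # ['tmp', 'x', 'y']
```

The traversal does little arithmetic but touches every live object, so it is bound by memory access rather than compute. That matches the point that it is cheaper to run where the data already resides.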
Several rounds of GC:
If objects survive these rounds, they are likely to remain for a longer period of time
Java does less cleaning of older objects and spends more time on newly created objects
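The generational idea can be sketched as follows. This is an illustration only: the `Heap` class, the `PROMOTE_AFTER` threshold, and the way liveness is passed in as a set are all assumptions made for the example, not how the JVM represents its generations.

```python
PROMOTE_AFTER = 2  # hypothetical threshold: young GCs an object must survive

class Heap:
    def __init__(self):
        self.young = {}   # object name -> number of young GCs survived
        self.old = set()  # objects promoted to the old generation

    def allocate(self, name):
        self.young[name] = 0

    def young_gc(self, live):
        """Collect only the young generation; `live` is the set of reachable names."""
        survivors = {}
        for name, age in self.young.items():
            if name not in live:
                continue                   # unreachable young object: reclaimed
            if age + 1 >= PROMOTE_AFTER:
                self.old.add(name)         # survived enough rounds: promote
            else:
                survivors[name] = age + 1
        self.young = survivors

    def full_gc(self, live):
        """Rarer collection that also cleans the old generation."""
        self.old &= set(live)
        self.young_gc(live)

heap = Heap()
heap.allocate("cache")
heap.allocate("tmp")
heap.young_gc(live={"cache"})   # "tmp" is reclaimed, "cache" survives once
heap.allocate("tmp2")
heap.young_gc(live={"cache"})   # "cache" is promoted, "tmp2" is reclaimed
heap.young_gc(live=set())       # "cache" is dead but stays until a full GC
print(heap.old)                 # {'cache'}
heap.full_gc(live=set())
print(heap.old)                 # set()
```

Frequent young collections stay cheap because they skip the old generation, at the cost of dead old objects lingering until a full collection.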
Memory servers: each recycles its own regions
Data layout: reduce memory fragmentation
Control path: goes to the different memory servers
Data path
The two paths do not interfere with each other
Paging:
Managed by the OS
Does the runtime access the page?
Bypass the OS?
The JVM has the mapping between the two
Map the virtual pages in the runtime to physical allocations on the memory servers
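One simple way such a mapping could work is a fixed partition of the heap's virtual address range across the memory servers. The notes do not say how the mapping is actually built, so everything here is an assumption: the `locate` helper, the 4 KiB page size, and the contiguous per-server partition.

```python
PAGE_SIZE = 4096  # bytes; a common page size, assumed here

def locate(vaddr, heap_start, pages_per_server, num_servers):
    """Map a virtual heap address to (server index, byte offset on that server).

    Assumption: each server backs one contiguous run of pages_per_server pages.
    """
    if vaddr < heap_start:
        raise ValueError("address below heap start")
    rel = vaddr - heap_start
    page = rel // PAGE_SIZE
    server = page // pages_per_server
    if server >= num_servers:
        raise ValueError("address beyond the remote heap")
    offset = (page % pages_per_server) * PAGE_SIZE + rel % PAGE_SIZE
    return server, offset

# Byte 12 of page 5, with 4 pages per server: second page on server 1.
print(locate(0x10000000 + 5 * PAGE_SIZE + 12, 0x10000000,
             pages_per_server=4, num_servers=2))  # (1, 4108)
```

A static partition like this needs no lookup table, so the runtime can compute which server holds an address without consulting the OS.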
LegoOS:
Targets this kind of disaggregated architecture
Does not support the JVM
An existing OS cannot distinguish which memory to handle locally and which to handle remotely. Implementing this distinction gives more flexibility.
The language runtime knows the program much better than the OS does
JVM on a disaggregated architecture
GC is implemented inside the JVM rather than directly in the OS