Semeru: A memory-disaggregated managed runtime
Motivation

Resources sit in separate pools
CPU servers: small memory
Memory servers: weak compute, mainly serve memory accesses
Usage today?
Storage servers: weak cores
Network: InfiniBand, high bandwidth

Resources are not fully utilized
Disaggregation lets workloads share resources better
Multiplex workloads from different users
Moving towards this model, but most cloud services are still pre-configured

Pre-configured offerings don't account for the actual workload
Managed languages:
The program relies on the language runtime (which sits between the user program and the OS) to manage memory
Java: objects are allocated by the JVM; the allocated memory is not contiguous (?); understanding this requires digging into the JVM

In C, the programmer manages memory allocation by hand
In Java, the JVM takes care of tracing memory

Tracing takes up CPU and bandwidth

Remote memory: two Spark applications (graph-like workloads)
Cache ratio
50% / 75% goes remote
No swap: best case (baseline), all accesses are local
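To get a feel for how the cache ratio drives remote traffic, here is a minimal sketch that replays a page-access trace against a local LRU cache and counts the fetches that would go remote. The function name, the synthetic trace, and the LRU policy are assumptions for illustration, not the setup from the talk.

```python
from collections import OrderedDict


def count_remote_fetches(trace, total_pages, cache_ratio):
    """Replay a page-access trace against a local LRU cache.

    cache_ratio is the fraction of pages that fit in local memory.
    Returns the number of accesses that miss locally (first touches included).
    """
    capacity = max(1, int(total_pages * cache_ratio))
    cache = OrderedDict()  # page id -> True, ordered from least to most recent
    misses = 0
    for page in trace:
        if page in cache:
            cache.move_to_end(page)  # refresh recency on a local hit
        else:
            misses += 1  # page has to be fetched
            cache[page] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used page
    return misses


# Hypothetical trace: a loop that repeatedly scans 8 pages.
trace = [i % 8 for i in range(32)]
for ratio in (1.0, 0.5, 0.25):
    print(ratio, count_remote_fetches(trace, total_pages=8, cache_ratio=ratio))
```

A cyclic scan is the worst case for LRU, so the gap between the no-swap baseline and the 50% setting is exaggerated here; real workloads with locality degrade more gradually.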

GC work
Needs little compute (suits the weak compute on memory servers)
Data is already local to the memory server, so access is cheaper
Runs concurrently with the program itself (does not compete for its resources)




Accessing the dirty page again?


Java
GC: find all objects that will never be used again (in practice, objects that are no longer reachable)
E.g., a variable declared inside a loop can be reclaimed after the loop ends
Tracing is not computationally heavy
Cheaper to keep tracing on the memory server, working on its local memory
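The tracing step itself is just a graph traversal from the roots, which is why it needs so little compute. A minimal sketch, assuming the heap is given as a dictionary from object id to the ids it references (the function name and the toy heap are made up for illustration):

```python
def trace_reachable(roots, references):
    """Mark phase: return the set of object ids reachable from the roots."""
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue  # already visited, also guards against cycles
        marked.add(obj)
        stack.extend(references.get(obj, ()))
    return marked


# Toy heap: "d" points into live data but nothing points to it,
# and "e" only references itself.
heap = {"a": ["b"], "b": ["c"], "c": [], "d": ["a"], "e": ["e"]}
live = trace_reachable(["a"], heap)
garbage = set(heap) - live
print(sorted(live), sorted(garbage))
```

The work is dominated by pointer chasing rather than arithmetic, so where the data lives matters more than how fast the core is.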



Several rounds of GC:
Objects that survive several rounds are likely to stay alive for a long time
The JVM collects older objects less often and spends more time on newly created objects
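A minimal sketch of that generational idea: a minor collection only looks at young objects, and anything that survives enough rounds is promoted to the old generation and left alone. The class, the promotion threshold, and the way liveness is passed in are all assumptions for illustration, not the JVM's actual implementation.

```python
PROMOTION_AGE = 3  # hypothetical: minor GCs an object must survive before promotion


class GenerationalHeap:
    def __init__(self):
        self.young = {}  # object id -> number of minor GCs survived
        self.old = set()  # promoted objects, not examined by minor GCs

    def allocate(self, obj):
        self.young[obj] = 0

    def minor_gc(self, live):
        """Collect the young generation only; `live` is the set of reachable ids."""
        survivors = {}
        for obj, age in self.young.items():
            if obj not in live:
                continue  # unreachable young object is reclaimed
            if age + 1 >= PROMOTION_AGE:
                self.old.add(obj)  # survived long enough, promote
            else:
                survivors[obj] = age + 1
        self.young = survivors


heap = GenerationalHeap()
heap.allocate("cache")
heap.allocate("tmp")
for _ in range(PROMOTION_AGE):
    heap.minor_gc(live={"cache"})
print(heap.young, heap.old)
```

After the loop, "tmp" has been reclaimed in the first round and "cache" sits in the old generation, where later minor collections no longer touch it.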


Memory servers: each reclaims its own regions
Data layout: reduce memory fragmentation
#3: how to efficiently swap data

Control path: goes to the different memory servers
Data path
The two paths do not interfere with each other
Paging:
Managed by the OS
Runtime accesses the page?
Bypass the OS?
JVM has the mapping between the two
Maps virtual pages in the runtime to physical allocations on the memory servers
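One way to picture that mapping is a table that splits the heap's virtual address range into equal slices, one per memory server. This is only a sketch under that assumption; the class name, the slice size, and the server labels are hypothetical and not taken from the actual system.

```python
class HeapMap:
    """Maps a heap virtual address to (memory server, offset on that server)."""

    def __init__(self, base, slice_bytes, servers):
        self.base = base  # first virtual address of the heap
        self.slice_bytes = slice_bytes  # bytes of heap hosted per server
        self.servers = list(servers)  # one contiguous slice per server

    def locate(self, addr):
        offset = addr - self.base
        index = offset // self.slice_bytes
        if offset < 0 or index >= len(self.servers):
            raise ValueError(f"address {addr:#x} is outside the heap")
        return self.servers[index], offset % self.slice_bytes


heap_map = HeapMap(base=0x1000, slice_bytes=0x100, servers=["mem0", "mem1"])
print(heap_map.locate(0x1000))
print(heap_map.locate(0x1180))
```

Because the runtime owns this table, it can decide per address whether an access stays local or goes to a particular server, which is the flexibility an unmodified OS pager does not have.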
LegoOS:
An OS built for this kind of disaggregated architecture
Does not support the JVM
With an existing OS, the OS cannot distinguish which accesses to handle locally and which to handle remotely. Implementing this in the runtime gives more flexibility.
The language runtime knows the program much better than the OS does




JVM on disaggregated memory
GC is implemented inside the JVM rather than directly in the OS