eBPF: Rethinking the Linux Kernel

https://www.youtube.com/watch?v=f-oTe-dmfyI

Programmability essentials
- Safety --> sandboxing
- Continuous Delivery --> deploy anytime with seamless upgrades
- Performance --> native execution (JIT compiler)
Kernel architecture
- Drivers abstract the hardware: the kernel enables the H/W without exposing it directly
  - Block device, network device
- System calls: what applications invoke to communicate with the kernel
- Middle layer: the kernel's business logic
  - Virtual file system
  - TCP / IP
- Last piece: an operator manages the system through configuration APIs
  - Interact with the kernel through these APIs

Option 1: native support
- Change kernel source code
- Expose configuration API
- Wait 5 years for your users to upgrade
- Cons: nobody wants to wait
Option 2: kernel module
- Write kernel module
- Every kernel release will break it
- Cons
  - You will likely need to ship a different module for each kernel version
  - Might crash your kernel

eBPF
- Hook a system call: when it is invoked, an eBPF program runs on its behalf and then returns
- Extract metadata from the system call and send it through a BPF map, for tracing and to provide context
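
A minimal userspace sketch of the hook-plus-map idea above. This is only a Python model: real eBPF programs are compiled to BPF bytecode and attached inside the kernel. The names `events`, `trace_openat`, and `do_syscall` are hypothetical and chosen for illustration.

```python
from collections import Counter

# Userspace model only; nothing here talks to the kernel.
# `events` stands in for a BPF map that userspace could read later.
events = Counter()


def trace_openat(ctx):
    """Hypothetical 'program' attached to a syscall hook.

    It extracts metadata from the call context, records it in the map,
    and returns.
    """
    events[(ctx["pid"], ctx["syscall"])] += 1


def do_syscall(ctx, hooks):
    """Model of the kernel entering a system call with hooks attached."""
    for hook in hooks:
        hook(ctx)  # each attached program runs, then returns
    return 0       # the system call itself then completes


# Two calls from the same (made-up) process hit the hook twice.
do_syscall({"pid": 42, "syscall": "openat"}, [trace_openat])
do_syscall({"pid": 42, "syscall": "openat"}, [trace_openat])
```

After the two calls, `events` holds a count of 2 for the key `(42, "openat")`, which is the kind of data a tracing tool would read out of the map.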
eBPF runtime: how does that work?
- Runtime: ensures that all the programmability essentials covered earlier are fulfilled
- BPF bytecode: the compiled form of the program
  - Safety & security: the verifier rejects any unsafe program and provides a sandbox
    Major difference compared to a Linux kernel module
    Privilege and access / exposure control
    Similar to JavaScript (software-based sandbox)
  - Performance: JIT compiler --> ensures native execution performance
    Portable
  - Continuous Delivery
    Programs can be exchanged without disrupting workloads

What can you hook?
- Kernel functions (kprobes)
- Userspace functions (uprobes)
  - Functions in your application! Profile application
- System calls
- Tracepoints
  - Well-known hook points in the kernel whose names stay stable
  - Instrument the entire Linux kernel
- Network devices (TC / XDP)
- Network routes
- TCP congestion algorithms
- Sockets (data level)

BPF program: only instructions, no data
State is stored in BPF maps, separate from the programs
- Keep the maps alive, while replacing the programs
  - E.g. LPM (longest prefix match) map --> routing table
- Seamless upgrades
Maps are used to
- Retrieve data from programs and configure them
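
A rough userspace sketch of the two ideas above: a longest-prefix-match table acting as a routing table, and a program being replaced while the map stays alive. It is a Python model, not a kernel API. The names `routes`, `lpm_lookup`, `router_v1`, and `router_v2` are hypothetical, and the linear scan is a simplification of what an LPM map would do.

```python
import ipaddress

# Userspace model only. `routes` stands in for an LPM map: prefix -> next hop.
routes = {}


def lpm_lookup(table, addr):
    """Return the value of the longest prefix containing addr, or None."""
    ip = ipaddress.ip_address(addr)
    best = None
    for net, value in table.items():
        if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, value)
    return best[1] if best else None


def router_v1(addr):
    """First version of the 'program': plain lookup."""
    return lpm_lookup(routes, addr)


def router_v2(addr):
    """Replacement 'program': same map, new logic (default for unknown routes)."""
    return lpm_lookup(routes, addr) or "drop"


# State lives in the map, not in the programs.
routes[ipaddress.ip_network("10.0.0.0/8")] = "eth0"
routes[ipaddress.ip_network("10.1.0.0/16")] = "eth1"

program = router_v1
before = program("10.1.2.3")
program = router_v2  # swap the program; the map is untouched
after = program("10.1.2.3")
```

Both versions resolve `10.1.2.3` through the more specific `/16` entry, because the routing state was never stored in the program that got replaced.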

A Linux kernel module can call any kernel function
- Downsides: abuse or misuse can crash the kernel, and these internal functions are not stable
BPF programs
- Helpers: all interaction with the OS goes through helper functions, which are stable over time
- Portable across kernel versions

Tail calls: chain programs together; execution does not return to the calling program
- One hook can run multiple logical pieces
Tail / function calls
- Composable
- Reduce the size of the programs
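
A small userspace sketch of tail-call chaining as described above. It is a Python model under stated assumptions: `prog_array`, `tail_call`, `run`, and the three stage functions are hypothetical names, the `limit` value is an arbitrary cap for this model, and a missing index simply raises `KeyError` here rather than modelling kernel behaviour.

```python
# Userspace model only. `prog_array` stands in for a map of programs by index.
prog_array = {}


class TailCall(Exception):
    """Signals a jump to another program; the caller is abandoned."""

    def __init__(self, index):
        super().__init__(index)
        self.index = index


def tail_call(index):
    raise TailCall(index)  # control never comes back to the caller


def run(entry, ctx, limit=33):
    """Run entry, following tail calls up to `limit` programs in one chain."""
    prog = entry
    for _ in range(limit):
        try:
            return prog(ctx)
        except TailCall as tc:
            prog = prog_array[tc.index]
    raise RuntimeError("tail call limit exceeded")


def parse(ctx):
    ctx["trace"].append("parse")
    tail_call(1)
    ctx["trace"].append("unreachable")  # never runs: tail calls do not return


def policy(ctx):
    ctx["trace"].append("policy")
    tail_call(2)


def forward(ctx):
    ctx["trace"].append("forward")
    return "pass"


prog_array.update({1: policy, 2: forward})
ctx = {"trace": []}
verdict = run(parse, ctx)
```

Each stage stays small and can be replaced on its own by updating its slot in `prog_array`, which is the composability point made in the notes.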

Last updated 3 years ago