eBPF: Rethinking the Linux Kernel


  • Programmability essentials

    • Safety --> sandboxing

    • Continuous Delivery --> deploy anytime with seamless upgrades

    • Performance --> native execution (JIT compiler)

  • Kernel architecture

    • Drivers abstract the hardware: the kernel enables the H/W but does not expose it directly

      • Block device, network device

    • System calls: what applications invoke to communicate with the kernel

    • Middle layer: the kernel's business logic

      • Virtual file system

      • TCP / IP

    • Last piece: configuration APIs, for whoever operates the system

      • Operators interact with the kernel through these APIs

Kernel Development 101

  • Option 1: native support

    • Change kernel source code

    • Expose configuration API

    • Wait 5 years for your users to upgrade

    • Cons: nobody wants to wait

  • Option 2: kernel module

    • Write kernel module

    • Any kernel release can break it

    • Cons

      • You will likely have to ship a different module for each kernel version

      • Might crash your kernel

How about we add JS-like capabilities to the Linux Kernel?

  • eBPF

    • Hook a system call: when it is invoked, an eBPF program runs on its behalf and then returns

    • The program can extract metadata from the system call and send it to user space through a BPF map, for tracing and to provide context

  • eBPF runtime: how does that work?

    • Runtime: ensures that all the programmability essentials covered earlier are fulfilled

    • BPF bytecode: the compiled form of the program's source code

      • Safety & security: the verifier rejects any unsafe program and provides a sandbox

        • Major difference compared to a Linux kernel module

        • Privileges and access control: limits what is exposed to the program

        • Similar to JS (software-based sandbox)

      • Performance: JIT compiler --> ensures native execution performance

        • Portable

      • Continuous Delivery

        • Programs can be exchanged without disrupting workloads

  • eBPF hooks: what can you hook?

    • Kernel functions (kprobes)

    • Userspace functions (uprobes)

      • Functions in your own application, e.g. to profile it

    • System calls

    • Tracepoints

      • Named instrumentation points in the kernel that stay stable across versions

      • Instrument the entire Linux kernel

    • Network devices (TC / XDP)

    • Network routes

    • TCP congestion algorithms

    • Sockets (data level)

eBPF Maps

  • BPF program: only instructions, no data

  • State is stored in BPF maps, separate from the programs

    • Keep the maps alive while replacing the programs

      • E.g. an LPM (longest-prefix-match) map used as a routing table

    • Seamless upgrades

  • Used to

    • Retrieve data from programs and configure them
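
The split between programs and maps can be illustrated with a small user-space model. This is only a sketch in plain Python, not real eBPF: `LpmMap`, `Hook`, `router_v1`, and `router_v2` are hypothetical names invented for the illustration. It shows a longest-prefix-match lookup used as a routing table, and a program being replaced while the map (the state) stays alive.

```python
import ipaddress


class LpmMap:
    """User-space stand-in for an LPM (longest-prefix-match) map."""

    def __init__(self):
        self._routes = {}  # ip_network -> next hop

    def update(self, cidr, next_hop):
        self._routes[ipaddress.ip_network(cidr)] = next_hop

    def lookup(self, addr):
        ip = ipaddress.ip_address(addr)
        matches = [net for net in self._routes if ip in net]
        if not matches:
            return None
        # The most specific (longest) matching prefix wins.
        best = max(matches, key=lambda net: net.prefixlen)
        return self._routes[best]


class Hook:
    """Hypothetical attachment point: holds one program plus its state."""

    def __init__(self, program, state):
        self.program = program
        self.state = state

    def replace(self, program):
        # Only the instructions change; the map is untouched.
        self.program = program

    def run(self, dst):
        return self.program(self.state, dst)


def router_v1(routes, dst):
    return routes.lookup(dst) or "drop"


def router_v2(routes, dst):
    hop = routes.lookup(dst)
    return f"forward:{hop}" if hop else "drop"


routes = LpmMap()
routes.update("10.0.0.0/8", "gw-a")
routes.update("10.1.0.0/16", "gw-b")

hook = Hook(router_v1, routes)
before = hook.run("10.1.2.3")   # the /16 route is more specific than the /8
hook.replace(router_v2)         # upgrade the program, keep the routes
after = hook.run("10.1.2.3")
```

The routes inserted before the upgrade are still visible to `router_v2`, which is the property that makes seamless upgrades possible.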

eBPF helpers

  • A Linux kernel module can call any kernel function

    • Downsides: misuse can crash the kernel, and internal kernel functions are not stable across versions

  • BPF programs

    • All interactions with the OS go through helpers

    • Helpers are stable over time

    • Programs are therefore portable across kernel versions
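
The contrast with kernel modules can be sketched as a toy model: a program may only call functions from a fixed helper table, and anything else is rejected before it runs. The instruction format, `verify`, and `load_and_run` are invented for this illustration; `bpf_ktime_get_ns` and `bpf_get_prandom_u32` are real helper names, but here they are backed by ordinary Python functions. The real verifier does far more (it analyzes control flow, register state, and memory access), so treat this as a rough analogy only.

```python
import random
import time

# Toy helper table: the only functions a "program" is allowed to call.
HELPERS = {
    "bpf_ktime_get_ns": time.monotonic_ns,
    "bpf_get_prandom_u32": lambda: random.getrandbits(32),
}


class VerifierError(Exception):
    pass


def verify(program):
    """Reject programs that do not end in exit or call a non-helper."""
    if not program or program[-1] != ("exit",):
        raise VerifierError("program must end with exit")
    for insn in program:
        op = insn[0]
        if op == "exit":
            continue
        if op != "call":
            raise VerifierError(f"unknown instruction: {op}")
        if insn[1] not in HELPERS:
            raise VerifierError(f"not a helper: {insn[1]}")


def load_and_run(program):
    verify(program)  # nothing executes unless verification passes
    results = []
    for op, *args in program:
        if op == "exit":
            break
        results.append(HELPERS[args[0]]())
    return results


good = [("call", "bpf_ktime_get_ns"), ("exit",)]
# Hypothetical name standing in for an arbitrary internal kernel function.
bad = [("call", "some_internal_kernel_function"), ("exit",)]

timestamps = load_and_run(good)
try:
    load_and_run(bad)
    rejected = False
except VerifierError:
    rejected = True
```

Because the helper table is the whole interface, a program that passes this check keeps working as long as the table keeps its entries, which is the portability argument made above.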

eBPF Tail and Function calls

  • Tail calls: chain programs together; control does not return to the calling program

    • Lets a single hook run multiple logical pieces in sequence

  • Tail / function calls

    • Composable

    • Reduce the size of the programs
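
Tail-call chaining can be modelled in user space as follows. This is a sketch under stated assumptions, not real eBPF: `TailCall`, `run_hook`, and the three stage functions are hypothetical, the dictionary stands in for a program array, and `MAX_TAIL_CALLS` is an illustrative constant standing in for the kernel's limit on chain depth. One simplification to note: here an empty slot ends the chain with a default verdict, whereas a failed tail call in real eBPF falls through to the caller's next instruction.

```python
from dataclasses import dataclass

MAX_TAIL_CALLS = 33  # illustrative stand-in for the kernel's limit


@dataclass
class TailCall:
    """Returned by a program to hand control to another slot."""
    slot: int


def parse(ctx):
    ctx["trace"].append("parse")
    return TailCall(1)


def policy(ctx):
    ctx["trace"].append("policy")
    if ctx["port"] == 22:
        return "drop"       # final verdict: the chain stops here
    return TailCall(2)


def forward(ctx):
    ctx["trace"].append("forward")
    return "pass"


prog_array = {0: parse, 1: policy, 2: forward}


def run_hook(prog_array, ctx, entry=0):
    prog = prog_array[entry]
    # A loop rather than nested calls: control never returns to the caller.
    for _ in range(MAX_TAIL_CALLS + 1):
        result = prog(ctx)
        if not isinstance(result, TailCall):
            return result
        prog = prog_array.get(result.slot)
        if prog is None:
            return "pass"   # simplification: empty slot ends the chain
    raise RuntimeError("tail call limit exceeded")


web = {"port": 80, "trace": []}
ssh = {"port": 22, "trace": []}
web_verdict = run_hook(prog_array, web)
ssh_verdict = run_hook(prog_array, ssh)
```

Each stage stays small and can be swapped independently by updating its slot, which is how tail calls keep individual programs composable and compact.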


Tracing & Profiling with eBPF

  • BCC: BPF Compiler Collection

    • Lets application developers write a Python program that embeds the actual BPF program, plus Python logic that reads state / metrics from BPF maps and displays them

  • bpftrace: DTrace for Linux

    • Handles creating BPF maps, reading them, and so on

  • Cilium: networking, load balancing, and security for Kubernetes

    • Network policies, Kubernetes services, and so on

    • Introspect the data
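
The BCC workflow described above might look roughly like the sketch below: a Python script embeds a small BPF program that counts `execve()` calls per process in a map, and Python reads the map afterwards. The names `BPF_TEXT`, `execs`, `count_execve`, and `run` are choices made for this example. Actually running it assumes BCC is installed, a kernel with eBPF support, and root privileges, so the `bcc` import is deferred until `run()` is called.

```python
import time

# BPF program in BCC's restricted C dialect, embedded as a string.
BPF_TEXT = r"""
BPF_HASH(execs, u32, u64);   // pid -> number of execve() calls

int count_execve(void *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    execs.increment(pid);
    return 0;
}
"""


def run(duration_s=5):
    """Attach the program, wait, then print the counts from the map."""
    # Imported lazily: requires BCC to be installed and root privileges.
    from bcc import BPF

    b = BPF(text=BPF_TEXT)
    b.attach_kprobe(event=b.get_syscall_fnname("execve"),
                    fn_name="count_execve")
    time.sleep(duration_s)
    for pid, count in b["execs"].items():
        print(f"pid {pid.value}: {count.value} execve() calls")
```

Calling `run()` as root on a machine with BCC would attach the probe for a few seconds and then print one line per process that called `execve()` in that window.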
