Mycelium: Large-Scale Distributed Graph Queries with Differential Privacy


  • Scenarios: infection (contact-tracing) system

    • Sensitive data in distributed fashion

  • Differential privacy (DP)

    • Guarantee that output of queries doesn't leak (too much) information about participating users

    • Add randomness to the system

    • But doesn't tell us how to compute this securely!

  • Distributed protocol

    • Gives a decentralized solution without placing trust in any party, can handle a variety of ML queries

    • But:

      • doesn't handle graph queries where data isn't nicely broken down by user

      • protect the adversary from learning the graph (which is sensitive too)

      • need to make sure participants don't learn their neighbors' sensitive data in the process

  • Insights

    • Many queries fit a 2-step structure

      • compute some statistic for each user based on that immediate neighborhood

      • aggregate the statistic across the entire user base

    • We can route user communication using a semi-centralized mix network

    • Homomorphic encryption (FHE) for node communication to protect neighbor data

      • Allows for local aggregation and binning in each neighborhood

Last updated