Design and Evaluation of IPFS: A Storage Layer for the Decentralized Web

Problem

  • Growing centralization of web systems creates single points of organizational failure. The "decentralized web" movement therefore aims to decentralize traditional web functionality (e.g. name lookup, hosting, certification) so that no individual administrative entity can hamper overall operations or design decisions. A core problem for any web platform is storing and serving media objects at scale.

Main insight

  • This paper presents the design, implementation, and deployment of the InterPlanetary File System (IPFS), an entirely decentralized, content-addressable platform for storing and retrieving media objects. It relies on four main concepts:

    • 1) Content-based addressing: IPFS detaches object names from host locations, enabling objects to be served from any peer

    • 2) Decentralized object indexing: IPFS relies on a decentralized P2P overlay to index all available locations from which objects can be retrieved, reducing the impact of technical or organizational failures

    • 3) Immutability and self-certification: IPFS relies on cryptographic hashing to self-certify objects, which removes the need for certificate-based authentication and makes content verifiable

    • 4) Open participation: anybody can deploy an IPFS node and participate in the network without requiring special permissions or privileges

  • The paper also discusses the accompanying measurement tooling, which provides a vantage point on the deployment, usage, and performance of the IPFS network's decentralized operations.
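
  • To make concepts 1 and 3 above concrete for myself, here is a minimal sketch of content addressing with self-certification. It is a toy model, not the IPFS implementation: ToyPeer, content_id, and fetch_verified are hypothetical names, and the identifier is a plain hex SHA-256 digest, whereas real IPFS CIDs use a self-describing multihash encoding.

```python
import hashlib
from typing import Dict, List, Optional


def content_id(data: bytes) -> str:
    """Toy content identifier: hex SHA-256 of the bytes (assumption, not a real CID)."""
    return hashlib.sha256(data).hexdigest()


class ToyPeer:
    """Hypothetical peer holding blocks keyed by their content identifier."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks: Dict[str, bytes] = {}

    def add(self, data: bytes) -> str:
        # The name is derived from the content, not from where it is hosted.
        cid = content_id(data)
        self.blocks[cid] = data
        return cid

    def get(self, cid: str) -> Optional[bytes]:
        return self.blocks.get(cid)


def fetch_verified(cid: str, peers: List[ToyPeer]) -> Optional[bytes]:
    """Ask peers in order; accept the first response whose hash matches the id."""
    for peer in peers:
        data = peer.get(cid)
        # Self-certification: re-hash what was received and compare to the id,
        # so an untrusted peer cannot substitute different bytes.
        if data is not None and content_id(data) == cid:
            return data
    return None


if __name__ == "__main__":
    honest, tampering = ToyPeer("honest"), ToyPeer("tampering")
    cid = honest.add(b"hello ipfs")
    tampering.blocks[cid] = b"something else"  # wrong bytes under the same id
    print(fetch_verified(cid, [tampering, honest]))
```

  • The point of the sketch is that the requester needs no certificate or trust in the serving peer: any peer can answer, and a wrong answer is detected by re-hashing.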

Key strength

  • The design decisions of IPFS consistently reflect the goal of high decentralization, and the overall structure makes sense.

  • The paper reveals some interesting findings. For example, fewer than 2.3% of IPFS nodes run on major cloud platforms; most run on personal or on-premises commodity hardware, which leads to a high churn rate. There is also consolidation in a minority of ASes (the top 10 ASes alone contain 64.9% of peers).

  • I really like that they make the evaluation data and tooling publicly available (with the datasets anonymized).

  • The measurement methodologies could also help in gaining greater longitudinal insight into the scale and performance of the various components of the IPFS architecture.

Key weakness

  • What about comparisons to other designs? Could there be a more detailed component-wise analysis, even in simulation? The paper touches on this briefly in the Section 6.1 takeaways, but a more comprehensive set of comparisons across different workload scenarios would be interesting.

  • The paper states that IPFS relies on a decentralized P2P overlay so that the impact of technical or organizational failures is reduced, but is failure actually measured or evaluated anywhere in the paper?

Comments

  • I wonder why Contabo GmbH has a higher share than the other cloud providers according to Table 3?

  • What should we make of the claim that "IPFS is designed to offer publication and retrieval delays capable of supporting a range of applications"? What are example applications, and what are their characteristics? How does IPFS perform for different applications?

  • Could we make use of the available distribution of requests shown in the map?

  • This figure is a really interesting one to show (can be useful later)
