Billion-scale Approximate Nearest Neighbor Search
https://speakerdeck.com/matsui_528/cvpr20-tutorial-billion-scale-approximate-nearest-neighbor-search
Recorded video
Nearest Neighbor Search: NN

Naive implementation

Faiss implementation







Approximate Nearest Neighbor Search: ANN

Application

Locality Sensitive Hashing (LSH)



Falconn

Tree / Space Partitioning
FLANN

Annoy



Graph Traversal
Record Phase

Search Phase

Extension: Hierarchical NSW; HNSW

NMSLIB (Non-Metric Space Library)

Other implementation of HNSW
Other graph-based approaches

Compressed data
Basic idea



Product Quantization: PQ








Deep PQ


Hamming-based v.s Lookup-based

Billion-scale

Inverted Index + PQ: Record



Inverted Index + PQ: Search

Faiss


Graph Traversal


Cheat sheet of choosing index

Benchmark



Nearest neighbor search engine

Problems of ANN

PreviousQuery2box: Reasoning over Knowledge Graphs in Vector Space using Box EmbeddingsNextDiskANN: Fast accurate billion-point nearest neighbor search on a single node
Last updated