Billion-scale Approximate Nearest Neighbor Search

Recorded video

Nearest Neighbor Search: NN

  • N D-dim database vectors

  • Given a query q, find the closest vector from the database

  • Solution: linear scan, O(ND), slow

  • Should try this first of all

Naive implementation

Faiss implementation

  • M = number of query vectors

  • SIMD (Single-Instruction Multiple Data)

    • also known as vectorization

  • BLAS (Basic Linear Algebra Subprograms)

  1. SIMD

Then go into the d >= 4 and d > 0 and the end of the function

  • SIMD codes of faiss are simple and easy to read

  • Being able to read SIMD codes comes in handy sometimes; why this impl is super fast


  • Norm: Euclidean norm

  • k-means: one of the benchmarks for KNN

  • Faster if using multiple GPU

Some useful reference

Approximate Nearest Neighbor Search: ANN

  • Faster search

  • Trade-off: runtime, accuracy, and memory consumption

  • A sense of scale: billion-scale data on memory


  • NN/ANN for CV

    • Image retrieval

    • Person re-identification

    • Clustering

    • kNN recognition

  • Fast construction of bag-of-features

  • One of the benchmarks is still SIFT

Locality Sensitive Hashing (LSH)

Hash function that we use, generally speaking:

Good sides:

  • Math-friendly

  • Popular in the theory area (FOCS, STOC, ...)

Bad sides:

  • Large memory cost

    • Need several tables to boost the accuracy

    • Need to store the original data (# = N) on memory

  • Data-dependent methods such as PQ are better for real-world data

  • Thus, in recent CV papers, LSH has been treated as a classic method


Tree / Space Partitioning



Graph Traversal

  • Very popular in recent years

  • Around 2017, it turned out that the graph-traversal-based methods work well for million-scale data

  • Pioneer

    • Navigable Small World Graphs (NSW)

    • Hierarchical NSW (HNSW)

  • Implementation: nmslib, hnsw, faiss

Record Phase

  • Early links can be long

    • In the early phase, there are not many vectors

  • Such long links encourage a large hop, making the fast convergence for search

Search Phase

Task: find the similar node for the query node

  • Then, traverse in a greedy manner

Extension: Hierarchical NSW; HNSW

  • Sub-sample from bottom to the top

NMSLIB (Non-Metric Space Library)

Other implementation of HNSW

  • Hnswlib

    • Spin-off library from nmslib

    • Include only hnsw

    • Simpler; may be useful if you want to extend hnsw

  • Faiss

    • Libraries for PQ-based methods

    • This lib also include hnsw

Other graph-based approaches

Compressed data

Basic idea

Product Quantization: PQ

  • Split the vector into three parts

  • For this first 2-D vector, find the closest in this codebook

  • Then, record the ID

  1. Memory Efficient

2. Distance Estimation

  • First, apply product quantization on the database

  • Then, get the PQ code

Deep PQ

Hamming-based v.s Lookup-based


Inverted Index + PQ: Record


Graph Traversal

Cheat sheet of choosing index


Nearest neighbor search engine

Problems of ANN

Last updated