Reranking is a crucial step in the retrieval process of a vector database. After an initial retrieval of relevant items based on vector similarity, reranking involves reordering the retrieved items using more sophisticated models or additional features to improve the relevance of the results.
| Type | Accuracy (typical NDCG@10) | Latency | Cost | Best For | Drawbacks |
|---|---|---|---|---|---|
| Cross-Encoder | High (0.80–0.90+) | Medium (50–500 ms per pair) | Low (self-hosted) / Medium (API) | Production RAG, most common choice | Slower than bi-encoders |
| Lightweight / Optimized Cross-Encoder | High | Low–Medium | Very low | High-throughput, edge/AMD GPU setups | Slightly lower peak accuracy |
| LLM-based / Pointwise | Medium–High | High (1–5s+) | High (tokens) | Complex reasoning, zero-shot | Expensive, slow for scale |
| Multi-Vector / Late Interaction (e.g., ColBERT-style) | High | Medium | Low–Medium | Long docs, precise matching | More complex indexing |
These show up consistently in benchmarks (MTEB-R, CMTEB-R, BEIR variants, internal NDCG/MRR gains), leaderboards, Hugging Face trending, and production RAG stacks (LangChain, LlamaIndex, Haystack, etc.):
| Model | Organization | Architecture | Typical Use | Notes |
|---|---|---|---|---|
| Cohere Rerank | Cohere | Cross-encoder | RAG ranking | API-based, very popular |
| bge-reranker-large | Beijing Academy of Artificial Intelligence | Cross-encoder | Dense retrieval reranking | Very strong open-source model |
| monoT5 | Research (Castorini) | T5 seq2seq | Passage ranking | Strong MS MARCO performance |
| duoT5 | Research (Castorini) | Pairwise T5 | Passage ranking | More accurate but slower |
| ColBERT | Stanford | Late-interaction transformer | Retrieval + reranking | Very high ranking accuracy |
| cross-encoder/ms-marco-MiniLM-L-6-v2 | Hugging Face | Cross-encoder | Lightweight reranking | Fast and widely used |
| Jina Reranker v2 | Jina AI | Cross-encoder | Multilingual RAG | High multilingual accuracy |
| RankT5 | Google Research | T5 ranking model | Search ranking | Used in ranking tasks |