facebookresearch / distributed-faiss
A library for building and serving multi-node distributed faiss indices.
☆265Updated last year
Alternatives and similar repositories for distributed-faiss
Users who are interested in distributed-faiss are comparing it to the libraries listed below.
- Automatically create Faiss knn indices with the most optimal similarity search parameters.☆854Updated 11 months ago
- Some useful tips for faiss☆620Updated last year
- Scalable training for dense retrieval models.☆292Updated 2 months ago
- A large-scale information-rich web dataset, featuring millions of real clicked query-document labels☆326Updated 5 months ago
- ⚡ A fast embedded library for approximate nearest neighbor search☆230Updated last year
- This project studies the performance and robustness of language models and task-adaptation methods.☆150Updated 11 months ago
- Code used for sourcing and cleaning the BigScience ROOTS corpus☆311Updated 2 years ago
- DSIR large-scale data selection framework for language model training☆247Updated last year
- Build Text Rerankers with Deep Language Models☆262Updated last year
- Inquisitive Parrots for Search☆191Updated last year
- HNSW implemented in Python☆66Updated 5 years ago
- Code for "SemDeDup", a simple method for identifying and removing semantic duplicates from a dataset (data pairs which are semantically s…☆136Updated last year
- experiments with inference on llama☆104Updated 11 months ago
- ☆411Updated last year
- A large-scale multilingual dataset for Information Retrieval. Thorough human-annotations across 18 diverse languages.☆185Updated 9 months ago
- Code repository for supporting the paper "Atlas: Few-shot Learning with Retrieval Augmented Language Models", (https://arxiv.org/abs/2208.03…☆537Updated last year
- Code repository for the paper - "AdANNS: A Framework for Adaptive Semantic Search"☆64Updated last year
- Framework for evaluating ANNS algorithms on billion scale datasets.☆376Updated 2 weeks ago
- Binary Passage Retriever (BPR) - an efficient passage retriever for open-domain question answering☆170Updated 3 years ago
- Fast Inference Solutions for BLOOM☆561Updated 7 months ago
- Provides a common interface to many IR ranking datasets.☆352Updated last week
- Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks☆208Updated last year
- docTTTTTquery document expansion model☆365Updated 2 years ago
- Search Engines with Autoregressive Language models☆285Updated 2 years ago
- Scaling Data-Constrained Language Models☆334Updated 7 months ago
- CUDA implementation of Hierarchical Navigable Small World Graph algorithm☆158Updated 4 years ago
- Run Effective Large Batch Contrastive Learning Beyond GPU/TPU Memory Constraint☆387Updated last year
- What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets☆218Updated 6 months ago
- Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025.☆592Updated this week
- A novel embedding training algorithm that leverages ANN search and achieves SOTA retrieval on TREC DL 2019 and OpenQA benchmarks☆371Updated last year