facebookresearch / distributed-faiss
A library for building and serving multi-node distributed faiss indices.
☆273 Updated 2 years ago
Alternatives and similar repositories for distributed-faiss
Users who are interested in distributed-faiss are comparing it to the libraries listed below.
- Some useful tips for faiss ☆629 Updated 3 months ago
- Automatically create Faiss knn indices with the most optimal similarity search parameters. ☆890 Updated last month
- ⚡ A fast embedded library for approximate nearest neighbor search ☆235 Updated 2 years ago
- hnsw implemented by python ☆71 Updated 6 years ago
- A large-scale information-rich web dataset, featuring millions of real clicked query-document labels ☆345 Updated last year
- Scalable training for dense retrieval models. ☆298 Updated 6 months ago
- An efficient implementation of the popular sequence models for text generation, summarization, and translation tasks. https://arxiv.org/p… ☆433 Updated 3 years ago
- A large-scale multilingual dataset for Information Retrieval. Thorough human-annotations across 18 diverse languages. ☆197 Updated last year
- Code used for sourcing and cleaning the BigScience ROOTS corpus ☆317 Updated 2 years ago
- Run Effective Large Batch Contrastive Learning Beyond GPU/TPU Memory Constraint ☆420 Updated last year
- Search Engines with Autoregressive Language models ☆294 Updated 2 years ago
- A minimal PyTorch Lightning OpenAI GPT w DeepSpeed Training! ☆113 Updated 2 years ago
- XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale ☆157 Updated 2 years ago
- Code repository for the paper - "Matryoshka Representation Learning" ☆586 Updated last year
- The pipeline for the OSCAR corpus ☆174 Updated last month
- Binary Passage Retriever (BPR) - an efficient passage retriever for open-domain question answering ☆174 Updated 4 years ago
- Inquisitive Parrots for Search ☆199 Updated 6 months ago
- Pipeline for pulling and processing online language model pretraining data from the web ☆178 Updated 2 years ago
- Pure python implementation of product quantization for nearest neighbor search ☆356 Updated 6 months ago
- A multilingual version of MS MARCO passage ranking dataset ☆145 Updated 2 years ago
- Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks ☆210 Updated last year
- Task-based datasets, preprocessing, and evaluation for sequence models. ☆591 Updated last month
- This project studies the performance and robustness of language models and task-adaptation methods. ☆155 Updated last year
- ☆413 Updated 2 years ago
- Efficient, check-pointed data loading for deep learning with massive data sets. ☆210 Updated 2 years ago
- ☆87 Updated 3 years ago
- Build Text Rerankers with Deep Language Models ☆263 Updated last year
- docTTTTTquery document expansion model ☆373 Updated 2 years ago
- A scalable & efficient active learning/data selection system for everyone. ☆217 Updated last year
- Code for "SemDeDup", a simple method for identifying and removing semantic duplicates from a dataset (data pairs which are semantically s… ☆150 Updated 2 years ago