facebookresearch / distributed-faiss
A library for building and serving multi-node distributed faiss indices.
☆268 · Updated last year
Alternatives and similar repositories for distributed-faiss
Users who are interested in distributed-faiss are comparing it to the libraries listed below.
- Automatically create Faiss knn indices with the most optimal similarity search parameters. ☆859 · Updated last year
- ⚡ A fast embedded library for approximate nearest neighbor search ☆230 · Updated last year
- Some useful tips for faiss ☆622 · Updated last year
- A large-scale information-rich web dataset, featuring millions of real clicked query-document labels ☆331 · Updated 6 months ago
- Scalable training for dense retrieval models. ☆298 · Updated 2 weeks ago
- CUDA implementation of Hierarchical Navigable Small World Graph algorithm ☆159 · Updated 4 years ago
- Running BERT without Padding ☆471 · Updated 3 years ago
- Build Text Rerankers with Deep Language Models ☆263 · Updated last year
- Code used for sourcing and cleaning the BigScience ROOTS corpus ☆313 · Updated 2 years ago
- Framework for evaluating ANNS algorithms on billion scale datasets. ☆379 · Updated last month
- DSIR large-scale data selection framework for language model training ☆251 · Updated last year
- Run Effective Large Batch Contrastive Learning Beyond GPU/TPU Memory Constraint ☆395 · Updated last year
- Fast Inference Solutions for BLOOM ☆564 · Updated 8 months ago
- The Triton backend for the ONNX Runtime. ☆153 · Updated last week
- Task-based datasets, preprocessing, and evaluation for sequence models. ☆582 · Updated last month
- Search Engines with Autoregressive Language models ☆288 · Updated 2 years ago
- Knowhere is an open-source vector search engine, integrating FAISS, HNSW, etc. ☆211 · Updated last year
- Pure python implementation of product quantization for nearest neighbor search ☆345 · Updated 2 weeks ago
- Code repository for the paper - "Matryoshka Representation Learning" ☆507 · Updated last year
- Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks ☆208 · Updated last year
- Code repository for the paper - "AdANNS: A Framework for Adaptive Semantic Search" ☆64 · Updated last year
- Code repository for supporting the paper "Atlas Few-shot Learning with Retrieval Augmented Language Models",(https//arxiv.org/abs/2208.03… ☆539 · Updated last year
- Multipack distributed sampler for fast padding-free training of LLMs ☆191 · Updated 10 months ago
- Code for "SemDeDup", a simple method for identifying and removing semantic duplicates from a dataset (data pairs which are semantically s… ☆136 · Updated last year
- Codebase for RetroMAE and beyond. ☆263 · Updated last year
- Pipeline for pulling and processing online language model pretraining data from the web ☆178 · Updated last year
- An efficient implementation of the popular sequence models for text generation, summarization, and translation tasks. https://arxiv.org/p… ☆433 · Updated 2 years ago
- The pipeline for the OSCAR corpus ☆169 · Updated last year