criteo / autofaiss
Automatically create Faiss k-NN indices with optimal similarity search parameters.
☆854 Updated last year
Alternatives and similar repositories for autofaiss
Users who are interested in autofaiss are comparing it to the libraries listed below.
- A library for building and serving multi-node distributed faiss indices. ☆266 Updated last year
- Some useful tips for faiss ☆622 Updated last year
- Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch ☆866 Updated last year
- DataComp: In search of the next generation of multimodal datasets ☆714 Updated last month
- Blazing fast framework for fine-tuning similarity learning models ☆656 Updated last month
- SPLADE: sparse neural search (SIGIR21, SIGIR22) ☆853 Updated last year
- Fast & Simple repository for pre-training and fine-tuning T5-style models ☆1,005 Updated 9 months ago
- A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets. ☆1,824 Updated this week
- Community-maintained faiss wheel builder ☆330 Updated last week
- ⚡ A fast embedded library for approximate nearest neighbor search ☆230 Updated last year
- Library for 8-bit optimizers and quantization routines. ☆716 Updated 2 years ago
- SGPT: GPT Sentence Embeddings for Semantic Search ☆868 Updated last year
- Code for fine-tuning Platypus fam LLMs using LoRA ☆628 Updated last year
- Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning ☆737 Updated 2 years ago
- Code repository for supporting the paper "Atlas Few-shot Learning with Retrieval Augmented Language Models",(https//arxiv.org/abs/2208.03… ☆537 Updated last year
- Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations. ☆1,847 Updated this week
- Code repository for the paper - "Matryoshka Representation Learning" ☆499 Updated last year
- OpenAI CLIP text encoders for multiple languages! ☆799 Updated 2 years ago
- Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackab… ☆1,570 Updated last year
- Task-based datasets, preprocessing, and evaluation for sequence models. ☆574 Updated 3 weeks ago
- Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input" ☆1,060 Updated last year
- Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025. ☆613 Updated 2 weeks ago
- Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data. ☆1,000 Updated 10 months ago
- Tools to download and cleanup Common Crawl data ☆1,013 Updated 2 years ago
- 🤖 A PyTorch library of curated Transformer models and their composable components ☆890 Updated last year
- A large-scale information-rich web dataset, featuring millions of real clicked query-document labels ☆328 Updated 5 months ago
- Generative Representational Instruction Tuning ☆640 Updated 2 months ago
- Easily compute clip embeddings and build a clip retrieval system with them ☆2,562 Updated last year
- An open collection of implementation tips, tricks and resources for training large language models ☆473 Updated 2 years ago
- CLIP-like model evaluation ☆721 Updated last week