ytgui / PilotANN
Memory-Bounded GPU Acceleration for Vector Search
☆27 Updated 5 months ago
Alternatives and similar repositories for PilotANN
Users who are interested in PilotANN are comparing it to the libraries listed below.
- Code repository for the paper - "AdANNS: A Framework for Adaptive Semantic Search" ☆65 Updated last year
- Implementation of "Efficient Multi-vector Dense Retrieval with Bit Vectors", ECIR 2024 ☆65 Updated 11 months ago
- [VLDB 25] Maximum Inner Product is Query-Scaled Nearest Neighbor ☆31 Updated 4 months ago
- Implementation of the paper "Lossless Compression of Vector IDs for Approximate Nearest Neighbor Search" by Severo et al. ☆82 Updated 8 months ago
- Compression for Foundation Models ☆35 Updated 2 months ago
- PipeRAG: Fast Retrieval-Augmented Generation via Algorithm-System Co-design (KDD 2025) ☆24 Updated last year
- Collection of datasets for benchmarking filtered vector similarity retrieval ☆50 Updated 3 months ago
- A library of algorithms for approximate nearest neighbor search in high dimensions, along with a set of useful tools for designing such a… ☆161 Updated last week
- XTR/WARP (SIGIR'25) is an extremely fast and accurate retrieval engine based on Stanford's ColBERTv2/PLAID and Google DeepMind's XTR. ☆165 Updated 4 months ago
- Graph Library for Approximate Similarity Search ☆131 Updated 2 weeks ago
- Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry ☆42 Updated last year
- Cascade Speculative Drafting ☆30 Updated last year
- Samples of good AI generated CUDA kernels ☆90 Updated 3 months ago
- Repo hosting code and materials related to speeding up LLM inference using token merging. ☆36 Updated 2 months ago
- Bamboo-7B Large Language Model ☆93 Updated last year
- Linear Attention Sequence Parallelism (LASP) ☆86 Updated last year
- Code for training and using chess embedding models ☆12 Updated last year
- CUDA implementation of Hierarchical Navigable Small World Graph algorithm ☆165 Updated 4 years ago
- Port of Facebook's LLaMA model in C/C++ ☆22 Updated last year
- RWKV-7: Surpassing GPT ☆95 Updated 10 months ago
- KV Cache Steering for Inducing Reasoning in Small Language Models ☆39 Updated 2 months ago
- GGNN: State of the Art Graph-based GPU Nearest Neighbor Search ☆164 Updated 7 months ago
- Official code for "Binary embedding based retrieval at Tencent" ☆43 Updated last year
- ☆24 Updated 5 months ago
- Lightweight toolkit package to train and fine-tune 1.58bit Language models ☆88 Updated 4 months ago
- XTR: Rethinking the Role of Token Retrieval in Multi-Vector Retrieval ☆58 Updated last year
- ⚡ Faster similarity search with PDX: A vertical data layout for vectors ☆57 Updated 3 weeks ago
- Repository for CPU Kernel Generation for LLM Inference ☆26 Updated 2 years ago
- ☆57 Updated 3 months ago
- Faster Learned Sparse Retrieval with Block-Max Pruning. ACM SIGIR 2024. ☆31 Updated last month