ytgui / PilotANN
Memory-Bounded GPU Acceleration for Vector Search
☆28 Updated last week
Alternatives and similar repositories for PilotANN
Users that are interested in PilotANN are comparing it to the libraries listed below
- Implementation of the paper "Lossless Compression of Vector IDs for Approximate Nearest Neighbor Search" by Severo et al. ☆82 Updated 9 months ago
- Code repository for the paper "AdANNS: A Framework for Adaptive Semantic Search" ☆65 Updated 2 years ago
- [VLDB 25] Maximum Inner Product is Query-Scaled Nearest Neighbor ☆32 Updated 5 months ago
- Implementation of "Efficient Multi-vector Dense Retrieval with Bit Vectors", ECIR 2024 ☆66 Updated last week
- Compression for Foundation Models ☆35 Updated 3 months ago
- PipeRAG: Fast Retrieval-Augmented Generation via Algorithm-System Co-design (KDD 2025) ☆26 Updated last year
- Graph Library for Approximate Similarity Search ☆132 Updated last month
- ☆26 Updated 6 months ago
- XTR/WARP (SIGIR'25) is an extremely fast and accurate retrieval engine based on Stanford's ColBERTv2/PLAID and Google DeepMind's XTR. ☆168 Updated 5 months ago
- Samples of good AI generated CUDA kernels ☆91 Updated 4 months ago
- ☆18 Updated last month
- ☆58 Updated 5 months ago
- A library of algorithms for approximate nearest neighbor search in high dimensions, along with a set of useful tools for designing such a… ☆163 Updated last month
- AskIt: Unified programming interface for programming with LLMs (GPT-3.5, GPT-4, Gemini, Claude, Cohere, Llama 2) ☆79 Updated 9 months ago
- Bamboo-7B Large Language Model ☆93 Updated last year
- A fast header-only graph-based index for approximate nearest neighbor search (ANNS). https://flatnav.net ☆36 Updated 3 months ago
- Scalable long-context LLM decoding that leverages sparsity by treating the KV cache as a vector storage system. ☆94 Updated last month
- Large Scale Search Index ☆31 Updated 2 years ago
- CUDA implementation of Hierarchical Navigable Small World Graph algorithm ☆167 Updated 4 years ago
- Cascade Speculative Drafting ☆31 Updated last year
- Collection of datasets for benchmarking filtered vector similarity retrieval ☆52 Updated 4 months ago
- ⚡ Faster similarity search with PDX: A vertical data layout for vectors ☆58 Updated 2 months ago
- Faster Learned Sparse Retrieval with Block-Max Pruning. ACM SIGIR 2024. ☆31 Updated last month
- ☆191 Updated this week
- Official code for "Binary embedding based retrieval at Tencent" ☆43 Updated last year
- Self-host LLMs with LMDeploy and BentoML ☆21 Updated 3 months ago
- Modular and structured prompt caching for low-latency LLM inference ☆101 Updated 11 months ago
- Code for the paper: CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models ☆29 Updated 6 months ago
- ☆82 Updated 11 months ago
- GGNN: State of the Art Graph-based GPU Nearest Neighbor Search ☆165 Updated 8 months ago