ytgui / PilotANN
Memory-Bounded GPU Acceleration for Vector Search
☆25 Updated 3 months ago
Alternatives and similar repositories for PilotANN
Users that are interested in PilotANN are comparing it to the libraries listed below
- Code repository for the paper - "AdANNS: A Framework for Adaptive Semantic Search" ☆65 Updated last year
- Compression for Foundation Models ☆33 Updated 3 months ago
- Implementation of "Efficient Multi-vector Dense Retrieval with Bit Vectors", ECIR 2024 ☆62 Updated 9 months ago
- Implementation of the paper "Lossless Compression of Vector IDs for Approximate Nearest Neighbor Search" by Severo et al. ☆80 Updated 5 months ago
- PipeRAG: Fast Retrieval-Augmented Generation via Algorithm-System Co-design (KDD 2025) ☆22 Updated last year
- ☆36 Updated 2 months ago
- Lottery Ticket Adaptation ☆39 Updated 7 months ago
- ☆14 Updated this week
- A fast header-only graph-based index for approximate nearest neighbor search (ANNS). https://flatnav.net ☆30 Updated 2 weeks ago
- ☆17 Updated this week
- Lightweight Llama 3 8B Inference Engine in CUDA C ☆47 Updated 3 months ago
- Samples of good AI generated CUDA kernels ☆84 Updated last month
- Repo hosting codes and materials related to speeding LLMs' inference using token merging. ☆36 Updated last year
- Code for "Accelerating Training with Neuron Interaction and Nowcasting Networks" [to appear at ICLR 2025] ☆19 Updated last month
- A lightweight, user-friendly data-plane for LLM training. ☆20 Updated 2 weeks ago
- Latent Large Language Models ☆18 Updated 10 months ago
- [ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts ☆40 Updated last year
- Official Repository for Task-Circuit Quantization ☆20 Updated last month
- Multi-Layer Key-Value sharing experiments on Pythia models ☆34 Updated last year
- Experiments to assess SPADE on different LLM pipelines.