ytgui / PilotANN
Memory-Bounded GPU Acceleration for Vector Search
☆24 Updated 2 months ago
Alternatives and similar repositories for PilotANN
Users that are interested in PilotANN are comparing it to the libraries listed below
- Code repository for the paper - "AdANNS: A Framework for Adaptive Semantic Search" ☆64 Updated last year
- A fast header-only graph-based index for approximate nearest neighbor search (ANNS). https://flatnav.net ☆27 Updated 2 weeks ago
- Implementation of "Efficient Multi-vector Dense Retrieval with Bit Vectors", ECIR 2024 ☆61 Updated 8 months ago
- Compression for Foundation Models ☆31 Updated 2 months ago
- Implementation of the paper "Lossless Compression of Vector IDs for Approximate Nearest Neighbor Search" by Severo et al. ☆80 Updated 4 months ago
- A library of algorithms for approximate nearest neighbor search in high dimensions, along with a set of useful tools for designing such a… ☆145 Updated last week
- PipeRAG: Fast Retrieval-Augmented Generation via Algorithm-System Co-design (KDD 2025) ☆21 Updated 11 months ago
- Collection of datasets for benchmarking filtered vector similarity retrieval ☆43 Updated this week
- state-of-the-art search over vector embeddings and structured data (SIGMOD '24) ☆79 Updated 3 months ago
- PostText is a QA system for querying your text data. When appropriate structured views are in place, PostText is good at answering querie… ☆32 Updated last year
- Samples of good AI generated CUDA kernels ☆65 Updated last week
- Port of Facebook's LLaMA model in C/C++ ☆21 Updated last year
- ⚡ Faster vector search with PDX: A vertical data layout for vectors ☆37 Updated 2 weeks ago
- A repository for research on medium sized language models. ☆76 Updated last year
- Official Repository for Task-Circuit Quantization ☆20 Updated this week
- BH hackathon ☆14 Updated last year
- Latent Large Language Models ☆18 Updated 9 months ago
- Bamboo-7B Large Language Model ☆93 Updated last year
- Graph Library for Approximate Similarity Search ☆118 Updated 2 weeks ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite. ☆33 Updated last year
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆22 Updated 6 months ago
- Lightweight Llama 3 8B Inference Engine in CUDA C ☆47 Updated 2 months ago
- ☆10 Updated last year
- Repo hosting codes and materials related to speeding LLMs' inference using token merging. ☆36 Updated last year
- Algorithms for approximate nearest neighbor search with window filters ☆40 Updated last year
- This is a fork of SGLang for hip-attention integration. Please refer to hip-attention for detail. ☆13 Updated this week
- ☆33 Updated last month
- Repository related to the Dynamic Exploration Graph and its previous iterations. ☆25 Updated 2 months ago
- ☆163 Updated this week
- Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry ☆42 Updated last year