amazon-science/piperag

PipeRAG: Fast Retrieval-Augmented Generation via Algorithm-System Co-design (KDD 2025)

☆32

Alternatives and similar repositories for piperag

Users who are interested in piperag are comparing it to the repositories listed below.

Liu-Cheng / graph_accelerator
Graph accelerator on FPGAs and ASICs
☆11 · Aug 16, 2018 · Updated 7 years ago
microsoft / RetrievalAttention
[VLDB 26, NeurIPS 25] Scalable long-context LLM decoding that leverages sparsity by treating the KV cache as a vector storage system.
☆147 · Feb 22, 2026 · Updated 5 months ago
Patrick-H-Chen / FINGER
☆13 · Jan 1, 2024 · Updated 2 years ago
TobiasNorlund / retro
Official repo for "On the Generalization Ability of Retrieval-Enhanced Transformers"
☆47 · Jun 4, 2024 · Updated 2 years ago
WenqiJiang / SC-ANN-FPGA
☆26 · May 30, 2025 · Updated last year
YZ-Cai / Unified-Navigating-Graph
Official implementation for paper "Navigating Labels and Vectors: A Unified Approach to Filtered Approximate Nearest Neighbor Search"
☆38 · Dec 21, 2024 · Updated last year
wu-kan / wuk_cupti_wrapper
a simple API to use CUPTI
☆10 · Aug 19, 2025 · Updated 11 months ago
jalvarm / hcnng
This is the implementation of the Hierarchical Clustering-based Nearest Neighbor Graphs
☆22 · Feb 6, 2020 · Updated 6 years ago
sail-sg / VocabularyParallelism
Vocabulary Parallelism
☆26 · Mar 10, 2025 · Updated last year
MooreThreads / TurboRAG
☆103 · Nov 25, 2024 · Updated last year
DiT-Serving / TetriServe
[ASPLOS '26] TetriServe: Efficiently Serving Mixed DiT Workloads
☆17 · Mar 12, 2026 · Updated 4 months ago
kunrenyale / CalvinFS
CalvinFS project using C/C++
☆12 · May 25, 2017 · Updated 9 years ago
ZJU-DAILY / PSP
[VLDB 25] Maximum Inner Product is Query-Scaled Nearest Neighbor
☆40 · Oct 31, 2025 · Updated 8 months ago
larsgottesbueren / gp-ann
Experimental Code for "Unleashing Graph Partitioning for Large-Scale Nearest Neighbor Search"
☆30 · Nov 4, 2024 · Updated last year
Leo9660 / HedraRAG_AE
Artifact Evaluation for SOSP 2025
☆21 · Aug 16, 2025 · Updated 11 months ago
YaoJiayi / CacheBlend
☆199 · Jul 15, 2025 · Updated last year
fpgasystems / Chameleon-RAG-Acceleration
☆23 · Jun 1, 2025 · Updated last year
TNAS-DCS / TNAS-DCS
☆13 · Aug 9, 2022 · Updated 3 years ago
SJTU-IPADS / MetaAttention
MetaAttention: A Unified and Performant Attention Framework Across Hardware Backends (PPoPP '26)
☆16 · Dec 31, 2025 · Updated 6 months ago
radixark / miles_diffusion
[Experimental] Miles-diffusion is a post-training framework for large-scale diffusion model training and production workloads, forked fr…
☆22 · Updated this week
oliverYoung2001 / UltraAttn
SC'25 UltraAttn: Efficiently Parallelizing Attention through Hierarchical Context-Tiling
☆16 · Aug 14, 2025 · Updated 11 months ago
wangshicheng1225 / LoRDMA
☆13 · Oct 21, 2023 · Updated 2 years ago
Scientific-Computing-Lab / MPI-rigen
MPI Code Generation through Domain-Specific Language Models
☆16 · Nov 19, 2024 · Updated last year
shengshu-ai / TurboServe
TurboServe: Serving Streaming Video Generation Efficiently and Economically
☆34 · Jul 12, 2026 · Updated last week
Froot-NetSys / Arya
Arya: Arbitrary Graph Pattern Mining with Decomposition-based Sampling
☆18 · Sep 27, 2023 · Updated 2 years ago
mediroozmeh / FPGA_BitonicSorting
Implementation of the BitonicSorting algorithm on FPGA through SDAccel using OpenCL as source code
☆17 · Nov 21, 2016 · Updated 9 years ago
Trinity-data-store / Trinity
EuroSys '24: "Trinity: A Fast Compressed Multi-attribute Data Store"
☆18 · Mar 8, 2025 · Updated last year
uw-syfi / TraceLab
An open toolkit and public dataset hub for collecting, sanitizing, analyzing, and visualizing coding agent traces.
☆50 · Jul 2, 2026 · Updated 2 weeks ago
google / rago
☆31 · Jun 22, 2025 · Updated last year
cyhdmjzzy / DeepEP-Code-Analysis
☆26 · Feb 27, 2026 · Updated 4 months ago
thomaschlt / mla.c
Implementation from scratch in C of the Multi-head latent attention used in the Deepseek-v3 technical paper.
☆18 · Jan 15, 2025 · Updated last year
iHeartGraph / Enterprise
Enterprise: Breadth-First Graph Traversal on GPUs. SC'15.
☆33 · May 20, 2017 · Updated 9 years ago
AlbertoParravicini / approximate-spmv-topk
Public repository for the DAC 2021 paper "Scaling up HBM Efficiency of Top-K SpMV for Approximate Embedding Similarity on FPGAs"
☆16 · Aug 29, 2021 · Updated 4 years ago
SNU-ARC / atc21-asap-kernel
☆15 · Jan 24, 2022 · Updated 4 years ago
SLIT-AI / WRPO
[ICLR 2025] Weighted-Reward Preference Optimization for Implicit Model Fusion
☆14 · Mar 17, 2025 · Updated last year
DominikHorn / hashing-benchmark
benchmark driver for "Can Learned Models Replace Hash Functions?" VLDB submission
☆16 · Oct 31, 2023 · Updated 2 years ago
Terra-Flux / PolyRL
[NSDI'26] PolyRL is a reinforcement learning framework for LLMs that harvests spot instances on the cloud to reduce cost.
☆19 · Mar 30, 2026 · Updated 3 months ago
amazon-science / comm-prompt
CoMM: Collaborative Multi-Agent, Multi-Reasoning-Path Prompting for Complex Problem Solving (NAACL 2024 Findings)
☆16 · Apr 26, 2024 · Updated 2 years ago
horizon-research / Efficient-Deep-Learning-for-Point-Clouds
☆49 · Apr 22, 2021 · Updated 5 years ago