fpgasystems/Chameleon-RAG-Acceleration

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/fpgasystems/Chameleon-RAG-Acceleration)

fpgasystems / Chameleon-RAG-Acceleration

☆19

Alternatives and similar repositories for Chameleon-RAG-Acceleration

Users that are interested in Chameleon-RAG-Acceleration are comparing it to the libraries listed below

Sorting:

yinpeiqi / Gorgeous
View on GitHub
☆30Sep 13, 2025Updated 5 months ago
flashserve / PAT
View on GitHub
Prefix-Aware Attention for LLM Decoding
☆29Jan 23, 2026Updated last month
lzhangbv / acpsgd
View on GitHub
[ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning
☆10Apr 28, 2023Updated 2 years ago
YZ-Cai / Unified-Navigating-Graph
View on GitHub
Official implementation for paper "Navigating Labels and Vectors: A Unified Approach to Filtered Approximate Nearest Neighbor Search"
☆32Dec 21, 2024Updated last year
howarlii / SAQ
View on GitHub
Segmented Code Adjustment Quantization (SAQ)
☆17Sep 22, 2025Updated 5 months ago
HugoZHL / PQCache
View on GitHub
[SIGMOD 2025] PQCache: Product Quantization-based KVCache for Long Context LLM Inference
☆83Dec 7, 2025Updated 2 months ago
RC4ML / RPCNIC
View on GitHub
RPCNIC: A High-Performance and Reconfigurable PCIe-attached RPC Accelerator [HPCA2025]
☆13Dec 9, 2024Updated last year
thustorage / deft
View on GitHub
Deft: A Scalable Tree Index for Disaggregated Memory
☆23Apr 23, 2025Updated 10 months ago
kazukiosawa / pipe-fisher
View on GitHub
☆10Apr 29, 2023Updated 2 years ago
google / rago
View on GitHub
☆28Jun 22, 2025Updated 8 months ago
AIS-SNU / PathWeaver
View on GitHub
A High-Throughput Multi-GPU System for Graph-Based Approximate Nearest Neighbor Search
☆21Jul 22, 2025Updated 7 months ago
wangrunji0408 / rjrouter
View on GitHub
[AFK] Hardware router in Chisel (THU Network Joint Lab 2020)
☆14Oct 8, 2020Updated 5 years ago
InfiniTensor / TinyInfiniTrain
View on GitHub
训练营训练方向项目
☆26Jan 28, 2026Updated last month
gudiandian / ElasticFlow
View on GitHub
☆17May 10, 2024Updated last year
harvard-cns / Harvard-CNS-Seminar
View on GitHub
Reading seminar in Harvard Cloud Networking and Systems Group
☆16Aug 29, 2022Updated 3 years ago
zyqCSL / DiffKV
View on GitHub
☆38Oct 11, 2025Updated 4 months ago
Froot-NetSys / Arya
View on GitHub
Arya: Arbitrary Graph Pattern Mining with Decomposition-based Sampling
☆16Sep 27, 2023Updated 2 years ago
eth-easl / sailor
View on GitHub
AI model training on heterogeneous, geo-distributed resources
☆38Nov 24, 2025Updated 3 months ago
ziliuziliu / FaaSGraph
View on GitHub
☆22Mar 2, 2025Updated last year
liangyuRain / ForestColl
View on GitHub
☆16Apr 22, 2025Updated 10 months ago
vox-serve / vox-serve
View on GitHub
A Streaming-Native Serving Engine for TTS/STS Models
☆56Feb 22, 2026Updated last week
llm-db / FineInfer
View on GitHub
Deferred Continuous Batching in Resource-Efficient Large Language Model Serving (EuroMLSys 2024)
☆19May 28, 2024Updated last year
CaucherWang / Steiner-hardness
View on GitHub
A new query hardness measure for graph-based ANN indexes. Build unbiased workloads with this hardness to see the actual performance of yo…
☆22Feb 7, 2025Updated last year
Ying1123 / llm-caching-multiplexing
View on GitHub
☆20Jun 3, 2023Updated 2 years ago
marius-team / quake
View on GitHub
Query-Adaptive Vector Search
☆68Feb 13, 2026Updated 3 weeks ago
metacarbon / shareAtt
View on GitHub
Beyond KV Caching: Shared Attention for Efficient LLMs
☆20Jul 19, 2024Updated last year
dmemsys / Aceso
View on GitHub
This is the implementation repository of our SOSP'24 paper: Aceso: Achieving Efficient Fault Tolerance in Memory-Disaggregated Key-Value …
☆22Oct 20, 2024Updated last year
byteps / examples
View on GitHub
BytePS examples (Vision, NLP, GAN, etc)
☆19Nov 24, 2022Updated 3 years ago
PKUZHOU / NeoMem-MICRO-2024
View on GitHub
The Artifact of NeoMem: Hardware/Software Co-Design for CXL-Native Memory Tiering
☆63Aug 11, 2024Updated last year
mti-lab / rnn-descent
View on GitHub
☆24Apr 4, 2024Updated last year
S-Lab-System-Group / Hydro
View on GitHub
Surrogate-based Hyperparameter Tuning System
☆29Jun 29, 2023Updated 2 years ago
google / iopddl
View on GitHub
Supplemental materials for The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning
☆25May 12, 2025Updated 9 months ago
RLsys-Foundation / APRIL
View on GitHub
APRIL: Active Partial Rollouts in Reinforcement Learning to Tame Long-tail Generation. A system-level optimization for scalable LLM tra…
☆51Oct 11, 2025Updated 4 months ago
dywsjtu / apparate
View on GitHub
Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]
☆24Nov 21, 2024Updated last year
cornell-zhang / llm-datatypes
View on GitHub
Codebase for ICML'24 paper: Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
☆27Jun 25, 2024Updated last year
ucbrise / snoopy
View on GitHub
A high-throughput oblivious storage system
☆28May 31, 2023Updated 2 years ago
thustorage / PipeANN
View on GitHub
A low-latency, billion-scale, and updatable graph-based vector store on SSD.
☆99Feb 4, 2026Updated last month
phonism / genesis
View on GitHub
Gensis is a lightweight deep learning framework written from scratch in Python, with Triton as its backend for high-performance computing…
☆37Jan 15, 2026Updated last month
zilliztech / BBAnn
View on GitHub
Block-based Approximate Nearest Neighbor
☆35Nov 1, 2021Updated 4 years ago