eunomia-bpf / eGPU
Extending eBPF Programmability and Observability to GPUs (merged into https://github.com/eunomia-bpf/bpftime)
☆267Updated last week
Alternatives and similar repositories for eGPU
Users who are interested in eGPU are comparing it to the libraries listed below.
- CXL remote offloading data movement aware compiler☆30Updated last month
- Heterogeneous Containerization of Large Language Model Apps☆107Updated 3 months ago
- UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g…☆1,066Updated last week
- [NeurIPS 2025] R-KV: Redundancy-aware KV Cache Compression for Reasoning Models☆1,148Updated last month
- Expert Kit is an efficient foundation of Expert Parallelism (EP) for MoE model inference on heterogeneous hardware☆59Updated 3 weeks ago
- Some Hardware Architectures for GEMM☆283Updated 6 months ago
- PTX on XPUs☆109Updated last week
- A tiny structure of PyTorch for learning☆60Updated last year
- Fastest bloom filter in C++/Go/Rust/Java/C#☆109Updated 7 months ago
- JittorGeometric is a Jittor-based graph machine learning library.☆395Updated 2 months ago
- A toolkit that enhances PyTorch with specialized functions for low-bit quantized neural networks.☆195Updated last year
- 🧠 Prometheus: A Knowledge-Graph-Driven 🤖 AI Agent that maps 🗺, understands 🧩, and repairs 🛠 complex codebases — not by guessing, but…☆403Updated last week
- YiTu is an easy-to-use runtime to fully exploit the hybrid parallelism of different hardware (e.g., GPU) to efficiently support the exec…☆254Updated 6 months ago
- The Next-Gen Database for AI: an infrastructure designed for data and AI, positioned as the MySQL of the AI era.☆108Updated this week
- ☆23Updated last year
- [NeurIPS'25] KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems☆95Updated 3 weeks ago
- Unified KV Cache Compression Methods for Auto-Regressive Models☆1,277Updated 10 months ago
- An acceleration library that supports arbitrary bit-width combinatorial quantization operations☆238Updated last year
- Code Efficiency Benchmark☆85Updated 6 months ago
- Remote IDA Call, a Python package that allows you to call IDA functions from a remote process.☆118Updated last month
- ☆166Updated this week
- ☆117Updated this week
- [NeurIPS 2025] Accelerating Parallel Diffusion Model Serving with Residual Compression☆39Updated last month
- Step-by-step optimization of TPU MatMul Kernels☆85Updated 3 months ago
- DrCCTProf is a fine-grained call path profiling framework for binaries running on ARM and X86 architectures.☆122Updated 2 years ago
- Repo for paper *Measuring and Augmenting Large Language Models for Solving Capture-the-Flag Challenges*☆285Updated 4 months ago
- ☆171Updated this week
- TVM Documentation in Chinese Simplified / TVM 中文文档☆2,653Updated 2 weeks ago
- A reading list for MLSecOps!☆142Updated 8 months ago
- MTLA: Multi-head Temporal Latent Attention☆758Updated last month