SlugLab / CXLMemSim
CXLMemSim: A pure software simulated CXL.mem for performance characterization
☆599Updated this week
Alternatives and similar repositories for CXLMemSim
Users who are interested in CXLMemSim are comparing it to the libraries listed below.
- CXL remote offloading data movement aware compiler☆72Updated last month
- Extending eBPF Programmability and Observability to GPUs (merged into https://github.com/eunomia-bpf/bpftime)☆290Updated 2 months ago
- Some Hardware Architectures for GEMM☆287Updated 8 months ago
- UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g…☆1,195Updated this week
- ☆140Updated 6 months ago
- ☆24Updated last year
- Official implementation of "REASONING COMPILER: LLM-Guided Optimizations for Efficient Model Serving" (NeurIPS 2025)☆99Updated 2 months ago
- Heterogeneous Containerization of Agents☆109Updated 6 months ago
- Hybrid-tier key-value storage engine built on object storage & local SSDs. Engineered for batch-write efficiency and read optimization wi…☆238Updated last week
- [NeurIPS 2025] R-KV: Redundancy-aware KV Cache Compression for Reasoning Models☆1,174Updated 3 months ago
- A distributed framework for LLM agents☆447Updated 3 weeks ago
- [NeurIPS'25] KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems☆127Updated 3 months ago
- YiRage (Yield Revolutionary AGile Engine) - Multi-Backend LLM Inference Optimization. Extends Mirage with comprehensive support for CUDA,…☆37Updated last week
- YiTu is an easy-to-use runtime to fully exploit the hybrid parallelism of different hardware (e.g., GPU) to efficiently support the exec…☆254Updated 3 weeks ago
- High Performance Distributed Database with MySQL Compatible API, Great Scalability, Full ACID Distributed Transactions, and Tiered S3 Sto…☆448Updated last week
- Source code for our paper: GPU-Accelerated Batch-Dynamic Subgraph Matching☆13Updated 2 years ago
- LLM Serving simulation for multi-core NPU☆113Updated last month
- Crypto DRL trading DEMO☆29Updated 2 years ago
- 2025 Huawei Software Elite Challenge: Best Large Model Application Award, Grand Finals☆38Updated 9 months ago
- The Next-Gen Database for AI—an infrastructure designed for data and AI. As the MySQL of the AI era.☆160Updated last week
- Step-by-step optimization of TPU MatMul Kernels☆85Updated 6 months ago
- ☆260Updated this week
- Source code for the paper "A Lightweight Framework for Fast Trajectory Simplification".☆67Updated 10 months ago
- JittorGeometric is a Jittor-based graph machine learning library.☆603Updated 5 months ago
- [NeurIPS 2025] Accelerating Parallel Diffusion Model Serving with Residual Compression☆40Updated 3 months ago
- Code Efficiency Benchmark☆86Updated 9 months ago
- TVM Documentation in Simplified Chinese (TVM 中文文档)☆3,185Updated 2 months ago
- A tiny structure of PyTorch for learning☆61Updated last year
- rCore-Tutorial without branches☆30Updated last year
- A Deep Interest Network (DIN) implementation for MIND News Recommendation with BERT semantic warm-up.☆36Updated 2 months ago