☆57 Sep 18, 2025 Updated 9 months ago
Alternatives and similar repositories for CS854-F24
Users that are interested in CS854-F24 are comparing it to the libraries listed below.
- ☆13 May 30, 2024 Updated 2 years ago
- ☆91 Dec 11, 2019 Updated 6 years ago
- ☆18 Apr 21, 2024 Updated 2 years ago
- Deduplication over dis-aggregated memory for Serverless Computing ☆14 Mar 21, 2022 Updated 4 years ago
- APEX+ is an LLM Serving Simulator ☆49 Jun 16, 2025 Updated last year
- NEO is a LLM inference engine built to save the GPU memory crisis by CPU offloading ☆98 Jun 16, 2025 Updated last year
- A single-file educational implementation for understanding vLLM's core concepts and running LLM inference. ☆44 Apr 7, 2026 Updated 2 months ago
- Open-source implementation for "Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow" ☆93 Oct 15, 2025 Updated 8 months ago
- ☆22 Apr 2, 2023 Updated 3 years ago
- Tempo is a system for declarative, efficient, end-to-end compiled dynamic deep learning ☆29 Oct 21, 2025 Updated 8 months ago
- ☆24 Apr 28, 2024 Updated 2 years ago
- WHISPER is a comprehensive benchmark suite for emerging persistent memory technologies. ☆10 May 10, 2017 Updated 9 years ago
- Since the emergence of chatGPT in 2022, the acceleration of Large Language Model has become increasingly important. Here is a list of pap… ☆284 Mar 6, 2025 Updated last year
- DeeperGEMM: crazy optimized version ☆86 May 5, 2025 Updated last year
- This is the open-source site for XFDetector (ASPLOS'20) ☆11 Mar 5, 2021 Updated 5 years ago
- A record of reading list on some MLsys popular topic ☆25 Mar 20, 2025 Updated last year
- [Long Term Support] [SIGCOMM 2023] Lightning: A Reconfigurable Photonic-Electronic SmartNIC for Fast and Energy-Efficient Inference ☆22 Sep 20, 2024 Updated last year
- KFunca: A minimalist, high-performance GPU-based automatic differentiation framework ☆31 Aug 14, 2025 Updated 10 months ago
- Arbitrary offloads for RDMA NICs ☆100 Apr 25, 2022 Updated 4 years ago
- ☆152 Jun 17, 2026 Updated last week
- SJTU CS473 Project: Implementation of Deep Closest Point in TensorFlow, and its comparison with other registration methods. ☆13 Jun 14, 2020 Updated 6 years ago
- Artifacts of EuroSys'24 paper "Exploring Performance and Cost Optimization with ASIC-Based CXL Memory" ☆31 Feb 21, 2024 Updated 2 years ago
- This is the implementation repository of our SOSP'24 paper: Aceso: Achieving Efficient Fault Tolerance in Memory-Disaggregated Key-Value … ☆24 Oct 20, 2024 Updated last year
- ☆24 Oct 31, 2023 Updated 2 years ago
- Lenovo modifications to Linux memcached for enhanced persistent memory support ☆18 Nov 4, 2021 Updated 4 years ago
- Query-Adaptive Vector Search ☆76 Mar 19, 2026 Updated 3 months ago
- ☆71 Feb 13, 2022 Updated 4 years ago
- A tiny yet powerful LLM inference system tailored for researching purpose. vLLM-equivalent performance with only 2k lines of code (2% of … ☆329 Jun 10, 2025 Updated last year
- ☆37 Updated this week
- Artifacts for ATC '22 paper "Faster Software Packet Processing on FPGA NICs with eBPF Program Warping" ☆17 May 20, 2022 Updated 4 years ago
- system paper reading notes ☆256 Sep 22, 2025 Updated 9 months ago
- ☆67 Jun 25, 2024 Updated 2 years ago
- A caching framework for microservice applications ☆24 Apr 22, 2024 Updated 2 years ago
- An Automated Performance Optimization Framework for P4-Programmable SmartNICs ☆28 Nov 18, 2023 Updated 2 years ago
- Justitia provides RDMA isolation between applications with diverse requirements. ☆43 May 25, 2022 Updated 4 years ago
- ☆1,019 Apr 24, 2026 Updated 2 months ago
- ⚡ Bring some magic to i.sjtu.edu.cn ☆22 Jan 3, 2020 Updated 6 years ago
- The simplest implementation of recent Sparse Attention patterns for efficient LLM inference. ☆92 Jul 17, 2025 Updated 11 months ago
- ☆10 Sep 15, 2023 Updated 2 years ago