☆26Aug 19, 2022Updated 3 years ago
Alternatives and similar repositories for HUVM
Users who are interested in HUVM are comparing it to the libraries listed below
- ☆33Sep 9, 2020Updated 5 years ago
- PyTorch-UVM on super-large language models.☆17Dec 21, 2020Updated 5 years ago
- Official code repository for "CoVA: Exploiting Compressed-Domain Analysis to Accelerate Video Analytics [USENIX ATC 22]"☆18Sep 19, 2024Updated last year
- [IEEE CAL 2025] Accelerating Page Migrations in Operating Systems with Intel DSA☆16Nov 20, 2024Updated last year
- Tensors and Dynamic neural networks in Python with strong GPU acceleration☆15Dec 21, 2020Updated 5 years ago
- [USENIX ATC 2021] Exploring the Design Space of Page Management for Multi-Tiered Memory Systems☆48Mar 31, 2022Updated 3 years ago
- ☆80Nov 16, 2020Updated 5 years ago
- [ACM EuroSys 2023] Fast and Efficient Model Serving Using Multi-GPUs with Direct-Host-Access☆56Aug 6, 2025Updated 7 months ago
- Boosting GPU utilization for LLM serving via dynamic spatial-temporal prefill & decode orchestration☆36Jan 8, 2026Updated 2 months ago
- Secure Inference Resilient Against Malicious Clients☆14May 3, 2022Updated 3 years ago
- ☆13Oct 6, 2024Updated last year
- GVProf: A Value Profiler for GPU-based Clusters☆53Mar 24, 2024Updated last year
- GPU Performance Advisor☆66Jul 25, 2022Updated 3 years ago
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches☆15Jun 21, 2019Updated 6 years ago
- ☆16Feb 27, 2022Updated 4 years ago
- ☆18May 8, 2021Updated 4 years ago
- ☆14Jan 12, 2022Updated 4 years ago
- This serves as a repository for reproducibility of the SC21 paper "In-Depth Analyses of Unified Virtual Memory System for GPU Accelerated…☆39Sep 25, 2023Updated 2 years ago
- Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling☆12Mar 7, 2024Updated 2 years ago
- ☆36Jun 10, 2024Updated last year
- ☆41Sep 19, 2023Updated 2 years ago
- Enabling pure data parallel training of DLRM via caching and prefetching☆17Oct 29, 2021Updated 4 years ago
- ☆19Aug 26, 2021Updated 4 years ago
- ☆38Jun 27, 2025Updated 8 months ago
- LineFS: Efficient SmartNIC Offload of a Distributed File System with Pipeline Parallelism☆89Dec 24, 2021Updated 4 years ago
- [DATE'23] The official code for paper <CLAP: Locality Aware and Parallel Triangle Counting with Content Addressable Memory>☆23Jan 19, 2026Updated last month
- CasHMC: A Cycle-accurate Simulator for Hybrid Memory Cube☆23Aug 10, 2018Updated 7 years ago
- CUDAAdvisor: a GPU profiling tool☆52Aug 24, 2018Updated 7 years ago
- Johnny Cache: the End of DRAM Cache Conflicts (in Tiered Main Memory Systems)☆20Aug 2, 2023Updated 2 years ago
- ngAP's artifact for ASPLOS'24☆25Jul 29, 2025Updated 7 months ago
- Fine-grained GPU sharing primitives☆148Jul 28, 2025Updated 7 months ago
- Source code of the simulator used in the Mosaic paper from MICRO 2017: "Mosaic: A GPU Memory Manager with Application-Transparent Support…☆50Aug 21, 2018Updated 7 years ago
- Pond: CXL-Based Memory Pooling Systems for Cloud Platforms (ASPLOS'23)☆220Oct 13, 2024Updated last year
- Arbitrary offloads for RDMA NICs☆99Apr 25, 2022Updated 3 years ago
- Cheddar: A Swift Fully Homomorphic Encryption (FHE) GPU Library☆48Jan 14, 2026Updated last month
- rFaaS: a high-performance FaaS platform with RDMA acceleration for low-latency invocations.☆58Jul 7, 2025Updated 8 months ago
- ☆31May 31, 2023Updated 2 years ago
- Splits a single Nvidia GPU into multiple partitions with complete compute and memory isolation (with respect to performance) between the partitions☆164Apr 21, 2019Updated 6 years ago
- ☆37Sep 3, 2025Updated 6 months ago