UofT-EcoSystem / hotline
☆31Updated last year
Related projects
Alternatives and complementary repositories for hotline
- ☆72Updated last year
- nnScaler: Compiling DNN models for Parallel Training☆62Updated 2 weeks ago
- ☆65Updated 3 years ago
- An Efficient Pipelined Data Parallel Approach for Training Large Model☆70Updated 3 years ago
- Artifact of OSDI '24 paper, “Llumnix: Dynamic Scheduling for Large Language Model Serving”☆56Updated 5 months ago
- FTPipe and related pipeline model parallelism research.☆41Updated last year
- System for automated integration of deep learning backends.☆48Updated 2 years ago
- PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections☆114Updated 2 years ago
- High performance NCCL plugin for Bagua.☆15Updated 3 years ago
- Paella: Low-latency Model Serving with Virtualized GPU Scheduling☆57Updated 6 months ago
- An extension of TVMScript to write simple and high-performance GPU kernels with tensorcore.☆49Updated 3 months ago
- ☆11Updated last year
- Automated Parallelization System and Infrastructure for Multiple Ecosystems☆75Updated this week
- Boost hardware utilization for ML training workloads via Inter-model Horizontal Fusion☆32Updated 5 months ago
- ☆21Updated last year
- SOTA Learning-augmented Systems☆32Updated 2 years ago
- ☆8Updated last year
- DietCode Code Release☆61Updated 2 years ago
- Analysis for the traces from byteprofile☆29Updated 11 months ago
- play gemm with tvm☆84Updated last year
- ☆23Updated last year
- AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)☆79Updated last year
- A GPU-accelerated DNN inference serving system that supports instant kernel preemption and biased concurrent execution in GPU scheduling.☆39Updated 2 years ago
- TiledCUDA is a highly efficient kernel template library designed to elevate CUDA C’s level of abstraction for processing tiles.☆148Updated this week
- ☆22Updated 2 years ago
- PyTorch distributed training acceleration framework☆32Updated this week
- ☆33Updated 2 months ago
- HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. The key capability of…☆133Updated 2 months ago
- Compiler for Dynamic Neural Networks☆43Updated 11 months ago
- ASPLOS'24: Optimal Kernel Orchestration for Tensor Programs with Korch☆29Updated 3 months ago