casys-kaist / DaCapo
☆17 Updated 6 months ago
Alternatives and similar repositories for DaCapo:
Users who are interested in DaCapo are comparing it to the repositories listed below
- ☆66 Updated last month
- ☆101 Updated last year
- LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale ☆112 Updated 2 months ago
- ☆52 Updated 5 months ago
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs ☆106 Updated 3 weeks ago
- [DATE 2023] Pipe-BD: Pipelined Parallel Blockwise Distillation ☆11 Updated last year
- NEST Compiler ☆116 Updated 3 months ago
- Experimental deep learning framework written in Rust ☆14 Updated 2 years ago
- ☆25 Updated 2 years ago
- Official Github repository for the SIGCOMM '24 paper "Accelerating Model Training in Multi-cluster Environments with Consumer-grade GPUs" ☆71 Updated 9 months ago
- ☆24 Updated last year
- A version of XRBench-MAESTRO used for MLSys 2023 publication ☆23 Updated last year
- [HPCA'24] Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System ☆44 Updated last year
- Study Group of Deep Learning Compiler ☆158 Updated 2 years ago
- PyTorch CoreSIG ☆55 Updated 4 months ago
- [ACM EuroSys '23] Fast and Efficient Model Serving Using Multi-GPUs with Direct-Host-Access ☆56 Updated last year
- OwLite is a low-code AI model compression toolkit for AI models. ☆43 Updated 2 months ago
- FriendliAI Model Hub ☆92 Updated 2 years ago
- Study parallel programming - CUDA, OpenMP, MPI, Pthread ☆56 Updated 2 years ago
- NeuPIMs: NPU-PIM Heterogeneous Acceleration for Batched LLM Inferencing ☆80 Updated 10 months ago
- ☆55 Updated last year
- Performant kernels for symmetric tensors ☆13 Updated 8 months ago
- ☆45 Updated last year
- ☆36 Updated 3 weeks ago
- ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference ☆113 Updated 2 months ago
- Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model… ☆60 Updated last year