groq / mlagility
Machine Learning Agility (MLAgility) benchmark and benchmarking tools
☆39Updated last month
Alternatives and similar repositories for mlagility
Users who are interested in mlagility are comparing it to the libraries listed below.
- An experimental CPU backend for Triton (https://github.com/openai/triton)☆43Updated 3 months ago
- Explore training for quantized models☆18Updated this week
- 🏙 Interactive performance profiling and debugging tool for PyTorch neural networks.☆61Updated 5 months ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs☆87Updated this week
- Memory Optimizations for Deep Learning (ICML 2023)☆64Updated last year
- ☆72Updated 3 months ago
- High-Performance SGEMM on CUDA devices☆95Updated 5 months ago
- Cray-LM unified training and inference stack.☆22Updated 4 months ago
- ☆12Updated 3 years ago
- ☆50Updated last year
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note…☆61Updated last week
- ☆68Updated this week
- ☆21Updated 3 months ago
- LLM training in simple, raw C/CUDA☆99Updated last year
- Repo hosting codes and materials related to speeding LLMs' inference using token merging.☆36Updated last year
- Benchmarks to capture important workloads.☆31Updated 4 months ago
- Intel Gaudi's Megatron DeepSpeed Large Language Models for training☆13Updated 6 months ago
- python package of rocm-smi-lib☆21Updated 9 months ago
- Notes and artifacts from the ONNX steering committee☆26Updated 2 weeks ago
- Write a fast kernel and run it on Discord. See how you compare against the best!☆46Updated this week
- MLPerf™ logging library☆36Updated 2 months ago
- Example ML projects that use the Determined library.☆32Updated 9 months ago
- PB-LLM: Partially Binarized Large Language Models☆152Updated last year
- Open Source Projects from Pallas Lab☆20Updated 3 years ago
- GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho…☆107Updated last month
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization☆109Updated 8 months ago
- ML/DL Math and Method notes☆61Updated last year
- Repository for CPU Kernel Generation for LLM Inference☆26Updated last year
- ☆45Updated last year
- Test suite for probing the numerical behavior of NVIDIA tensor cores☆40Updated 11 months ago