groq / mlagility
Machine Learning Agility (MLAgility) benchmark and benchmarking tools
☆38 Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for mlagility
- An experimental CPU backend for Triton (https://github.com/openai/triton) ☆35 Updated 6 months ago
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆50 Updated this week
- Memory Optimizations for Deep Learning (ICML 2023) ☆60 Updated 8 months ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆89 Updated this week
- Fast Inference of MoE Models with CPU-GPU Orchestration ☆172 Updated this week
- ☆55 Updated 5 months ago
- Collection of kernels written in Triton language ☆68 Updated 3 weeks ago
- Python package of rocm-smi-lib ☆18 Updated last month
- GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho… ☆100 Updated 3 weeks ago
- Intel Gaudi's Megatron DeepSpeed Large Language Models for training ☆13 Updated last month
- Example of applying CUDA graphs to LLaMA-v2 ☆10 Updated last year
- MLPerf™ logging library ☆30 Updated this week
- Efficient, Flexible and Portable Structured Generation ☆53 Updated this week
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization ☆87 Updated last month
- Repository for CPU Kernel Generation for LLM Inference ☆25 Updated last year
- Simple and fast low-bit matmul kernels in CUDA / Triton ☆145 Updated this week
- ☆67 Updated last week
- ☆19 Updated 8 months ago
- ☆99 Updated last month
- Prototype routines for GPU quantization written using PyTorch. ☆19 Updated last week
- ☆45 Updated 2 weeks ago