EleutherAI ML Performance reading group repository (slides, meeting recordings, annotated papers)
☆31 · Mar 20, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for ml-perf-reading-group
Users that are interested in ml-perf-reading-group are comparing it to the libraries listed below.
- Qwen3-0.6B megakernel: 527 tok/s decode on RTX 3090 (3.8x faster than PyTorch) ☆86 · Feb 10, 2026 · Updated 2 months ago
- We aim to redefine Data Parallel libraries' portability, performance, programmability and maintainability, by using C++ standard features, i… ☆51 · Updated this week
- ☆29 · Apr 7, 2025 · Updated last year
- small language models training made easy ☆13 · Dec 15, 2024 · Updated last year
- Implementation of Direct Preference Optimization ☆17 · Jul 17, 2023 · Updated 2 years ago
- 🎬 3.7× faster video generation E2E 🖼️ 1.6× faster image generation E2E ⚡ ColumnSparseAttn 9.3× vs FlashAttn‑3 💨 ColumnSparseGEMM 2.5× … ☆104 · Sep 8, 2025 · Updated 7 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆13 · Updated this week
- ☆49 · Feb 23, 2025 · Updated last year
- BASALT Benchmark datasets, evaluation code and agent training example. ☆22 · Nov 29, 2023 · Updated 2 years ago
- [NOTE] I do not have enough resources to maintain VMS; please use Ostris's AI-Toolkit instead ☆43 · Oct 3, 2025 · Updated 6 months ago
- ☆17 · Feb 18, 2026 · Updated 2 months ago
- Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models) ☆34 · Feb 27, 2025 · Updated last year
- pytorch implementation of grok ☆12 · Apr 6, 2026 · Updated last week
- Exploring how optimizations for GEMMs work ☆30 · Feb 28, 2026 · Updated last month
- COCCL: Compression and precision co-aware collective communication library ☆30 · Mar 16, 2025 · Updated last year
- Mapping out the "memory" of neural nets with data attribution ☆53 · Updated this week
- Making Flux go brrr on GPUs. ☆166 · Jan 5, 2026 · Updated 3 months ago
- ☆22 · May 5, 2025 · Updated 11 months ago
- Applied AI experiments and examples for PyTorch ☆320 · Aug 22, 2025 · Updated 7 months ago
- A demo for the Direct Ascent Synthesis: Hidden Generative Capabilities in Discriminative Models paper (https://arxiv.org/abs/2502.07753) ☆41 · Mar 5, 2025 · Updated last year
- Combining Teacache with xDiT to Accelerate Visual Generation Models ☆32 · Apr 21, 2025 · Updated 11 months ago
- GitHub mirror of the triton-lang/triton repo. ☆157 · Updated this week
- GPU accelerated Perlin Noise in python ☆11 · Oct 23, 2020 · Updated 5 years ago
- Development containers for triton and triton-cpu ☆27 · Updated this week
- DigThatData's Public Brainstorming space ☆82 · Mar 10, 2026 · Updated last month
- ☆13 · Jun 7, 2023 · Updated 2 years ago
- A TUI-based utility for real-time monitoring of InfiniBand traffic and performance metrics on the local node ☆64 · Dec 19, 2025 · Updated 3 months ago
- A collection of generative and training notebooks getting mirrored to google colab. ☆12 · May 29, 2022 · Updated 3 years ago
- diffusers with search engine ☆12 · Jan 13, 2026 · Updated 3 months ago
- Collection of scripts to build small-scale datasets for fine-tuning video generation models. ☆80 · Mar 17, 2025 · Updated last year
- Samples of good AI generated CUDA kernels ☆103 · May 30, 2025 · Updated 10 months ago
- QLoRA: Efficient Finetuning of Quantized LLMs ☆11 · Jul 22, 2023 · Updated 2 years ago
- 👷 Build compute kernels ☆213 · Apr 6, 2026 · Updated last week
- API for coordinating Maintenance in Kubernetes. ☆26 · Jul 18, 2025 · Updated 9 months ago
- Cataloging released Triton kernels. ☆302 · Sep 9, 2025 · Updated 7 months ago
- MoE training for Me and You and maybe other people ☆380 · Mar 15, 2026 · Updated last month
- Optimizing diffusion for production-ready speeds ☆39 · Jan 10, 2026 · Updated 3 months ago
- ☆18 · Mar 18, 2024 · Updated 2 years ago
- H-Net Dynamic Hierarchical Architecture ☆81 · Sep 11, 2025 · Updated 7 months ago