Where GPUs get cooked 👩‍🍳🔥
★395 · May 26, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for gpu-fryer
Users that are interested in gpu-fryer are comparing it to the libraries listed below.
- Minimalistic large language model 3D-parallelism training · ★2,711 · May 26, 2026 · Updated 2 weeks ago
- Minimalistic 4D-parallelism distributed training framework for education purpose · ★2,216 · Aug 26, 2025 · Updated 9 months ago
- ★18 · Dec 2, 2024 · Updated last year
- Efficient Triton Kernels for LLM Training · ★6,415 · Updated this week
- Save, load, host, and share AI model checkpoints without slowing down training. Host on Lightning AI or your own cloud with enterprise-gr… · ★40 · Jun 1, 2026 · Updated last week
- Multi-GPU CUDA stress test · ★2,217 · May 31, 2026 · Updated last week
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends · ★2,437 · May 29, 2026 · Updated last week
- FM-Leaderboard-er allows you to create leaderboard to find the best LLM/prompt for your own business use case based on your data, task, p… · ★19 · Oct 31, 2024 · Updated last year
- Cray-LM unified training and inference stack. · ★22 · Jan 30, 2025 · Updated last year
- A lattice QCD library. · ★17 · Updated this week
- A Datacenter Scale Distributed Inference Serving Framework · ★7,200 · Updated this week
- Minimal implementation of multiple PEFT methods for LLaMA fine-tuning · ★13 · May 7, 2023 · Updated 3 years ago
- Collection of best practices, reference architectures, model training examples and utilities to train large models on AWS. · ★437 · Updated this week
- Pragmatic approach to parsing import profiles for CI's · ★12 · Jul 1, 2024 · Updated last year
- Parseit - Parseit is command line tool to parse data using EBNF or ABNF using the excellent Instaparse library, and serializing the resul… · ★16 · Dec 5, 2022 · Updated 3 years ago
- JAX Scalify: end-to-end scaled arithmetics · ★18 · Oct 30, 2024 · Updated last year
- Implementation for robust ViT and scaled attention · ★21 · Apr 4, 2025 · Updated last year
- Inference server benchmarking tool · ★160 · May 26, 2026 · Updated 2 weeks ago
- Recipes to scale inference-time compute of open models · ★1,132 · May 26, 2026 · Updated 2 weeks ago
- Build compute kernels and load them from the Hub. · ★676 · Updated this week
- This repository contains the results and code for the MLPerf™ Training v4.0 benchmark. · ★12 · Jun 11, 2024 · Updated last year
- PCCL (Prime Collective Communications Library) implements fault tolerant collective communications over IP · ★153 · Sep 12, 2025 · Updated 8 months ago
- Tile primitives for speedy kernels · ★3,405 · May 27, 2026 · Updated last week
- ★28 · Jun 2, 2026 · Updated last week
- MoE training for Me and You and maybe other people · ★386 · Mar 15, 2026 · Updated 2 months ago
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models. · ★94 · May 28, 2026 · Updated last week
- ★54 · Sep 26, 2025 · Updated 8 months ago
- Simple MPI implementation for prototyping or learning · ★316 · Aug 6, 2025 · Updated 10 months ago
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) · ★506 · Updated this week
- A high-performance distributed file system designed to address the challenges of AI training and inference workloads. · ★9,952 · May 7, 2026 · Updated last month
- Collection of small examples for running on ALCF resources · ★23 · May 29, 2026 · Updated last week
- CUDA checkpoint and restore utility · ★459 · Sep 15, 2025 · Updated 8 months ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. · ★42 · Apr 4, 2025 · Updated last year
- Speed up model training by fixing data loading. · ★598 · Jun 2, 2026 · Updated last week
- A peer to peer machine intelligence benchmark · ★31 · Mar 24, 2023 · Updated 3 years ago
- Everything about the SmolLM and SmolVLM family of models · ★3,805 · May 26, 2026 · Updated 2 weeks ago
- ★29 · May 26, 2026 · Updated 2 weeks ago
- [NeurIPS 2025] Official Pytorch Implementation of "The Curse of Depth in Large Language Models" by Wenfang Sun, Xinyuan Song, Pengxiang L… · ★71 · Mar 3, 2026 · Updated 3 months ago
- Easily run PyTorch on multiple GPUs & machines · ★60 · May 2, 2026 · Updated last month