Red-Hat-AI-Innovation-Team / async-grpo
☆37 Updated 6 months ago
Alternatives and similar repositories for async-grpo
Users who are interested in async-grpo are comparing it to the libraries listed below.
- Docker image NVIDIA GH200 machines - optimized for vllm serving and hf trainer finetuning ☆52 Updated 10 months ago
- ☆213 Updated last month
- Training API and CLI ☆305 Updated 3 weeks ago
- WIP ☆93 Updated last year
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆132 Updated last year
- 🧱 Modula software package ☆322 Updated 4 months ago
- seqax = sequence modeling + JAX ☆169 Updated 5 months ago
- A Python library for inference-time scaling LLMs ☆28 Updated this week
- ☆92 Updated last year
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆131 Updated last year
- Dion optimizer algorithm ☆413 Updated last week
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆181 Updated 6 months ago
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024) ☆198 Updated last year
- [ICLR 2025] Code for the paper "Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning" ☆87 Updated 10 months ago
- This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning" ☆284 Updated last month
- Learn online intrinsic rewards from LLM feedback ☆45 Updated last year
- ☆79 Updated last month
- A simple library for scaling up JAX programs ☆144 Updated 2 months ago
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆125 Updated 3 months ago
- MoE training for Me and You and maybe other people ☆315 Updated last week
- ☆213 Updated 4 months ago
- Attention Kernels for Symmetric Power Transformers ☆128 Updated 3 months ago
- Physics of Language Models, Part 4 ☆291 Updated last week
- ☆945 Updated 2 months ago
- Memory Mosaics are networks of associative memories working in concert to achieve a prediction task. ☆57 Updated 11 months ago
- Normalized Transformer (nGPT) ☆195 Updated last year
- ☆235 Updated last week
- Experiment of using Tangent to autodiff triton ☆81 Updated last year
- Official Repo for InSTA: Towards Internet-Scale Training For Agents ☆55 Updated 6 months ago
- Open-source framework for the research and development of foundation models. ☆707 Updated this week