meta-pytorch / OpenEnv
An interface library for RL post training with environments.
☆66 Updated this week
Alternatives and similar repositories for OpenEnv
Users who are interested in OpenEnv are comparing it to the libraries listed below.
- Async RL Training at Scale ☆722 Updated this week
- ☆229 Updated 4 months ago
- Training-Ready RL Environments + Evals ☆132 Updated this week
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆726 Updated this week
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆463 Updated 2 months ago
- rl from zero pretrain, can it be done? yes. ☆277 Updated 3 weeks ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆297 Updated 2 months ago
- FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference. ☆296 Updated 2 months ago
- Library for text-to-text regression, applicable to any input string representation and allows pretraining and fine-tuning over multiple r… ☆277 Updated this week
- Post-training with Tinker ☆1,096 Updated this week
- This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning" ☆209 Updated this week
- Train your own SOTA deductive reasoning model ☆108 Updated 7 months ago
- PyTorch-native post-training at scale ☆83 Updated last week
- MLGym: A New Framework and Benchmark for Advancing AI Research Agents ☆560 Updated 2 months ago
- An extension of the nanoGPT repository for training small MOE models. ☆202 Updated 7 months ago
- ☆105 Updated this week
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆187 Updated 7 months ago
- ☆135 Updated 7 months ago
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆406 Updated 3 weeks ago
- ☆843 Updated last week
- Tina: Tiny Reasoning Models via LoRA ☆299 Updated last month
- RLP: Reinforcement as a Pretraining Objective ☆192 Updated 2 weeks ago
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ☆260 Updated this week
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆344 Updated 5 months ago
- Open source interpretability artefacts for R1. ☆163 Updated 6 months ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆342 Updated 10 months ago
- ☆222 Updated 3 weeks ago
- ☆107 Updated last month
- Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling. ☆346 Updated 4 months ago
- Super basic implementation (gist-like) of RLMs with REPL environments. ☆204 Updated last week