openai / preparedness
Releases from OpenAI Preparedness
☆736 Updated this week
Alternatives and similar repositories for preparedness
Users who are interested in preparedness are comparing it to the repositories listed below.
- Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution" ☆517 Updated 2 months ago
- Dream 7B, a large diffusion language model ☆630 Updated 2 weeks ago
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering ☆703 Updated last week
- Atom of Thoughts for Markov LLM Test-Time Scaling ☆563 Updated this week
- MLGym A New Framework and Benchmark for Advancing AI Research Agents ☆492 Updated this week
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025] ☆455 Updated last week
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,109 Updated 3 months ago
- An agent benchmark with tasks in a simulated software company. ☆350 Updated this week
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆327 Updated 5 months ago
- Verifiers for LLM Reinforcement Learning ☆953 Updated this week
- ☆527 Updated last month
- Understanding R1-Zero-Like Training: A Critical Perspective ☆925 Updated last month
- A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time! ☆1,058 Updated 3 months ago
- ⚖️ The First Coding Agent-as-a-Judge ☆484 Updated this week
- ReCall: Learning to Reason with Tool Call for LLMs via Reinforcement Learning ☆847 Updated 2 weeks ago
- Pretraining code for a large-scale depth-recurrent language model ☆760 Updated last month
- The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search ☆1,128 Updated last week
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆357 Updated this week
- This repo contains the dataset and code for the paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software E… ☆1,372 Updated last month
- TTRL: Test-Time Reinforcement Learning ☆488 Updated 2 weeks ago
- CodeScientist: An automated scientific discovery system for code-based experiments ☆248 Updated last month
- Search-o1: Agentic Search-Enhanced Large Reasoning Models ☆863 Updated this week
- procedural reasoning datasets ☆580 Updated this week
- 🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc. ☆352 Updated last week
- Testing baseline LLMs performance across various models ☆260 Updated last week
- RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments. ☆1,813 Updated this week
- Seed-Coder is a family of open-source code LLMs comprising base, instruct and reasoning models of 8B size, developed by ByteDance Seed. ☆183 Updated last week
- SkyRL-v0: Train Real-World Long-Horizon Agents via Reinforcement Learning ☆261 Updated this week
- [ICML 2025 Spotlight] CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction ☆523 Updated last week
- LIMO: Less is More for Reasoning ☆940 Updated last month