tensorzero / llmgym
☆29 Updated 3 weeks ago
Alternatives and similar repositories for llmgym
Users who are interested in llmgym are comparing it to the libraries listed below.
- Lightly-reviewed collection of community environments ☆210 Updated last week
- ☆237 Updated last month
- Meta Agents Research Environments is a comprehensive platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike stat… ☆418 Updated 2 weeks ago
- ☆217 Updated last week
- Storing long contexts in tiny caches with self-study ☆233 Updated 2 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆189 Updated 11 months ago
- Training API and CLI ☆325 Updated last week
- ☆118 Updated 2 weeks ago
- Vivaria is METR's tool for running evaluations and conducting agent elicitation research. ☆133 Updated last week
- Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more. ☆424 Updated last week
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆261 Updated this week
- Library for text-to-text regression, applicable to any input string representation and allows pretraining and fine-tuning over multiple r… ☆313 Updated this week
- ☆67 Updated 8 months ago
- ☆133 Updated 3 months ago
- rl from zero pretrain, can it be done? yes. ☆286 Updated 4 months ago
- Inference-time scaling for LLMs-as-a-judge. ☆327 Updated 3 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆88 Updated 10 months ago
- ☆137 Updated 10 months ago
- Harbor is a framework for running agent evaluations and creating and using RL environments. ☆542 Updated this week
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ☆361 Updated this week
- ☆59 Updated last year
- An interface library for RL post training with environments. ☆1,112 Updated this week
- ☆33 Updated 8 months ago
- Open source interpretability artefacts for R1. ☆170 Updated 9 months ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆307 Updated 2 months ago
- Ludic – an LLM-RL library for the era of experience ☆57 Updated 3 weeks ago
- j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models. ☆102 Updated 6 months ago
- Prompts used in the Automated Auditing Blog Post ☆137 Updated 6 months ago
- Source code for the collaborative reasoner research project at Meta FAIR. ☆112 Updated 9 months ago
- PyTorch-native post-training at scale ☆613 Updated this week