PrimeIntellect-ai / prime-environments

Curated collection of community environments

☆204

Alternatives and similar repositories for prime-environments

Users who are interested in prime-environments are comparing it to the libraries listed below.

tokenbender / avataRL
rl from zero pretrain, can it be done? yes.
☆286Updated 3 months ago
goodfire-ai / r1-interpretability
Open source interpretability artefacts for R1.
☆167Updated 9 months ago
TextArena / UnstableBaselines
☆116Updated last week
PrimeIntellect-ai / genesys
☆136Updated 10 months ago
HazyResearch / cartridges
Storing long contexts in tiny caches with self-study
☆231Updated last month
casper-hansen / OpenCoconut
OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.
☆175Updated last year
OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆107Updated 10 months ago
PrimeIntellect-ai / prime
Official CLI and Python SDK for Prime Intellect - access GPU compute, remote sandboxes, RL environments, and distributed training infrast…
☆138Updated this week
PrimeIntellect-ai / prime-rl
Async RL Training at Scale
☆1,005Updated this week
ScalingIntelligence / Archon
Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.
☆190Updated 10 months ago
laude-institute / harbor
Harbor is a framework for running agent evaluations and creating and using RL environments.
☆438Updated this week
haizelabs / verdict
Inference-time scaling for LLMs-as-a-judge.
☆325Updated 2 months ago
google-deepmind / latent-multi-hop-reasoning
[ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?
☆88Updated 10 months ago
haizelabs / Awesome-LLM-Judges
⚖️ Awesome LLM Judges ⚖️
☆148Updated 8 months ago
Danau5tin / calculator_agent_rl
Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a 62% absolute increase in evaluation accuracy.
☆65Updated 8 months ago
google-deepmind / mishax
☆151Updated 4 months ago
TextArena / TextArena
A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning
☆342Updated 2 weeks ago
pyember / ember
☆237Updated 2 weeks ago
nano-R1 / resources
Compiling useful links, papers, benchmarks, ideas, etc.
☆46Updated 10 months ago
microsoft / ArchScale
Simple & Scalable Pretraining for Neural Architecture Research
☆306Updated last month
brendanhogan / picoDeepResearch
☆68Updated 7 months ago
NousResearch / atropos
Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse …
☆831Updated this week
VsonicV / es-fine-tuning-paper
This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning"
☆285Updated last month
thinking-machines-lab / tinker
Training API and CLI
☆318Updated this week
haizelabs / j1-micro
j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models.
☆101Updated 6 months ago
brendanhogan / DeepSeekRL-Extended
Exploring Applications of GRPO
☆252Updated 4 months ago
METR / RE-Bench
☆130Updated 3 months ago
facebookresearch / matrix
Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…
☆259Updated this week
OSU-NLP-Group / GrokkedTransformer
Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'
☆234Updated 6 months ago
JoeLi12345 / nGPT
an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆109Updated 10 months ago