noambrown / poker_solver
A no-limit Texas hold'em river solver using CFR variants
☆120 · Updated 3 weeks ago
Alternatives and similar repositories for poker_solver
Users who are interested in poker_solver are comparing it to the repositories listed below.
- Official CLI and Python SDK for Prime Intellect - access GPU compute, remote sandboxes, RL environments, and distributed training infrast… ☆143 · Updated this week
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? ☆249 · Updated 2 weeks ago
- ☆313 · Updated last month
- look how they massacred my boy ☆63 · Updated last year
- Digital Red Queen: Adversarial Program Evolution in Core War with LLMs ☆158 · Updated 2 weeks ago
- explore token trajectory trees on instruct and base models ☆150 · Updated 7 months ago
- Streamline on-policy/off-policy distillation workflows in a few lines of code ☆93 · Updated this week
- ☆159 · Updated last month
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆59 · Updated 3 months ago
- EXO Gym is an open-source Python toolkit that facilitates distributed AI research. ☆94 · Updated last month
- j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models. ☆101 · Updated 6 months ago
- Claude Deep Research config for Claude Code. ☆225 · Updated 10 months ago
- ☆68 · Updated 8 months ago
- Pivotal Token Search ☆142 · Updated last month
- A simple MLX implementation for pretraining LLMs on Apple Silicon. ☆85 · Updated 5 months ago
- Storing long contexts in tiny caches with self-study ☆231 · Updated last month
- The State Of The Art, intelligence ☆157 · Updated 5 months ago
- A graph visualization of attention ☆57 · Updated 8 months ago
- Curated collection of community environments ☆205 · Updated last week
- Data recipes and robust infrastructure for training AI agents ☆84 · Updated this week
- Simple & Scalable Pretraining for Neural Architecture Research ☆306 · Updated last month
- rl from zero pretrain, can it be done? yes. ☆286 · Updated 3 months ago
- Ludic – an LLM-RL library for the era of experience ☆54 · Updated 2 weeks ago
- Serverless Posttraining ☆68 · Updated this week
- Marketplace ML experiment - training without backprop ☆27 · Updated 4 months ago
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆127 · Updated 3 months ago
- ☆482 · Updated 6 months ago
- ☆73 · Updated 3 weeks ago
- Lego for GRPO ☆30 · Updated 8 months ago
- A framework for optimizing DSPy programs with RL ☆305 · Updated 2 weeks ago