GoodStartLabs / AI_Diplomacy
Frontier Models playing the board game Diplomacy.
☆584 Updated last month
Alternatives and similar repositories for AI_Diplomacy
Users who are interested in AI_Diplomacy are comparing it to the libraries listed below.
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆702 Updated this week
- Testing baseline LLM performance across various models ☆313 Updated this week
- Async RL Training at Scale ☆690 Updated this week
- Aidan Bench attempts to measure <big_model_smell> in LLMs. ☆311 Updated 3 months ago
- ☆166 Updated 9 months ago
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆449 Updated last year
- ☆469 Updated 3 months ago
- [NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards ☆1,168 Updated last week
- Provider-agnostic, open-source evaluation infrastructure for language models ☆587 Updated this week
- ☆999 Updated this week
- The Prime Intellect CLI provides a powerful command-line interface for managing GPU resources across various providers ☆97 Updated this week
- Build your own visual reasoning model ☆412 Updated last month
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆818 Updated 2 months ago
- ComplexTensor: Machine Learning By Bridging Classical and Quantum Computation ☆77 Updated 10 months ago
- Training-Ready RL Environments + Evals ☆121 Updated this week
- Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input ☆881 Updated 4 months ago
- The history files when recording human interaction while solving ARC tasks ☆116 Updated last week
- ☆508 Updated 2 months ago
- ☆475 Updated 2 months ago
- Post-training with Tinker ☆912 Updated this week
- System 2 Reasoning Link Collection ☆855 Updated 6 months ago
- ShellSage saves sysadmins’ sanity by solving shell script snafus super swiftly ☆366 Updated this week
- Build hours code to share. ☆566 Updated last month
- ☆92 Updated last year
- gpt-2 from scratch in mlx ☆399 Updated last year
- Inference-time scaling for LLMs-as-a-judge. ☆300 Updated last week
- ☆478 Updated 4 months ago
- The Autograd Engine ☆636 Updated last year
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆324 Updated 11 months ago
- Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents ☆1,684 Updated 2 months ago