EveryInc / AI_Diplomacy

Frontier Models playing the board game Diplomacy.

☆522

Alternatives and similar repositories for AI_Diplomacy

Users who are interested in AI_Diplomacy are comparing it to the libraries listed below.

NousResearch / atropos
Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse …
☆547 Updated this week
arcprize / arc-agi-benchmarking
Testing baseline LLMs performance across various models
☆288 Updated last week
aidanmclaughlin / AidanBench
Aidan Bench attempts to measure <big_model_smell> in LLMs.
☆306 Updated 3 weeks ago
NousResearch / Open-Reasoning-Tasks
A comprehensive repository of reasoning tasks for LLMs (and beyond)
☆447 Updated 9 months ago
NeoVertex1 / ComplexTensor
ComplexTensor: Machine Learning By Bridging Classical and Quantum Computation
☆76 Updated 8 months ago
open-thought / reasoning-gym
procedural reasoning datasets
☆960 Updated 2 weeks ago
anthropic-experimental / agentic-misalignment
☆319 Updated last month
SakanaAI / evo-memory
Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers.
☆316 Updated 9 months ago
SakanaAI / self-adaptive-llms
A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time!
☆1,126 Updated 5 months ago
open-thought / system-2-research
System 2 Reasoning Link Collection
☆844 Updated 4 months ago
SakanaAI / text-to-lora
Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input
☆807 Updated last month
adobe-research / dynasaur
Official repository for "DynaSaur: Large Language Agents Beyond Predefined Actions"
☆346 Updated 7 months ago
NousResearch / DisTrO
Distributed Training Over-The-Internet
☆946 Updated 2 months ago
EleutherAI / cookbook
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
☆808 Updated this week
nuwandavek / karpathify
☆92 Updated 9 months ago
pranavjad / mlx-gpt2
gpt-2 from scratch in mlx
☆391 Updated last year
xjdr-alt / entropix-local
smol models are fun too
☆93 Updated 8 months ago
haizelabs / verdict
Inference-time scaling for LLMs-as-a-judge.
☆251 Updated last week
simple-bench / SimpleBench
☆135 Updated 7 months ago
arcprize / ARC-AGI-2
☆377 Updated last month
neoneye / ARC-Interactive-History-Dataset
The history files when recording human interaction while solving ARC tasks
☆113 Updated this week
MarioSieg / magnetron
(WIP) A small but powerful, homemade PyTorch from scratch.
☆555 Updated 2 weeks ago
SakanaAI / treequest
A Tree Search Library with Flexible API for LLM Inference-Time Scaling
☆404 Updated last week
doomslide / hyperobject
Plotting (entropy, varentropy) for small LMs
☆97 Updated 2 months ago
yoheinakajima / babyagi-2o
the simplest self-building general autonomous agent
☆315 Updated 9 months ago
willccbb / verifiers
Verifiers for LLM Reinforcement Learning
☆1,543 Updated this week
PsycheFoundation / psyche
An open infrastructure to democratize and decentralize the development of superintelligence for humanity.
☆429 Updated this week
willccbb / claude-deep-research
Claude Deep Research config for Claude Code.
☆196 Updated 4 months ago
PrimeIntellect-ai / prime
prime is a framework for efficient, globally distributed training of AI models over the internet.
☆781 Updated 2 months ago
groundlight / r1_vlm
Build your own visual reasoning model
☆395 Updated last week