sam-paech/diplobench

Benchmark for LLMs playing full press diplomacy

☆64

Alternatives and similar repositories for diplobench

Users who are interested in diplobench are comparing it to the repositories listed below.

prio-data / viewsforecasting
Jupyter notebooks and python scripts for performing the ViEWS monthly forecasts
☆16 · Nov 26, 2025 · Updated 7 months ago
KaiXIIM / dipllm
This is the official implementation of the paper "DipLLM: Fine-Tuning LLM for Strategic Decision-making in Diplomacy".
☆25 · Dec 19, 2025 · Updated 7 months ago
openredact / expose-text
This is a prototype of a Python module for simple modification of document files. ➡️ The project has moved to: https://gitlab.opencode.de…
☆19 · Mar 20, 2026 · Updated 4 months ago
JoshuaPurtell / LRCBench
Evals meant to evaluate language models' ability to reason over long contexts.
☆10 · Sep 12, 2024 · Updated last year
MeLeLBGU / SaGe
Code for SaGe subword tokenizer (EACL 2023)
☆28 · Nov 30, 2024 · Updated last year
ericyuegu / hal
Training AI for Super Smash Bros. Melee
☆36 · Updated this week
aisa-group / promptinject-agent-skills
Agent Skills Enable a New Class of Realistic and Trivially Simple Prompt Injections
☆21 · Jul 2, 2026 · Updated 2 weeks ago
gsbDBI / contextual_bandits_evaluation
Offline Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits
☆11 · Oct 21, 2024 · Updated last year
Mia-Cong / SWIFT
Official implementation of "Can Test-Time Scaling Improve World Foundation Model?"
☆15 · Jul 12, 2025 · Updated last year
bodo-ai / pydough-ce
PyDough text to analytics: Community Edition
☆28 · Jun 23, 2026 · Updated 3 weeks ago
devadigapratham / CoDSPy
An intelligent code optimization system leveraging AI analysis, automated refactoring, and test generation. Built with DSPy and Gradio, i…
☆19 · Feb 1, 2025 · Updated last year
JayZeeDesign / claude-code-proxy
Run Claude Code on OpenAI models
☆20 · Jul 13, 2025 · Updated last year
junk16 / spark-yarn-cluster
Apache Spark on Apache Yarn 2.6.0 cluster Docker image
☆12 · Oct 18, 2017 · Updated 8 years ago
davidberenstein1957 / dataset-viber
Dataset Viber is your chill repo for data collection, annotation and vibe checks.
☆47 · Sep 5, 2024 · Updated last year
jhbertra / purescript-reactive-effect
Higher-order FRP for PureScript
☆13 · Dec 24, 2022 · Updated 3 years ago
obra / external-subagents
Codex subagent CLI and tooling
☆24 · Dec 24, 2025 · Updated 6 months ago
the-sett / ai-search
ai-search
☆15 · May 14, 2019 · Updated 7 years ago
purescript-web / purescript-web-socket
Type definitions and low level interface implementations for the W3C WebSocket API
☆12 · Apr 27, 2022 · Updated 4 years ago
purefunctor / purescript-ssrs
Stack-safe recursion schemes on dissectible data structures.
☆13 · May 6, 2022 · Updated 4 years ago
Pleias / OCRoscope
Small python package to measure OCR quality and other related metrics.
☆26 · Feb 19, 2024 · Updated 2 years ago
Globe-Engineer / handkerchief
Globe Engineer - Handkerchief: A higher quality alternative to vector database RAG.
☆24 · Jan 4, 2024 · Updated 2 years ago
strangeloopcanon / llm-poker
Evaluating LLMs by having them play games against each other
☆23 · Sep 9, 2025 · Updated 10 months ago
rowtype-yoga / purescript-yoga-om
☆13 · Jul 14, 2026 · Updated last week
Infatoshi / craftax.c
Pure C / AVX-512 port of Craftax-Classic. 47.8M SPS on a Ryzen 9 9950X3D -- 3.2x an RTX Pro 6000 Blackwell on the same env.
☆24 · Jul 14, 2026 · Updated last week
smatting / blazing-tabs
Blazing Tabs is a browser extension that allows you to search and switch your tabs blazingly fast.
☆13 · Mar 6, 2025 · Updated last year
pandaant / poker-mcts
A NLTH Poker Agent using Monte-Carlo-Simulation
☆14 · Jun 8, 2020 · Updated 6 years ago
VictorTaelin / ab_challenge_eval
Evaluator for the A::B Prompting Challenge
☆28 · Apr 10, 2024 · Updated 2 years ago
i-am-the-slime / purescript-react-testing-library
Provides a Purescript wrapper around react-testing-library to be used with purescript-react-basic-hooks
☆15 · May 27, 2026 · Updated last month
buttonize / buttonize
🎨 All you need to hook-up UI components directly to your AWS Lambda functions. Just install Buttonize and deploy your CDK. That's it.
☆14 · May 23, 2024 · Updated 2 years ago
camenduru / daclip-uir-colab
☆13 · Oct 12, 2023 · Updated 2 years ago
skywalker023 / thought-tracing
🚲 Code and benchmark for our COLM 2025 paper - "Thought Tracing: Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models"
☆15 · Aug 8, 2025 · Updated 11 months ago
chenandrewy / Prompts-to-Paper
☆20 · Apr 10, 2025 · Updated last year
Globe-Engineer / semantic-commit
☆19 · Sep 12, 2024 · Updated last year
adamkarvonen / SAE_BoardGameEval
☆25 · Jan 28, 2025 · Updated last year
purefunctor / psvm-ps
PureScript version management in PureScript.
☆14 · Jan 27, 2023 · Updated 3 years ago
rowtype-yoga / purescript-record-studio
📀 You finally scored a record deal.
☆11 · Apr 11, 2023 · Updated 3 years ago
SkunkworksAI / CodeFusion
☆14 · Oct 31, 2023 · Updated 2 years ago
alpacaaa / purescript-simplecrypto
A set of useful cryptographic utilities for blockchain development.
☆12 · Sep 15, 2022 · Updated 3 years ago
Challenger-XJTU / HiFo-Prompt
[ICLR 2026] HiFo-Prompt: Prompting with Hindsight and Foresight for LLM-based Automatic Heuristic Design
☆16 · Feb 7, 2026 · Updated 5 months ago