sam-paech / diplobench
Benchmark for LLMs playing full press diplomacy
☆56 · Updated 7 months ago
Alternatives and similar repositories for diplobench
Users who are interested in diplobench are comparing it to the libraries listed below.
- ☆62 · Updated 3 months ago
- ☆123 · Updated last year
- An introduction to LLM Sampling ☆79 · Updated 10 months ago
- A framework for optimizing DSPy programs with RL ☆202 · Updated this week
- look how they massacred my boy ☆63 · Updated last year
- explore token trajectory trees on instruct and base models ☆145 · Updated 4 months ago
- PageRank for LLMs ☆50 · Updated last month
- Train your own SOTA deductive reasoning model ☆108 · Updated 7 months ago
- ☆167 · Updated 9 months ago
- Simple UI for debugging correlations of text embeddings ☆295 · Updated 4 months ago
- ☆159 · Updated 10 months ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆145 · Updated 8 months ago
- ☆67 · Updated last year
- A framework for orchestrating AI agents using a mermaid graph ☆77 · Updated last year
- Plotting (entropy, varentropy) for small LMs ☆98 · Updated 5 months ago
- Verbosity control for AI agents ☆65 · Updated last year
- Simple Graph Memory for AI applications ☆89 · Updated 5 months ago
- Inference-time scaling for LLMs-as-a-judge. ☆302 · Updated 2 weeks ago
- ☆40 · Updated last year
- A graph visualization of attention ☆57 · Updated 5 months ago
- A distributed agent orchestration framework for market agents ☆104 · Updated 2 months ago
- Implementation of the board game Codenames, re-imagined as a collaborative game between LLM agents ☆107 · Updated 7 months ago
- WIP - Allows you to create DSPy pipelines using ComfyUI ☆197 · Updated 10 months ago
- ☆124 · Updated 9 months ago
- ☆120 · Updated last month
- A strongly typed Python DSL for developing message passing multi agent systems ☆53 · Updated last year
- Interactive timeline of AI history ☆61 · Updated last month
- An automated tool for discovering insights from research paper corpora ☆138 · Updated last year
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆55 · Updated 8 months ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆89 · Updated 2 weeks ago