lechmazur / pgg_bench

Public Goods Game (PGG) Benchmark: Contribute & Punish is a multi-agent benchmark that tests cooperative and self-interested strategies among Large Language Models (LLMs) in a resource-sharing economic scenario. Our experiment extends the classic PGG with a punishment phase, allowing players to penalize free-riders or retaliate against others.
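The payoff logic behind a contribute-and-punish round can be sketched as follows. This is a minimal illustration of the classic game, not the benchmark's actual code: the endowment, the pot multiplier, the punishment cost and fine, and the function names `contribution_payoffs` and `apply_punishment` are all assumptions chosen for the example.

```python
from typing import List


def contribution_payoffs(contributions: List[float],
                         endowment: float = 20.0,
                         multiplier: float = 2.0) -> List[float]:
    """Payoffs after the contribution phase of a public goods game.

    Each player keeps whatever they did not contribute, and the pooled
    contributions are multiplied and split equally among all players.
    The endowment and multiplier values are illustrative assumptions.
    """
    share = sum(contributions) * multiplier / len(contributions)
    return [endowment - c + share for c in contributions]


def apply_punishment(payoffs: List[float],
                     punishments: List[List[float]],
                     cost_per_point: float = 1.0,
                     fine_per_point: float = 3.0) -> List[float]:
    """Payoffs after the punishment phase.

    punishments[i][j] is the number of punishment points player i
    assigns to player j. The punisher pays cost_per_point per point and
    the target loses fine_per_point per point (both values are assumed).
    """
    result = list(payoffs)
    for i, row in enumerate(punishments):
        for j, points in enumerate(row):
            if i == j:
                continue  # players cannot punish themselves
            result[i] -= cost_per_point * points
            result[j] -= fine_per_point * points
    return result


# Three players contribute everything; player 3 free-rides.
after_contrib = contribution_payoffs([20.0, 20.0, 20.0, 0.0])

# Player 0 spends 2 punishment points on the free-rider.
punish = [[0.0] * 4 for _ in range(4)]
punish[0][3] = 2.0
after_punish = apply_punishment(after_contrib, punish)
```

With these assumed parameters the free-rider earns the most after the contribution phase, which is the incentive that the punishment phase lets other players push back against, at a cost to themselves.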

☆37

Alternatives and similar repositories for pgg_bench

Users who are interested in pgg_bench are comparing it to the libraries listed below


Mihaiii / backtrack_sampler
An easy-to-understand framework for LLM samplers that rewind and revise generated tokens
☆140 Updated 4 months ago
femto / minion
👷‍♂️Minion is Agent's Brain. Minion is designed to execute any type of queries, offering a variety of features that demonstrate its flex…
☆23 Updated this week
uukuguy / speechless
LLM based agents with proactive interactions, long-term memory, external tool integration, and local deployment capabilities.
☆104 Updated last month
severian42 / Computational-Model-for-Symbolic-Representations
Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI …
☆49 Updated 5 months ago
Xalp / ECHO
Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)
☆91 Updated 5 months ago
AtakanTekparmak / agento
Very minimal (and stateless) agent framework
☆44 Updated 6 months ago
YerbaPage / MGDebugger
Multi-Granularity LLM Debugger
☆82 Updated last week
EduardTalianu / EntropixLab
entropix style sampling + GUI
☆26 Updated 8 months ago
kenhktsui / anyclassifier
One Line To Build Zero-Data Classifiers in Minutes
☆58 Updated 9 months ago
reka-ai / rekaquant
☆49 Updated this week
Cerebras / DocChat
GPT-4 Level Conversational QA Trained In a Few Hours
☆62 Updated 10 months ago
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆54 Updated 5 months ago
ArturTanona / grpo_unsloth_docker
☆57 Updated 5 months ago
lechmazur / step_game
Multi-Agent Step Race Benchmark: Assessing LLM Collaboration and Deception Under Pressure. A multi-player “step-race” that challenges LLM…
☆55 Updated last month
TechxGenus / CursorCore
CursorCore: Assist Programming through Aligning Anything
☆127 Updated 5 months ago
lechmazur / nyt-connections
Benchmark that evaluates LLMs using 651 NYT Connections puzzles extended with extra trick words
☆119 Updated this week
CogNLP / CogAGENT
☆35 Updated 2 years ago
willkurt / token-explorer
A simple tool that lets you explore different possible paths that an LLM might sample.
☆174 Updated 2 months ago
shirley-wu / cot_decoding
☆45 Updated last year
Zyphra / transformers_zamba2
☆48 Updated 5 months ago
AlexBodner / How_Much_VRAM
☆101 Updated 10 months ago
agokrani / distillKitPlus
Easy-to-use, high-performance knowledge distillation for LLMs
☆88 Updated 2 months ago
LLM360 / crystalcoder-data-prep
Data preparation code for CrystalCoder 7B LLM
☆45 Updated last year
teknium1 / ShareGPT-Builder
☆115 Updated 6 months ago
axolotl-ai-cloud / grpo_code
A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback.
☆32 Updated 3 months ago
allenai / infinigram-api
☆69 Updated last month
tiiuae / onebitllms
Lightweight toolkit package to train and fine-tune 1.58bit Language models
☆81 Updated last month
nyunAI / PruneGPT
☆52 Updated last year
cognitivecomputations / OpenChatML
☆157 Updated last year
argilla-io / argilla-cookbook
Simple examples using Argilla tools to build AI
☆53 Updated 7 months ago