lechmazur / elimination_game
A multi-player tournament benchmark that tests LLMs in social reasoning, strategy, and deception. Players engage in public and private conversations, form alliances, and vote to eliminate each other.
☆77Updated this week
Alternatives and similar repositories for elimination_game:
Users who are interested in elimination_game are comparing it to the repositories listed below.
- Multi-Agent Step Race Benchmark: Assessing LLM Collaboration and Deception Under Pressure. A multi-player “step-race” that challenges LLM…☆42Updated this week
- Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers.☆116Updated this week
- ☆52Updated this week
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI …☆49Updated last month
- klmbr - a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs☆71Updated 6 months ago
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.☆27Updated 2 months ago
- Easily view and modify JSON datasets for large language models☆71Updated 3 weeks ago
- Deploy Apollo HF space locally☆40Updated 3 months ago
- Experimental LLM Inference UX to aid in creative writing☆114Updated 3 months ago
- SLOP Detector and analyzer based on dictionary for shareGPT JSON and text☆64Updated 4 months ago
- Adding a multi-text multi-speaker script (diffe) that is based on a script from asiff00 on issue 61 for Sesame: A Conversational Speech G…☆21Updated this week
- CaSIL is an advanced natural language processing system that implements a sophisticated four-layer semantic analysis architecture. It pro…☆64Updated 4 months ago
- Benchmark evaluating LLMs on their ability to create and resist disinformation. Includes comprehensive testing across major models (Claud…☆24Updated this week
- Use the Moondream 2 model to detect faces and their gaze directions in videos.☆39Updated 2 months ago
- Benchmark that evaluates LLMs using 601 NYT Connections puzzles extended with extra trick words☆66Updated this week
- LLM backed Fantasy Tribe Game☆18Updated 4 months ago
- ☆67Updated 3 weeks ago
- A Conversational Speech Generation Model with Gradio UI and support for CUDA, MLX and CPU devices☆131Updated this week
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe…☆54Updated last month
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens☆138Updated last month
- ☆275Updated 2 months ago
- After my server ui improvements were successfully merged, consider this repo a playground for experimenting, tinkering and hacking around…☆56Updated 7 months ago
- This benchmark tests how well LLMs incorporate a set of 10 mandatory story elements (characters, objects, core concepts, attributes, moti…☆140Updated this week
- entropix style sampling + GUI☆25Updated 4 months ago
- Sesame Converse - Real Time Conversations - Powered by Gemma 3☆55Updated last week
- ☆125Updated last week
- LLM Divergent Thinking Creativity Benchmark. LLMs generate 25 unique words that start with a given letter with no connections to each oth…☆32Updated this week
- This repo provides a simple Gradio UI to run Qwen2 VL 72B AWQ in venv and have both image and video inferencing work.☆29Updated 5 months ago
- ☆111Updated 3 months ago
- AI management tool☆113Updated 4 months ago