lechmazur / writing
This benchmark tests how well LLMs incorporate a set of 10 mandatory story elements (characters, objects, core concepts, attributes, motivations, etc.) in a short creative story.
☆177 · Updated last week
Alternatives and similar repositories for writing:
Users who are interested in writing are comparing it to the repositories listed below:
- Multi-Agent Step Race Benchmark: Assessing LLM Collaboration and Deception Under Pressure. A multi-player “step-race” that challenges LLM… ☆48 · Updated this week
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆139 · Updated last month
- A benchmark for emotional intelligence in large language models ☆275 · Updated 8 months ago
- Public Goods Game (PGG) Benchmark: Contribute & Punish is a multi-agent benchmark that tests cooperative and self-interested strategies a… ☆35 · Updated last week
- Guaranteed Structured Output from any Language Model via Hierarchical State Machines ☆124 · Updated this week
- ☆284 · Updated last week
- Low-Rank adapter extraction for fine-tuned transformers models ☆171 · Updated 11 months ago
- Benchmark that evaluates LLMs using 651 NYT Connections puzzles extended with extra trick words ☆76 · Updated this week
- Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers. ☆121 · Updated last week
- Easily view and modify JSON datasets for large language models ☆73 · Updated last month
- idea: https://github.com/nyxkrage/ebook-groupchat/ ☆86 · Updated 8 months ago
- A multimodal, function calling powered LLM webui. ☆214 · Updated 6 months ago
- A user interface for DSPy ☆143 · Updated 5 months ago
- Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answe… ☆149 · Updated last year
- A benchmark for role-playing language models ☆92 · Updated this week
- Self-hosted LLM chatbot arena, with yourself as the only judge ☆39 · Updated last year
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆429 · Updated 6 months ago
- An unsupervised model merging algorithm for Transformers-based language models. ☆105 · Updated 11 months ago
- SLOP Detector and analyzer based on dictionary for shareGPT JSON and text ☆65 · Updated 5 months ago
- A pipeline parallel training script for LLMs. ☆137 · Updated 2 weeks ago
- II-Researcher: a new open-source framework designed to aid building search / research agents ☆238 · Updated this week
- ☆436 · Updated 6 months ago
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆223 · Updated 11 months ago
- automatically quant GGUF models ☆167 · Updated this week
- ☆112 · Updated 4 months ago
- ☆153 · Updated 9 months ago
- Testing LLM reasoning abilities with family relationship quizzes. ☆62 · Updated 2 months ago
- LLM based agents with proactive interactions, long-term memory, external tool integration, and local deployment capabilities. ☆100 · Updated this week
- ☆198 · Updated this week
- ☆84 · Updated 3 months ago