lechmazur / writing
This benchmark tests how well LLMs incorporate a set of 10 mandatory story elements (characters, objects, core concepts, attributes, motivations, etc.) in a short creative story.
☆202 Updated this week
Alternatives and similar repositories for writing
Users who are interested in writing are comparing it to the libraries listed below.
- Public Goods Game (PGG) Benchmark: Contribute & Punish is a multi-agent benchmark that tests cooperative and self-interested strategies a… ☆36 Updated last month
- ☆288 Updated last month
- A benchmark for emotional intelligence in large language models ☆289 Updated 9 months ago
- ☆106 Updated last week
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆139 Updated 2 months ago
- Multi-Agent Step Race Benchmark: Assessing LLM Collaboration and Deception Under Pressure. A multi-player “step-race” that challenges LLM… ☆49 Updated this week
- Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers. ☆140 Updated this week
- Easily view and modify JSON datasets for large language models ☆75 Updated 2 months ago
- II-Researcher: a new open-source framework designed to aid building search / research agents ☆248 Updated this week
- AI management tool ☆114 Updated 6 months ago
- ☆156 Updated 9 months ago
- Guaranteed Structured Output from any Language Model via Hierarchical State Machines ☆128 Updated last week
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆223 Updated last year
- Official repository for "NoLiMa: Long-Context Evaluation Beyond Literal Matching" ☆68 Updated last week
- A pipeline parallel training script for LLMs. ☆143 Updated last week
- Atom of Thoughts for Markov LLM Test-Time Scaling ☆562 Updated last week
- Benchmark that evaluates LLMs using 651 NYT Connections puzzles extended with extra trick words ☆85 Updated this week
- ☆113 Updated 4 months ago
- ☆88 Updated 2 months ago
- ☆437 Updated 7 months ago
- Low-Rank adapter extraction for fine-tuned transformers models ☆173 Updated last year
- The future of AI roleplay ☆90 Updated 2 months ago
- Open source LLM UI, compatible with all local LLM providers. ☆174 Updated 7 months ago
- Orpheus Chat WebUI ☆53 Updated last month
- SLOP Detector and analyzer based on dictionary for shareGPT JSON and text ☆67 Updated 6 months ago
- A user interface for DSPy ☆144 Updated 6 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆172 Updated 3 months ago
- An unsupervised model merging algorithm for Transformers-based language models. ☆105 Updated last year
- Agent that writes consistent and interesting long stories for any fiction form ☆92 Updated 5 months ago
- Efficient visual programming for AI language models ☆361 Updated 8 months ago