EQ-bench / longform-writing-bench
☆25Updated 2 months ago
Alternatives and similar repositories for longform-writing-bench
Users that are interested in longform-writing-bench are comparing it to the libraries listed below
- Test your local LLMs on the AIME problems☆31Updated 7 months ago
- Portal: GUI Tools for Agents☆25Updated 3 months ago
- Open-source chat app that uses Exa's API for web search and OpenAI o3-mini☆43Updated 7 months ago
- ☆47Updated last year
- Experiments with open source LLMs☆74Updated last week
- This AI agent analyzes code repositories, detects potential security vulnerabilities, reviews code quality, and suggests fixes based on S…☆12Updated 11 months ago
- Open sourced result for The Agent Company☆22Updated 2 months ago
- Transform Claude Code transcript JSONL files into readable terminal and HTML formats.☆55Updated 2 weeks ago
- Thematic Generalization Benchmark: measures how effectively various LLMs can infer a narrow or specific "theme" (category/rule) from a sm…☆63Updated 3 months ago
- The DPAB-α Benchmark☆32Updated 11 months ago
- Never forget anything again! Combine AI and intelligent tooling for a local knowledge base to track, catalogue, annotate, and plan for you…☆37Updated last year
- A collection of example AI programs built using DSPy and maintained by the Langtrace AI team.☆49Updated last year
- ☆44Updated 6 months ago
- Benchmark evaluating LLMs on their ability to create and resist disinformation. Includes comprehensive testing across major models (Claud…☆31Updated 9 months ago
- Curated resources about automated GUI computer-use via LLMs. Highly opinionated, focus is on quality vs quantity.☆23Updated last year
- Example implementation of Iteration of Thought - Give a star if you like the project☆41Updated last year
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)☆91Updated 11 months ago
- Welcome to FluidAPI, it's a framework that allows you to interact with APIs using natural language. No more JSON, headers, or complex for…☆32Updated 2 months ago
- OpenAI GPT hosted Agent Framework for Windows and MacOS☆36Updated last year
- ☆24Updated 2 years ago
- A Python library to orchestrate LLMs in a neural network-inspired structure☆52Updated last year
- GPT-4 Level Conversational QA Trained In a Few Hours☆66Updated last year
- ☆22Updated last year
- The world's first fully automated VC fund.☆27Updated 2 weeks ago
- An advanced distributed knowledge fabric for intelligent document processing, featuring multi-document agents, optimized query handling, …☆48Updated 2 months ago
- Call another MCP client from your MCP client. Offload context windows, delegate tasks, split between models☆29Updated 10 months ago
- LangChain + LiteLLM that works☆50Updated 4 months ago
- OpenPipe Reinforcement Learning Experiments☆32Updated 9 months ago
- 👩🤝🤖 A curated list of datasets for large language models (LLMs), RLHF and related resources (continually updated)☆24Updated 2 years ago
- ☆12Updated last year