EQ-bench / longform-writing-bench
☆24 Updated last month
Alternatives and similar repositories for longform-writing-bench
Users that are interested in longform-writing-bench are comparing it to the libraries listed below
- Test your local LLMs on the AIME problems ☆31 Updated 6 months ago
- Resources regarding evML (edge verified machine learning) ☆18 Updated 11 months ago
- A lightweight code assistant with tool-using capabilities built on HuggingFace's smolagents. ☆39 Updated 6 months ago
- Portal: GUI Tools for Agents ☆25 Updated 2 months ago
- Open sourced result for The Agent Company ☆22 Updated last month
- Thematic Generalization Benchmark: measures how effectively various LLMs can infer a narrow or specific "theme" (category/rule) from a sm… ☆63 Updated 2 months ago
- Curated resources about automated GUI computer-use via LLMs. Highly opinionated, focus is on quality vs quantity. ☆23 Updated last year
- Benchmark evaluating LLMs on their ability to create and resist disinformation. Includes comprehensive testing across major models (Claud… ☆31 Updated 8 months ago
- The DPAB-α Benchmark ☆32 Updated 11 months ago
- LLM Divergent Thinking Creativity Benchmark. LLMs generate 25 unique words that start with a given letter with no connections to each oth… ☆34 Updated 8 months ago
- General benchmarking apparatus for running multi-agent systems against benchmarks ☆34 Updated this week
- This AI agent analyzes code repositories, detects potential security vulnerabilities, reviews code quality, and suggests fixes based on S… ☆12 Updated 10 months ago
- 👩🤝🤖 A curated list of datasets for large language models (LLMs), RLHF and related resources (continually updated) ☆24 Updated 2 years ago
- ☆44 Updated 5 months ago
- extract all your personal data history from cursor, codex, claude-code, windsurf, and trae ☆118 Updated last month
- Real-world AI engineering dataset creation, SFT fine-tuning, and GRPO alignment ETL pipeline. ☆31 Updated 3 months ago
- OpenAI GPT hosted Agent Framework for Windows and MacOS ☆36 Updated last year
- ☆25 Updated 2 months ago
- Twitter-RapidAPI-MCP-X is a lightweight API available on RapidAPI that provides streamlined access to Twitter data, including tweets, use… ☆22 Updated 8 months ago
- A Python library to orchestrate LLMs in a neural network-inspired structure ☆51 Updated last year
- Your personal ArXiv Feed ☆23 Updated last year
- Running Microsoft's BitNet via Electron, React & Astro ☆48 Updated 2 months ago
- Build AI Agents with Your Existing Python Code! ☆70 Updated last year
- Wafer-thin FlaskGPT on Vercel. ☆12 Updated 2 years ago
- ☆22 Updated last year
- Open-source chat app that uses Exa's API for web search and OpenAI o3-mini ☆43 Updated 6 months ago
- Estimating hardware and cloud costs of LLMs and transformer projects ☆20 Updated this week
- Example implementation of Iteration of Thought - Give it a star if you like the project ☆41 Updated 11 months ago
- Dive endlessly deeper into a single concept using AI ☆99 Updated 8 months ago
- Transform Claude Code transcript JSONL files into readable terminal and HTML formats. ☆50 Updated last week