EQ-bench / longform-writing-bench
☆24Updated 3 weeks ago
Alternatives and similar repositories for longform-writing-bench
Users who are interested in longform-writing-bench are comparing it to the repositories listed below
- Resources regarding evML (edge verified machine learning)☆19Updated 10 months ago
- Open sourced result for The Agent Company☆22Updated last week
- Portal: GUI Tools for Agents☆25Updated 2 months ago
- Example implementation of Iteration of Thought - Give a star if you like the project☆41Updated 10 months ago
- Python library for Entities, relationships and schemas extraction from documents☆44Updated 11 months ago
- An AI-powered game playing agent using Claude and PyBoy☆34Updated 8 months ago
- Test your local LLMs on the AIME problems☆31Updated 5 months ago
- Curated resources about automated GUI computer-use via LLMs. Highly opinionated, focus is on quality vs quantity.☆23Updated last year
- An LLM playground similar to the OpenAI API playground☆21Updated last year
- ☆44Updated 4 months ago
- 🤖 A list of latest AGI-related repos, resources and courses including LLMs and AI Agents.☆13Updated last year
- A minimal Model Context Protocol 🖥️ server/client 🧑‍💻 with Azure OpenAI and 🌐 web browser control via Playwright.☆31Updated 7 months ago
- ☆31Updated 3 months ago
- OpenPipe Reinforcement Learning Experiments☆32Updated 8 months ago
- A lightweight code assistant with tool-using capabilities built on HuggingFace's smolagents.☆39Updated 5 months ago
- ☆24Updated last year
- Benchmark evaluating LLMs on their ability to create and resist disinformation. Includes comprehensive testing across major models (Claud…☆31Updated 8 months ago
- The official Python library for Formulaic☆17Updated last year
- An MCP server that uses the Osmosis-Apply-1.7B model to apply code merges☆53Updated 4 months ago
- Experiments with open source LLMs☆74Updated 2 months ago
- Run AI generated code in isolated sandboxes☆122Updated 9 months ago
- This AI agent analyzes code repositories, detects potential security vulnerabilities, reviews code quality, and suggests fixes based on S…☆11Updated 9 months ago
- Your personal deep research AI agent☆24Updated 6 months ago
- MCP to explore websites with llms.txt files☆70Updated 6 months ago
- 👩🤝🤖 A curated list of datasets for large language models (LLMs), RLHF and related resources (continually updated)☆24Updated 2 years ago
- An md file as a chat interface and editable history in one.☆65Updated 2 months ago
- GPT-4 Level Conversational QA Trained In a Few Hours☆65Updated last year
- ☆47Updated last year
- OpenAI GPT hosted Agent Framework for Windows and MacOS☆36Updated last year
- Generate a wiki for your research topic, sourcing from the web and your docs.☆52Updated 8 months ago