EQ-bench / longform-writing-bench
☆20 · Updated 2 months ago
Alternatives and similar repositories for longform-writing-bench
Users who are interested in longform-writing-bench are comparing it to the repositories listed below.
- Test your local LLMs on the AIME problems ☆32 · Updated 2 months ago
- Open sourced result for The Agent Company ☆18 · Updated last week
- This AI agent analyzes code repositories, detects potential security vulnerabilities, reviews code quality, and suggests fixes based on S… ☆10 · Updated 6 months ago
- Curated resources about automated GUI computer-use via LLMs. Highly opinionated, focus is on quality vs quantity. ☆23 · Updated 8 months ago
- OpenPipe Reinforcement Learning Experiments ☆30 · Updated 5 months ago
- A CLI in Rust to generate synthetic data for MLX friendly training ☆24 · Updated last year
- 🤖 A list of latest AGI-related repos, resources and courses including LLMs and AI Agents. ☆12 · Updated 10 months ago
- An AI-powered game playing agent using Claude and PyBoy ☆29 · Updated 5 months ago
- A minimal Model Context Protocol 🖥️ server/client🧑💻with Azure OpenAI and 🌐 web browser control via Playwright. ☆27 · Updated 4 months ago
- Discover advanced AI techniques in my repository combining Multi-Hop Chain of Thought (CoT) and Retrieval-Augmented Generation (RAG) usin… ☆14 · Updated last year
- Portal: GUI Tools for Agents ☆25 · Updated 4 months ago
- ☆42 · Updated last month
- A chat UI for Llama.cpp ☆15 · Updated 3 weeks ago
- ☆28 · Updated 11 months ago
- A lightweight code assistant with tool-using capabilities built on HuggingFace's smolagents. ☆36 · Updated 2 months ago
- ☆21 · Updated 2 weeks ago
- Benchmark evaluating LLMs on their ability to create and resist disinformation. Includes comprehensive testing across major models (Claud… ☆29 · Updated 4 months ago
- Experiments with open source LLMs ☆74 · Updated 2 weeks ago
- An interface for llama.cpp, ChatGPT, Gemini, and Claude ☆28 · Updated this week
- Fast inference of Instruct tuned LLaMa on your personal devices. ☆22 · Updated 2 years ago
- 👩🤝🤖 A curated list of datasets for large language models (LLMs), RLHF and related resources (continually updated) ☆23 · Updated 2 years ago
- ☆17 · Updated 3 months ago
- Your personal ArXiv Feed ☆23 · Updated 7 months ago
- Python library for Entities, relationships and schemas extraction from documents ☆41 · Updated 8 months ago
- An LLM playground similar to the OpenAI API playground ☆22 · Updated last year
- LangChain + LiteLLM that works ☆46 · Updated 2 months ago
- MCP server for Chroma ☆36 · Updated 7 months ago
- Thematic Generalization Benchmark: measures how effectively various LLMs can infer a narrow or specific "theme" (category/rule) from a sm… ☆63 · Updated this week
- Resources regarding evML (edge verified machine learning) ☆18 · Updated 7 months ago
- auto fine tune of models with synthetic data ☆76 · Updated last year