Provider-agnostic, open-source evaluation infrastructure for language models
☆764 Apr 27, 2026 Updated this week
Alternatives and similar repositories for openbench
Users who are interested in openbench are comparing it to the libraries listed below.
- Realtime News and Information Eval ☆18 Mar 26, 2026 Updated last month
- Build robust, production grade function calling assistants that work. Declarative and extensible. Built on top of LangChain ⚡️ ☆76 May 21, 2024 Updated last year
- Groq Compound Beta MCP Server ☆50 Feb 14, 2026 Updated 2 months ago
- Local Groq Desktop chat app with MCP support ☆390 Feb 14, 2026 Updated 2 months ago
- A compound AI voice assistant powered by Compound on Groq, equipped with realtime search capabilities. ☆31 Oct 20, 2025 Updated 6 months ago
- Inspect: A framework for large language model evaluations ☆1,974 Updated this week
- Groq Public Changelog ☆17 Dec 2, 2025 Updated 5 months ago
- Code Indexer Loop is a Python library for indexing and retrieving source code files through an integrated vector database that's continuo… ☆176 Apr 9, 2024 Updated 2 years ago
- The official Node.js / Typescript library for the Groq API ☆248 Updated this week
- Build, enrich, and transform datasets using AI models with no code ☆1,631 Apr 9, 2026 Updated 3 weeks ago
- TVRecap: A Dataset for Generating Stories with Character Descriptions ☆21 Jun 5, 2023 Updated 2 years ago
- Open-source library for scalable, reproducible evaluation of AI models and benchmarks. ☆271 Updated this week
- ☆11 Aug 26, 2024 Updated last year
- ☆28 Feb 11, 2026 Updated 2 months ago
- Base project for bootstrapping frontend projects ☆16 Jan 28, 2026 Updated 3 months ago
- The Modern Data Stack in a (Smaller) Box ☆12 Jan 28, 2023 Updated 3 years ago
- The official Python Library for the Groq API ☆600 Updated this week
- A highly customizable, lightweight, and open-source coding CLI powered by Groq for instant iteration. ☆725 Dec 19, 2025 Updated 4 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆2,396 Apr 17, 2026 Updated 2 weeks ago
- moodist ☆27 Apr 23, 2026 Updated last week
- Fork of Flame repo for training of some new stuff in development ☆19 Apr 24, 2026 Updated last week
- GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho… ☆119 Jul 31, 2025 Updated 9 months ago
- Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement… ☆9,382 Updated this week
- A fun multiplayer game built on Convex using Dall-E. ☆18 Feb 6, 2026 Updated 2 months ago
- Renderer for the harmony response format to be used with gpt-oss ☆4,343 Apr 8, 2026 Updated 3 weeks ago
- Context Engineering Course with DSPy ☆219 Jul 27, 2025 Updated 9 months ago
- Semantic search and document parsing tools for the command line ☆1,780 Mar 11, 2026 Updated last month
- Open GenAI Stack ☆8,350 Apr 26, 2026 Updated last week
- Cookbook for Pipelex, the declarative language for composable AI workflows. Devtool for agents and mere humans. ☆35 Apr 17, 2026 Updated 2 weeks ago
- ☆15 Feb 23, 2026 Updated 2 months ago
- Custom hooks for pi coding agent ☆103 Mar 22, 2026 Updated last month
- Simple repository for training small reasoning models ☆50 Feb 17, 2026 Updated 2 months ago
- 3% Is All You Need: Breaking TurboQuant's Compression Limit via Spectral Structure ☆129 Apr 7, 2026 Updated 3 weeks ago
- Python & JS/TS SDK for running AI-generated code/code interpreting in your AI app ☆2,299 Updated this week
- The LLM Evaluation Framework ☆14,993 Apr 26, 2026 Updated last week
- AI app to generate blog from youtube video url. ☆14 Nov 1, 2023 Updated 2 years ago
- Everything about the SmolLM and SmolVLM family of models ☆3,755 Apr 2, 2026 Updated last month
- ☆16 May 31, 2025 Updated 11 months ago
- Evals that meet you where you are. For AI that's grounded. ☆57 Mar 21, 2026 Updated last month