Provider-agnostic, open-source evaluation infrastructure for language models
☆747 · Mar 16, 2026 · Updated last week
Alternatives and similar repositories for openbench
Users that are interested in openbench are comparing it to the libraries listed below.
- Realtime News and Information Eval ☆18 · Nov 19, 2025 · Updated 4 months ago
- Build robust, production grade function calling assistants that work. Declarative and extensible. Built on top of LangChain ⚡️ ☆76 · May 21, 2024 · Updated last year
- Groq Compound Beta MCP Server ☆45 · Feb 14, 2026 · Updated last month
- Local Groq Desktop chat app with MCP support ☆383 · Feb 14, 2026 · Updated last month
- A compound AI voice assistant powered by Compound on Groq, equipped with realtime search capabilities. ☆31 · Oct 20, 2025 · Updated 5 months ago
- Groq Public Changelog ☆16 · Dec 2, 2025 · Updated 3 months ago
- Code Indexer Loop is a Python library for indexing and retrieving source code files through an integrated vector database that's continuo… ☆176 · Apr 9, 2024 · Updated last year
- The official Node.js / Typescript library for the Groq API ☆245 · Updated this week
- Build, enrich, and transform datasets using AI models with no code ☆1,630 · Oct 23, 2025 · Updated 5 months ago
- ☆11 · Aug 26, 2024 · Updated last year
- ☆27 · Feb 11, 2026 · Updated last month
- The Modern Data Stack in a (Smaller) Box ☆12 · Jan 28, 2023 · Updated 3 years ago
- The official Python Library for the Groq API ☆584 · Updated this week
- A highly customizable, lightweight, and open-source coding CLI powered by Groq for instant iteration. ☆711 · Dec 19, 2025 · Updated 3 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆2,353 · Mar 9, 2026 · Updated 2 weeks ago
- Renderer for the harmony response format to be used with gpt-oss ☆4,232 · Dec 15, 2025 · Updated 3 months ago
- moodist ☆25 · Mar 13, 2026 · Updated last week
- Fork of Flame repo for training of some new stuff in development ☆19 · Updated this week
- Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement… ☆9,050 · Updated this week
- A fun multiplayer game built on Convex using Dall-E. ☆17 · Feb 6, 2026 · Updated last month
- Context Engineering Course with DSPy ☆216 · Jul 27, 2025 · Updated 7 months ago
- Semantic search and document parsing tools for the command line ☆1,754 · Mar 11, 2026 · Updated last week
- An autonomous AI agent that plays Pokemon FireRed in real time using OpenAI's LLM, with a live web dashboard for monitoring. ☆72 · Feb 15, 2026 · Updated last month
- Cookbook for Pipelex, the declarative language for composable AI workflows. Devtool for agents and mere humans. ☆35 · Mar 15, 2026 · Updated last week
- ☆15 · Feb 23, 2026 · Updated last month
- Open-source clone of OpenAI's Deep Research. Works with any transformer, gpt4free, & runs in browser. No Firecrawl needed. ☆12 · Jun 12, 2025 · Updated 9 months ago
- Everything about the SmolLM and SmolVLM family of models ☆3,675 · Jan 13, 2026 · Updated 2 months ago
- Composable building blocks to build LLM Apps ☆8,296 · Mar 16, 2026 · Updated last week
- The LLM Evaluation Framework ☆14,115 · Mar 13, 2026 · Updated last week
- Simple repository for training small reasoning models ☆49 · Feb 17, 2026 · Updated last month
- Python & JS/TS SDK for running AI-generated code/code interpreting in your AI app ☆2,250 · Updated this week
- A long-context eval ☆92 · Mar 11, 2026 · Updated last week
- ☆16 · May 31, 2025 · Updated 9 months ago
- Our library for RL environments + evals ☆3,918 · Updated this week
- AI app to generate blog from youtube video url. ☆14 · Nov 1, 2023 · Updated 2 years ago
- groq-gradio ☆18 · Nov 19, 2025 · Updated 4 months ago
- Async RL Training at Scale ☆1,156 · Updated this week
- dbSurface is a SQL editor made for pgvector. ☆23 · Dec 6, 2025 · Updated 3 months ago
- BPE modification that implements removing of the intermediate tokens during tokenizer training. ☆27 · Nov 25, 2024 · Updated last year