JonathanChavezTamales / llm-leaderboard
A comprehensive set of LLM benchmark scores and provider prices.
☆303Updated 3 weeks ago
Alternatives and similar repositories for llm-leaderboard
Users who are interested in llm-leaderboard compare it to the repositories listed below:
- Provider-agnostic, open-source evaluation infrastructure for language models☆492Updated this week
- Instantly calculate the maximum size of quantized language models that can fit in your available RAM, helping you optimize your models fo…☆236Updated 4 months ago
- Hallucination Detector is a free and open-source tool that helps you verify the accuracy of your LLM generated content instantly.☆281Updated 2 months ago
- You don’t need to read the code to understand how to build!☆206Updated 7 months ago
- ☆322Updated 4 months ago
- An open-source dashboard for Cursor.sh IDE. Log AI code generations, track usage, and control AI models (including local ones). Run local…☆360Updated 10 months ago
- A timeline of notable generative AI events☆125Updated last week
- Together Open Deep Research☆342Updated 4 months ago
- Routing on Random Forest (RoRF)☆200Updated 11 months ago
- Coding problems used in aider's polyglot benchmark☆175Updated 8 months ago
- llmbasedos — Local-First OS Where Your AI Agents Wake Up and Work☆271Updated 2 weeks ago
- beep boop 🤖 (experimental)☆114Updated 7 months ago
- Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers.☆218Updated 3 weeks ago
- Claude Memory: Long-term memory for Claude☆555Updated last week
- Smithery helps AI agents access external services via a unified gateway.☆272Updated last week
- Giving Claude the ability to run code with E2B via MCP (Model Context Protocol)☆314Updated last month
- ☆147Updated 3 months ago
- Overide (pronounced over·ide) is a lightweight, yet powerful CLI tool that seamlessly integrates AI-powered code generation into your dev…☆183Updated last month
- An open-source VSCode extension: an AI coding assistant that integrates with Ollama, HuggingFace, OpenAI, and Anthropic.☆259Updated last month
- For LLMs to better code with Jina API☆165Updated last month
- Claude Deep Research config for Claude Code.☆212Updated 5 months ago
- ☆133Updated 4 months ago
- E2B Desktop Sandbox for LLMs. E2B Sandbox with desktop graphical environment that you can connect to any LLM for secure computer use.☆1,081Updated this week
- Letting Claude Code develop its own MCP tools :)☆122Updated 5 months ago
- A simple MCP integration that allows Claude to read and manage a personal Notion todo list☆200Updated 8 months ago
- Spark Stack is a tool for building web applications through an AI-powered chat interface. Create quick MVPs and prototypes using natural…☆245Updated 3 months ago
- ☆220Updated 7 months ago
- Official repository for "NoLiMa: Long-Context Evaluation Beyond Literal Matching"☆143Updated last month
- AI agents platform that gives you a workspace with an integrated team of personal assistants that can work behind the scenes to handle da…☆180Updated last month
- Superinterface is an AI assistants library for building AI capabilities into your app or website. You use React components and hooks to b…☆280Updated last week