mrconter1 / BenchmarkAggregator
Comprehensive LLM evaluation framework: GPQA Diamond to Chatbot Arena. Tests all major models equally, easily extensible.
☆16Updated last year
Alternatives and similar repositories for BenchmarkAggregator
Users who are interested in BenchmarkAggregator are comparing it to the libraries listed below.
- GPT-4 Level Conversational QA Trained In a Few Hours☆65Updated last year
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)☆91Updated 8 months ago
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper.☆57Updated 7 months ago
- LLM based agents with proactive interactions, long-term memory, external tool integration, and local deployment capabilities.☆105Updated 2 months ago
- Make Qwen3 Think like Gemini 2.5 Pro | Open webui function☆23Updated 5 months ago
- A Python library to orchestrate LLMs in a neural network-inspired structure☆50Updated last year
- ☆104Updated 4 months ago
- One Line To Build Zero-Data Classifiers in Minutes☆58Updated last year
- Simple examples using Argilla tools to build AI☆56Updated 11 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models☆22Updated 10 months ago
- Benchmark that evaluates LLMs using 759 NYT Connections puzzles extended with extra trick words☆155Updated this week
- Public Goods Game (PGG) Benchmark: Contribute & Punish is a multi-agent benchmark that tests cooperative and self-interested strategies a…☆39Updated 6 months ago
- Local LLM inference & management server with built-in OpenAI API☆31Updated last year
- Very minimal (and stateless) agent framework☆45Updated 9 months ago
- Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform☆90Updated last month
- ☆49Updated last year
- ☆32Updated 2 years ago
- Thoughtful Lightning AI Assistant - Dual-engine system with DeepSeek reasoning and Groq inference, featuring Gradio UI, secure API manage…☆20Updated 8 months ago
- Pivotal Token Search☆128Updated 3 months ago
- ☆17Updated 10 months ago
- never forget anything again! combine AI and intelligent tooling for a local knowledge base to track catalogue, annotate, and plan for you…☆37Updated last year
- ☆45Updated last year
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI …☆52Updated 8 months ago
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles☆58Updated 5 months ago
- Generate Your Own Private Morning Radio for Commute☆33Updated 8 months ago
- ☆62Updated 3 months ago
- A collection of example AI programs built using DSPy and maintained by the Langtrace AI team.☆44Updated 11 months ago
- A framework for hosting and scaling AI agents.☆38Updated 10 months ago
- Official Repo for The Paper "Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems"☆57Updated 7 months ago
- Who needs o1 anyways. Add CoT to any OpenAI compatible endpoint.☆44Updated last year