OpenGenerativeAI / llm-colosseum
Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM
☆1,448Updated 7 months ago
Alternatives and similar repositories for llm-colosseum
Users that are interested in llm-colosseum are comparing it to the libraries listed below
- [ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling☆1,771Updated last year
- ☆446Updated last year
- Llama-3 agents that can browse the web by following instructions and talking to you☆1,408Updated 10 months ago
- A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks.☆1,689Updated last year
- Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models☆2,833Updated 9 months ago
- Training LLMs with QLoRA + FSDP☆1,527Updated 11 months ago
- We introduced a new model designed for the Code generation task. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 …☆855Updated last year
- [ICLR 2025] Automated Design of Agentic Systems☆1,439Updated 9 months ago
- ☆1,035Updated 10 months ago
- 🤖 MLE-Agent: Your intelligent companion for seamless AI engineering and research. 🔍 Integrate with arxiv and paper with code to provide…☆1,396Updated 3 months ago
- Reaching LLaMA2 Performance with 0.1M Dollars☆986Updated last year
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI☆1,398Updated last year
- Ship RAG based LLM web apps in seconds.☆997Updated last year
- A framework for serving and evaluating LLM routers - save LLM costs without compromising quality☆4,365Updated last year
- A framework for prompt tuning using Intent-based Prompt Calibration☆2,814Updated 6 months ago
- Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL☆2,641Updated this week
- Official repository of Evolutionary Optimization of Model Merging Recipes☆1,374Updated 11 months ago
- ☆1,111Updated last year
- Open-source tool to visualise your RAG 🔮☆1,171Updated 9 months ago
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-…☆3,723Updated 5 months ago
- Run Mixtral-8x7B models in Colab or consumer desktops☆2,327Updated last year
- Deploy your agentic workflows to production☆2,058Updated 2 months ago
- A tool for generating function arguments and choosing what function to call with local LLMs☆432Updated last year
- The first open-source Artificial Narrow Intelligence generalist agentic framework Computer-Using-Agent that fully operates graphical-user…☆1,308Updated 8 months ago
- Official implementation for the paper: "Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering"☆3,897Updated 11 months ago
- A project structure aware autonomous software engineer aiming for autonomous program improvement. Resolved 37.3% tasks (pass@1) in SWE-be…☆3,024Updated 6 months ago
- ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting wit…☆1,101Updated last year
- Mora: More like Sora for Generalist Video Generation☆1,572Updated last year
- Official Implementation of "Graph of Thoughts: Solving Elaborate Problems with Large Language Models"☆2,510Updated 10 months ago
- The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling☆723Updated 11 months ago