OpenGenerativeAI / llm-colosseum
Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM
☆1,447Updated 6 months ago
Alternatives and similar repositories for llm-colosseum
Users who are interested in llm-colosseum are comparing it to the libraries listed below.
- Llama-3 agents that can browse the web by following instructions and talking to you☆1,411Updated 9 months ago
- A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks.☆1,687Updated last year
- We introduced a new model designed for the Code generation task. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 …☆857Updated last year
- ☆447Updated last year
- The first open-source Artificial Narrow Intelligence generalist agentic framework Computer-Using-Agent that fully operates graphical-user…☆1,311Updated 7 months ago
- Reaching LLaMA2 Performance with 0.1M Dollars☆987Updated last year
- Mora: More like Sora for Generalist Video Generation☆1,570Updated 11 months ago
- 🤖 MLE-Agent: Your intelligent companion for seamless AI engineering and research. 🔍 Integrate with arxiv and paper with code to provide…☆1,386Updated last month
- ☆3,016Updated last year
- [ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling☆1,752Updated last year
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI☆1,402Updated last year
- [IJCAI 2024] Generate different roles for GPTs to form a collaborative entity for complex tasks.☆1,416Updated last week
- OpenCodeInterpreter is a suite of open-source code generation systems aimed at bridging the gap between large language models and sophist…☆1,679Updated last year
- ☆866Updated last year
- [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments☆2,148Updated this week
- Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models☆2,825Updated 8 months ago
- Open-source tool to visualise your RAG 🔮☆1,154Updated 8 months ago
- A Production-ready Reinforcement Learning AI Agent Library brought by the Applied Reinforcement Learning team at Meta.☆2,926Updated this week
- Fine-tune LLM agents with online reinforcement learning☆1,230Updated last year
- [ICLR 2025] Automated Design of Agentic Systems☆1,421Updated 7 months ago
- A project structure aware autonomous software engineer aiming for autonomous program improvement. Resolved 37.3% tasks (pass@1) in SWE-be…☆3,006Updated 4 months ago
- [COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild☆4,454Updated 10 months ago
- A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.☆791Updated 2 months ago
- A principled instruction benchmark on formulating effective queries and prompts for large language models (LLMs). Our paper: https://arxi…☆973Updated last year
- Official repository of Evolutionary Optimization of Model Merging Recipes☆1,365Updated 9 months ago
- Streamlines and simplifies prompt design for both developers and non-technical users with a low code approach.☆1,107Updated 2 months ago
- [ICML'24] Magicoder: Empowering Code Generation with OSS-Instruct☆2,048Updated 10 months ago
- Training LLMs with QLoRA + FSDP☆1,528Updated 10 months ago
- A trivial programmatic Llama 3 jailbreak. Sorry Zuck!☆564Updated 7 months ago
- Official implementation for the paper: "Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering"☆3,892Updated 9 months ago