OpenGenerativeAI / llm-colosseum
Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM
☆1,447 · Updated 6 months ago
Alternatives and similar repositories for llm-colosseum
Users who are interested in llm-colosseum are comparing it to the libraries listed below.
- Llama-3 agents that can browse the web by following instructions and talking to you ☆1,407 · Updated 10 months ago
- ☆447 · Updated last year
- [ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling ☆1,765 · Updated last year
- A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks. ☆1,687 · Updated last year
- Training LLMs with QLoRA + FSDP ☆1,527 · Updated 11 months ago
- Reaching LLaMA2 Performance with 0.1M Dollars ☆985 · Updated last year
- Official repository of Evolutionary Optimization of Model Merging Recipes ☆1,368 · Updated 10 months ago
- Run Mixtral-8x7B models in Colab or consumer desktops ☆2,325 · Updated last year
- 🤖 MLE-Agent: Your intelligent companion for seamless AI engineering and research. 🔍 Integrate with arxiv and paper with code to provide… ☆1,393 · Updated 2 months ago
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI ☆1,399 · Updated last year
- The first open-source Artificial Narrow Intelligence generalist agentic framework Computer-Using-Agent that fully operates graphical-user… ☆1,309 · Updated 7 months ago
- [ICLR 2025] Automated Design of Agentic Systems ☆1,430 · Updated 8 months ago
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆3,119 · Updated last week
- Fine-tune LLM agents with online reinforcement learning ☆1,234 · Updated last year
- We introduced a new model designed for the Code generation task. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 … ☆855 · Updated last year
- Chat language model that can use tools and interpret the results ☆1,582 · Updated 3 weeks ago
- ☆1,034 · Updated 9 months ago
- Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models ☆2,834 · Updated 9 months ago
- ☆1,093 · Updated last year
- Code for Quiet-STaR ☆740 · Updated last year
- prompt2model - Generate Deployable Models from Natural Language Instructions ☆2,006 · Updated 9 months ago
- Open-source tool to visualise your RAG 🔮 ☆1,171 · Updated 9 months ago
- [IJCAI 2024] Generate different roles for GPTs to form a collaborative entity for complex tasks. ☆1,422 · Updated last month
- A principled instruction benchmark on formulating effective queries and prompts for large language models (LLMs). Our paper: https://arxi… ☆971 · Updated last year
- ☆865 · Updated last year
- ☆3,028 · Updated last year
- A lightweight library for generating synthetic instruction tuning datasets for your data without GPT. ☆794 · Updated 2 months ago
- Mamba-Chat: A chat LLM based on the state-space model architecture 🐍 ☆932 · Updated last year
- A library for advanced large language model reasoning ☆2,287 · Updated 4 months ago
- A library for generative social simulation ☆1,032 · Updated last week