OpenGenerativeAI / llm-colosseum
Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM
☆1,416 Updated last month
Alternatives and similar repositories for llm-colosseum:
Users who are interested in llm-colosseum are comparing it to the repositories listed below
- SWE-bench [Multimodal]: Can Language Models Resolve Real-world GitHub Issues? ☆2,862 Updated this week
- AIOS: AI Agent Operating System ☆4,060 Updated last week
- ☆2,729 Updated this week
- [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments ☆1,799 Updated this week
- Llama-3 agents that can browse the web by following instructions and talking to you ☆1,398 Updated 4 months ago
- Yes, it's another chat over documents implementation... but this one is entirely local! ☆1,753 Updated last month
- A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks. ☆1,646 Updated 7 months ago
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆3,033 Updated last week
- [ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling ☆1,671 Updated 9 months ago
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI. ☆1,596 Updated 8 months ago
- A library for generative social simulation ☆854 Updated last week
- ☆1,162 Updated 9 months ago
- 🤖 MLE-Agent: Your intelligent companion for seamless AI engineering and research. 🔍 Integrate with arXiv and Papers with Code to provide… ☆1,268 Updated this week
- ☆1,640 Updated last week
- Harness LLMs with Multi-Agent Programming ☆3,234 Updated this week
- A toolkit to create an optimal production-ready Retrieval Augmented Generation (RAG) setup for your data ☆1,407 Updated this week
- Retrieval Augmented Generation (RAG) chatbot powered by Weaviate ☆7,060 Updated last month
- Fine-tune LLM agents with online reinforcement learning ☆1,108 Updated last year
- PyTorch native post-training library ☆5,123 Updated this week
- ☆444 Updated last year
- [EMNLP'23, ACL'24] To speed up LLMs' inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, which ach… ☆5,042 Updated last month
- Streamlines and simplifies prompt design for both developers and non-technical users with a low code approach. ☆1,051 Updated last month
- Python & JS/TS SDK for running AI-generated code/code interpreting in your AI app ☆1,682 Updated last week
- Reaching LLaMA2 Performance with 0.1M Dollars ☆981 Updated 9 months ago
- ☆2,915 Updated 7 months ago
- Deploy your agentic workflows to production ☆1,998 Updated this week
- Open-source tool to visualise your RAG 🔮 ☆1,122 Updated 3 months ago
- Knowledge Agents and Management in the Cloud ☆3,906 Updated this week
- A framework for serving and evaluating LLM routers - save LLM costs without compromising quality ☆3,848 Updated 8 months ago
- VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and clou… ☆3,156 Updated last month