om-ai-lab / open-agent-leaderboard
Reproducible Language Agent Research
☆19 Updated 3 weeks ago
Alternatives and similar repositories for open-agent-leaderboard:
Users who are interested in open-agent-leaderboard are comparing it to the repositories listed below.
- ☆13 Updated 3 months ago
- ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration ☆24 Updated 3 months ago
- ☆41 Updated 3 months ago
- ☆20 Updated last month
- ☆24 Updated 6 months ago
- ☆60 Updated last month
- ☆25 Updated last month
- The official repo for the code and data of paper SMART ☆22 Updated last month
- ☆34 Updated 3 months ago
- ☆42 Updated this week
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning" ☆12 Updated 5 months ago
- Official Repository of Are Your LLMs Capable of Stable Reasoning? ☆22 Updated 2 weeks ago
- Nexusflow function call, tool use, and agent benchmarks. ☆19 Updated 3 months ago
- ☆19 Updated 4 months ago
- Open-source examples and guides for building with Qwen. Browse a collection of snippets, advanced techniques and walkthroughs. ☆20 Updated 4 months ago
- [NeurIPS 2024] OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI ☆97 Updated 3 weeks ago
- A collection of strong multimodal models for building multimodal AGI agents ☆41 Updated 8 months ago
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ☆27 Updated last year
- Data preparation code for CrystalCoder 7B LLM ☆44 Updated 10 months ago
- ☆32 Updated 3 weeks ago
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel… ☆13 Updated last year
- GitHub repo for Peifeng's internship project ☆14 Updated last year
- ☆24 Updated 8 months ago
- Implementation of "SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models" ☆27 Updated last month
- ☆36 Updated 2 years ago
- ☆18 Updated 7 months ago
- OneEdit: A Neural-Symbolic Collaboratively Knowledge Editing System. ☆18 Updated 5 months ago
- ☆15 Updated 6 months ago
- MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding ☆71 Updated this week
- Systematic evaluation framework that automatically rates overthinking behavior in large language models. ☆82 Updated this week