Beating the GAIA benchmark with Transformers Agents.
★150 · Feb 19, 2025 · Updated last year
Alternatives and similar repositories for GAIA
Users who are interested in GAIA are comparing it to the libraries listed below.
- Compare how Agent systems perform on several benchmarks. ★103 · Aug 4, 2025 · Updated 7 months ago
- ★15 · Jun 2, 2025 · Updated 9 months ago
- A framework for few-shot evaluation of autoregressive language models. ★12 · Jul 14, 2025 · Updated 8 months ago
- A modular graph-based Retrieval-Augmented Generation (RAG) system ★14 · Updated this week
- ★19 · Jul 24, 2025 · Updated 8 months ago
- A Framework For Intelligence Farming ★16 · Apr 3, 2025 · Updated 11 months ago
- text2sql with modern LLMs (duckdb-nsql, SQLCoder etc ...) ★18 · Apr 13, 2024 · Updated last year
- [NeurIPS 2023] MoVie: Visual Model-Based Policy Adaptation for View Generalization ★11 · Sep 22, 2023 · Updated 2 years ago
- WebLINX is a benchmark for building web navigation agents with conversational capabilities ★161 · Feb 11, 2025 · Updated last year
- Harness for running and evaluating AI agents against RL environments ★135 · Mar 6, 2026 · Updated 3 weeks ago
- ★18 · Jul 3, 2025 · Updated 8 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ★42 · Dec 29, 2025 · Updated 2 months ago
- OO for LLMs ★901 · Mar 21, 2026 · Updated last week
- Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs ★71 · Mar 21, 2025 · Updated last year
- BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions ★25 · Aug 8, 2024 · Updated last year
- An open-source framework for collaborative AI agents, enabling diverse, distributed agents to team up and tackle complex tasks through in… ★811 · Oct 4, 2025 · Updated 5 months ago
- AgentProg: Empowering Long-Horizon GUI Agents with Program-Guided Context Management ★25 · Mar 17, 2026 · Updated last week
- ★27 · Mar 10, 2026 · Updated 2 weeks ago
- Dev and Test Data of LogicGame benchmark ★19 · Mar 31, 2025 · Updated 11 months ago
- Implementation for OAgents: An Empirical Study of Building Effective Agents ★312 · Oct 13, 2025 · Updated 5 months ago
- C++ inference wrappers for running blazing fast embedding services on your favourite serverless like AWS Lambda. By Prithivi Da, PRs welc… ★23 · Mar 4, 2024 · Updated 2 years ago
- ★53 · Jul 31, 2025 · Updated 7 months ago
- [ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning". ★32 · Feb 26, 2026 · Updated last month
- ★11 · Oct 11, 2023 · Updated 2 years ago
- a python package for loadimg and converting images ★29 · Feb 18, 2026 · Updated last month
- ★27 · Mar 5, 2024 · Updated 2 years ago
- [ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios ★70 · Aug 5, 2025 · Updated 7 months ago
- Fast, simple tool to concatenate Git repositories into single files for LLM analysis ★19 · Mar 2, 2025 · Updated last year
- ★21 · May 24, 2024 · Updated last year
- Code repository for MMUGL: Multi-modal Graph Learning over UMLS Knowledge Graphs ★11 · Dec 7, 2023 · Updated 2 years ago
- You like pytorch? You like micrograd? You love tinygrad! ❤️ ★18 · Feb 14, 2025 · Updated last year
- The Elasticsearch adapter for Microsoft Kernel Memory. ★19 · Aug 1, 2024 · Updated last year
- This repository is an official PyTorch implementation of SD-GAN: Semantic Decomposition for Face Image Synthesis with Discrete Attribute. ★13 · Mar 18, 2024 · Updated 2 years ago
- ★12 · Feb 26, 2020 · Updated 6 years ago
- AgentTuning: Enabling Generalized Agent Abilities for LLMs ★1,484 · Oct 31, 2023 · Updated 2 years ago
- Ready-to-go containerized RAG service. Implemented with text-embedding-inference + Qdrant/LanceDB. ★76 · Dec 25, 2024 · Updated last year
- Integration of Clinical Embeddings with Neural ODEs ★12 · Jan 6, 2025 · Updated last year
- ★13 · Feb 17, 2025 · Updated last year
- ★13 · May 6, 2025 · Updated 10 months ago