Beating the GAIA benchmark with Transformers Agents.
★151 · Feb 19, 2025 · Updated last year
Alternatives and similar repositories for GAIA
Users that are interested in GAIA are comparing it to the libraries listed below.
- Compare how Agent systems perform on several benchmarks. ★102 · Aug 4, 2025 · Updated 10 months ago
- ★15 · Jun 2, 2025 · Updated last year
- ★126 · Aug 13, 2024 · Updated last year
- A framework for few-shot evaluation of autoregressive language models. ★13 · Jul 14, 2025 · Updated 11 months ago
- Developing a high-precision legal expert LLM application called Contract Advisor RAG. The project's goal is to create a Retrieval Augment… ★16 · Apr 10, 2024 · Updated 2 years ago
- A Framework For Intelligence Farming ★16 · Apr 3, 2025 · Updated last year
- [NeurIPS 2023] MoVie: Visual Model-Based Policy Adaptation for View Generalization ★11 · Sep 22, 2023 · Updated 2 years ago
- ★12 · Jul 25, 2023 · Updated 2 years ago
- WebLINX is a benchmark for building web navigation agents with conversational capabilities ★161 · Feb 11, 2025 · Updated last year
- ★17 · Jul 3, 2025 · Updated 11 months ago
- Demos for my Talk at .NET Day Switzerland ★13 · Aug 30, 2024 · Updated last year
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ★43 · Dec 29, 2025 · Updated 6 months ago
- ★15 · May 26, 2026 · Updated last month
- OO for LLMs ★910 · Updated this week
- A modular graph-based Retrieval-Augmented Generation (RAG) system ★16 · May 28, 2026 · Updated last month
- Harness for running and evaluating AI agents against RL environments ★200 · Jun 20, 2026 · Updated last week
- Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs ★73 · Mar 21, 2025 · Updated last year
- Text generation: automatically generate marketing copy from product parameters and images ★12 · Sep 17, 2021 · Updated 4 years ago
- ★28 · Aug 9, 2025 · Updated 10 months ago
- BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions ★26 · Aug 8, 2024 · Updated last year
- NVML (NVIDIA Management Library) wrapper for C#. ★15 · Mar 13, 2021 · Updated 5 years ago
- An open-source framework for collaborative AI agents, enabling diverse, distributed agents to team up and tackle complex tasks through in… ★823 · Oct 4, 2025 · Updated 8 months ago
- ★15 · Jun 8, 2023 · Updated 3 years ago
- ★29 · Mar 10, 2026 · Updated 3 months ago
- AgentProg: Empowering Long-Horizon GUI Agents with Program-Guided Context Management ★31 · Apr 10, 2026 · Updated 2 months ago
- Dev and Test Data of LogicGame benchmark ★19 · Mar 31, 2025 · Updated last year
- Implementation for OAgents: An Empirical Study of Building Effective Agents ★323 · Oct 13, 2025 · Updated 8 months ago
- C# SDK for the Hugging Face API -- inference, embeddings, and model hub ★57 · Updated this week
- ★11 · Oct 11, 2023 · Updated 2 years ago
- This repository presents the original implementation of LumberChunker: Long-Form Narrative Document Segmentation by André V. Duarte, João… ★108 · Feb 9, 2026 · Updated 4 months ago
- Dataset Viber is your chill repo for data collection, annotation and vibe checks. ★47 · Sep 5, 2024 · Updated last year
- A cross platform library with C#/Swift/Kotlin/Python bindings for running Phi inference ★26 · Jun 5, 2026 · Updated 3 weeks ago
- A Python package for loading and converting images ★30 · Feb 18, 2026 · Updated 4 months ago
- ★27 · Mar 5, 2024 · Updated 2 years ago
- [ICML 2025] Predictive Data Selection: The Data That Predicts Is the Data That Teaches ★66 · Mar 4, 2025 · Updated last year
- Fast, simple tool to concatenate Git repositories into single files for LLM analysis ★19 · Mar 2, 2025 · Updated last year
- Official Implementation of "Affordable AI Assistants with Knowledge Graph of Thoughts" ★230 · Mar 25, 2026 · Updated 3 months ago
- [ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios ★71 · Aug 5, 2025 · Updated 10 months ago
- ★21 · May 24, 2024 · Updated 2 years ago