aymeric-roucher/GAIA

Beating the GAIA benchmark with Transformers Agents. 🚀

☆153

Alternatives and similar repositories for GAIA

Users who are interested in GAIA are comparing it to the libraries listed below.

aymeric-roucher / agent_reasoning_benchmark
🔧 Compare how Agent systems perform on several benchmarks. 📊🚀
☆102 · Aug 4, 2025 · Updated 11 months ago
amazon-science / graph-lm-ensemble
☆15 · Jun 2, 2025 · Updated last year
Ag2S1 / Sibyl-System
☆125 · Aug 13, 2024 · Updated last year
OpenGPTX / lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
☆13 · Jul 14, 2025 · Updated last year
SparkJiao / StructTest
☆19 · Jul 24, 2025 · Updated 11 months ago
zjunlp / xKG
Executable Knowledge Graphs for Replicating AI Research
☆16 · Jul 9, 2026 · Updated last week
diptimanr / text2sql
text2sql with modern LLMs (duckdb-nsql, SQLCoder etc ...)
☆18 · Apr 13, 2024 · Updated 2 years ago
McGill-NLP / weblinx
WebLINX is a benchmark for building web navigation agents with conversational capabilities
☆162 · Feb 11, 2025 · Updated last year
graphrag / ms-graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
☆16 · Updated this week
Yikai-Liao / efficient_bpe
An Efficient BPE Algorithm Faster than Hugging Face Tokenizer's Implementation
☆13 · Sep 9, 2024 · Updated last year
sail-sg / sailor2
🔱 Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs
☆73 · Mar 21, 2025 · Updated last year
huggingface / feel
☆15 · May 26, 2026 · Updated last month
zhangir-azerbayev / MetaMath
☆11 · Oct 11, 2023 · Updated 2 years ago
OpenBMB / IoA
An open-source framework for collaborative AI agents, enabling diverse, distributed agents to team up and tackle complex tasks through in…
☆826 · Oct 4, 2025 · Updated 9 months ago
Hypatiaalegra / LogicGame-Data
Dev and Test Data of LogicGame benchmark
☆19 · Mar 31, 2025 · Updated last year
distil-labs / distil-dlthub-models-from-traces
Demo repository showing how to create a model from production traces
☆16 · Mar 4, 2026 · Updated 4 months ago
danton267 / dash-streaming-GPT-app
☆15 · Jun 8, 2023 · Updated 3 years ago
PrithivirajDamodaran / blitz-embed
C++ inference wrappers for running blazing fast embedding services on your favourite serverless like AWS Lambda. By Prithivi Da, PRs welc…
☆24 · Mar 4, 2024 · Updated 2 years ago
sccn / ICLabel-Dataset
Dataset for training EEG IC classifiers.
☆14 · Aug 29, 2021 · Updated 4 years ago
sail-sg / SkyLadder
The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling
☆43 · Dec 29, 2025 · Updated 6 months ago
JoeYing1019 / UltraTool
[ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios
☆71 · Aug 5, 2025 · Updated 11 months ago
AlexCheema / tinygrad
You like pytorch? You like micrograd? You love tinygrad! ❤️
☆18 · Feb 14, 2025 · Updated last year
microsoft / data-in-use-protection-workshop
A complete workshop content with a series of tracks and hands-on labs on various techniques to protect data in use.
☆13 · Sep 10, 2020 · Updated 5 years ago
davidberenstein1957 / dataset-viber
Dataset Viber is your chill repo for data collection, annotation and vibe checks.
☆47 · Sep 5, 2024 · Updated last year
Timothyxxx / KVCachePapers
☆20 · May 24, 2024 · Updated 2 years ago
wangbx66 / differentially-private-q-learning
☆13 · May 16, 2019 · Updated 7 years ago
plaggy / rag-containers
Ready-to-go containerized RAG service. Implemented with text-embedding-inference + Qdrant/LanceDB.
☆76 · Dec 25, 2024 · Updated last year
freemindlabsinc / FreeMindLabs.KernelMemory.Elasticsearch
The Elasticsearch adapter for Microsoft Kernel Memory.
☆19 · Aug 1, 2024 · Updated last year
yuki-younai / Jailbreak-R1
Official implementation of Jailbreak-R1
☆15 · Jul 16, 2025 · Updated last year
THUDM / AgentTuning
AgentTuning: Enabling Generalized Agent Abilities for LLMs
☆1,500 · Oct 31, 2023 · Updated 2 years ago
ratschlab / mmugl
Code repository for MMUGL: Multi-modal Graph Learning over UMLS Knowledge Graphs
☆11 · Dec 7, 2023 · Updated 2 years ago
henryzhao5852 / DELFT
☆12 · Feb 26, 2020 · Updated 6 years ago
valence-labs / Tx-Evaluation
Benchmarking pipeline for evaluating Transcriptomic representations for perturbation tasks
☆14 · Nov 5, 2024 · Updated last year
E5Anant / UnisonAI
The UnisonAI Multi-Agent Framework built on custom workflow which allows ai agents to talk together and provides a flexible and extensibl…
☆23 · Feb 24, 2026 · Updated 4 months ago
yao8839836 / cp
☆13 · Feb 17, 2025 · Updated last year
ggsharma / microgradpp
A header-only C++ autograd engine and neural network library inspired by Karpathy's micrograd. Learn backpropagation in modern C++17.
☆16 · Jan 14, 2026 · Updated 6 months ago
BaroudLab / Griottes
Python program to generate NetworkX graphs from segmented images.
☆14 · Apr 14, 2023 · Updated 3 years ago
barahona-research-group / ICE-NODE
Integration of Clinical Embeddings with Neural ODEs
☆12 · Jan 6, 2025 · Updated last year
inclusionAI / AWorld
Search, understand, reproduce, and improve an idea with ease
☆1,212 · Updated this week