WecoAI / aideml
AIDE: AI-Driven Exploration in the Space of Code. A state-of-the-art machine learning engineering agent that automates AI R&D.
☆912Updated last month
Alternatives and similar repositories for aideml
Users who are interested in aideml are comparing it to the repositories listed below.
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering☆728Updated 2 weeks ago
- [ICLR 2025] Automated Design of Agentic Systems☆1,311Updated 4 months ago
- Agentless🐱: an agentless approach to automatically solve software development problems☆1,699Updated 5 months ago
- Autonomous Agents (LLMs) research papers. Updated Daily.☆819Updated this week
- Code and Data for Tau-Bench☆528Updated 4 months ago
- ☆1,024Updated 5 months ago
- Search-o1: Agentic Search-Enhanced Large Reasoning Models☆892Updated 2 weeks ago
- MLGym: A New Framework and Benchmark for Advancing AI Research Agents☆499Updated 2 weeks ago
- End-to-end Generative Optimization for AI Agents☆575Updated last week
- Automatic evals for LLMs☆399Updated this week
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]☆474Updated 3 weeks ago
- Verifiers for LLM Reinforcement Learning☆1,057Updated this week
- Synthetic data curation for post-training and structured data extraction☆1,364Updated this week
- ☆597Updated 4 months ago
- CodeScientist: An automated scientific discovery system for code-based experiments☆263Updated 2 months ago
- Evaluate your LLM's response with Prometheus and GPT4 💯☆948Updated last month
- Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding☆382Updated last year
- Code for Quiet-STaR☆732Updated 9 months ago
- AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and re…☆334Updated this week
- ⚖️ The First Coding Agent-as-a-Judge☆524Updated 2 weeks ago
- xLAM: A Family of Large Action Models to Empower AI Agent Systems☆448Updated last week
- AWM: Agent Workflow Memory☆271Updated 4 months ago
- TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle☆270Updated this week
- OpenResearcher, an advanced Scientific Research Assistant☆450Updated 7 months ago
- Code for Husky, an open-source language agent that solves complex, multi-step reasoning tasks. Husky v1 addresses numerical, tabular and …☆344Updated 11 months ago
- Code and implementations for the paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhiheng Xi e…☆474Updated 2 months ago
- Automatically evaluate your LLMs in Google Colab☆629Updated last year
- An agent benchmark with tasks in a simulated software company.☆370Updated 2 weeks ago
- Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhan…☆1,214Updated last year
- Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"☆530Updated 2 months ago