thesofakillers / aideml
AIDE: the Machine Learning CodeGen Agent
★24 Updated last year
Alternatives and similar repositories for aideml
Users that are interested in aideml are comparing it to the libraries listed below
- Compare how Agent systems perform on several benchmarks. ★102 Updated 3 months ago
- ★40 Updated 11 months ago
- ★80 Updated last week
- Codebase accompanying the Summary of a Haystack paper. ★79 Updated last year
- Verifiers for LLM Reinforcement Learning ★79 Updated 7 months ago
- Official Repo for CRMArena and CRMArena-Pro ★125 Updated last week
- Repository for “PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers”, NAACL24 ★150 Updated last year
- LLM reads a paper and produces a working prototype ★57 Updated 7 months ago
- The first dense retrieval model that can be prompted like an LM ★89 Updated 6 months ago
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ★116 Updated 3 weeks ago
- Model, Code & Data for the EMNLP'23 paper "Making Large Language Models Better Data Creators" ★134 Updated 2 years ago
- ★61 Updated 11 months ago
- ★51 Updated last year
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker ★123 Updated 2 weeks ago
- Mixing Language Models with Self-Verification and Meta-Verification ★109 Updated 11 months ago
- Source code of the paper: RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering [F… ★69 Updated last year
- ★28 Updated 7 months ago
- Testing speed and accuracy of RAG with, and without Cross Encoder Reranker. ★50 Updated last year
- Beating the GAIA benchmark with Transformers Agents. ★138 Updated 9 months ago
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models ★99 Updated 2 years ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ★92 Updated last month
- Self-Reflection in LLM Agents: Effects on Problem-Solving Performance ★91 Updated 11 months ago
- EcoAssistant: using LLM assistant more affordably and accurately ★133 Updated last year
- Automating enterprise workflows with multimodal agents ★112 Updated last year
- Submodular optimization for context engineering: query fan-out, text selection, passage reranking ★77 Updated 4 months ago
- [ICLR 2025] DSBench: How Far are Data Science Agents from Becoming Data Science Experts? ★81 Updated 3 months ago
- Analysis code for Neurips 2025 paper "SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks" ★55 Updated 3 months ago
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" ★76 Updated 11 months ago
- ★146 Updated last year
- Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate. ★115 Updated 3 months ago