thesofakillers / aideml
AIDE: the Machine Learning CodeGen Agent
☆24 Updated 6 months ago
Alternatives and similar repositories for aideml
Users interested in aideml are comparing it to the libraries listed below:
- ☆41 Updated 3 months ago
- DSBench: How Far are Data Science Agents from Becoming Data Science Experts? ☆49 Updated last month
- ☆24 Updated 6 months ago
- LLM reads a paper and produces a working prototype ☆51 Updated 3 weeks ago
- 🔧 Compare how Agent systems perform on several benchmarks. 📊🚀 ☆94 Updated 5 months ago
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals". ☆65 Updated 9 months ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆77 Updated 6 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆102 Updated 3 months ago
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models ☆96 Updated last year
- ☆55 Updated 4 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆77 Updated 6 months ago
- Scalable Meta-Evaluation of LLMs as Evaluators ☆42 Updated last year
- Official code for the paper "ADaPT: As-Needed Decomposition and Planning with Language Models" ☆75 Updated last year
- Self-Reflection in LLM Agents: Effects on Problem-Solving Performance ☆65 Updated 4 months ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆18 Updated this week
- Train your own SOTA deductive reasoning model ☆81 Updated last month
- ☆62 Updated last week
- ☆73 Updated 2 months ago
- Google DeepMind's PromptBreeder for automated prompt engineering, implemented in LangChain Expression Language. ☆101 Updated 8 months ago
- Code for the paper: CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models ☆17 Updated last week
- Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning" ☆53 Updated 6 months ago
- ☆109 Updated 2 weeks ago
- Writing Blog Posts with Generative Feedback Loops! ☆47 Updated last year
- ☆33 Updated last year
- Functional Benchmarks and the Reasoning Gap ☆84 Updated 6 months ago
- ☆50 Updated 4 months ago
- ☆48 Updated 5 months ago
- ☆75 Updated this week
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆41 Updated last year
- Testing speed and accuracy of RAG with and without a Cross Encoder Reranker. ☆48 Updated last year