thesofakillers / aideml
AIDE: the Machine Learning CodeGen Agent
☆21Updated 3 months ago
Alternatives and similar repositories for aideml:
Users who are interested in aideml are comparing it to the libraries listed below.
- Beating the GAIA benchmark with Transformers Agents. 🚀☆78Updated 3 months ago
- DSBench: How Far are Data Science Agents from Becoming Data Science Experts?☆41Updated 3 months ago
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners"☆98Updated 4 months ago
- Functional Benchmarks and the Reasoning Gap☆82Updated 3 months ago
- Simple examples using Argilla tools to build AI☆52Updated 2 months ago
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals".☆66Updated 7 months ago
- Evaluating LLMs with CommonGen-Lite☆88Updated 10 months ago
- Testing speed and accuracy of RAG with and without a Cross Encoder Reranker.☆48Updated last year
- Leverage your LangChain trace data for fine tuning☆40Updated 5 months ago
- ☆76Updated 7 months ago
- Google DeepMind's PromptBreeder for automated prompt engineering, implemented in LangChain Expression Language.☆91Updated 5 months ago
- ☆39Updated last month
- 🔔🧠Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks!☆51Updated this week
- ☆47Updated 2 months ago
- Codebase accompanying the Summary of a Haystack paper.☆74Updated 4 months ago
- ☆138Updated 6 months ago
- Dynamic Metadata based RAG Framework☆71Updated 6 months ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments☆69Updated 3 months ago
- AWM: Agent Workflow Memory☆233Updated 2 months ago
- 🔧 Compare how Agent systems perform on several benchmarks. 📊🚀☆54Updated 3 months ago
- Kura is a simple reproduction of the CLIO paper which uses language models to label user behaviour before clustering them based on embedd…☆61Updated this week
- Writing Blog Posts with Generative Feedback Loops!☆47Updated 10 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.☆156Updated 3 months ago
- Just a bunch of benchmark logs for different LLMs☆117Updated 6 months ago
- Routing on Random Forest (RoRF)☆100Updated 4 months ago
- Self-Reflection in LLM Agents: Effects on Problem-Solving Performance☆54Updated 2 months ago
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search☆69Updated last month
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker☆106Updated last month
- Official homepage for "Self-Harmonized Chain of Thought"☆89Updated last week
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models☆100Updated last month