ArdentAILabs / DE-Bench
DE Bench: Can Agents Solve Real-World Data Engineering Problems? Built to test Ardent's AI Data Engineer
☆32 Updated last month
Alternatives and similar repositories for DE-Bench
Users who are interested in DE-Bench are comparing it to the libraries listed below.
- ☆176 Updated last week
- ☆92 Updated last year
- Globot is an agent that controls your browser using playwright and GPT-4V. ☆134 Updated 2 years ago
- Turn a Github Repo's contents into a big prompt for long-context models like Claude 3 Opus. ☆218 Updated 11 months ago
- Memory library for building stateful agents ☆279 Updated this week
- Train a language model to answer Slack messages as you. ☆258 Updated 10 months ago
- Python SDK for running evaluations on LLM generated responses ☆295 Updated 7 months ago
- Personal memory for AI ☆58 Updated last year
- Scrapybara Python SDK ☆73 Updated 4 months ago
- Work with web-enabled agents quickly, whether running a quick task or bootstrapping a full-stack product. ☆92 Updated last year
- Letting Claude Code develop his own MCP tools :) ☆123 Updated 10 months ago
- The Official Exa Python Package ☆189 Updated this week
- A toolkit for building computer use AI agents ☆182 Updated 7 months ago
- The Showdown Computer Control Evaluation Suite ☆93 Updated 9 months ago
- Fluid Database ☆113 Updated last year
- Code for "Chat with your data using OpenAI, Pinecone, Airbyte and Langchain" tutorial ☆37 Updated 2 years ago
- Routing on Random Forest (RoRF) ☆239 Updated last year
- A simple Python sandbox for helpful LLM data agents ☆304 Updated last year
- A curated list of amazingly awesome Modal applications, demos, and shiny things. Inspired by awesome-php. ☆169 Updated last month
- a minimalistic template for dynamic self-building AI agents ☆96 Updated last year
- ☆198 Updated last year
- A non-official CLI for Llama Index Parser ☆216 Updated last year
- An Open Source Playground with Agent Datasets and APIs for building and testing your own Autonomous Web Agents ☆200 Updated 2 years ago
- An MCP Server that's also an MCP Client. Useful for letting Claude develop and test MCPs without needing to reset the application. ☆124 Updated 10 months ago
- Get ready for that YC interview ☆39 Updated last year
- Inference-time scaling for LLMs-as-a-judge. ☆326 Updated 2 months ago
- Prompt engineering, automated. ☆352 Updated 9 months ago
- ☆103 Updated last month
- ☆137 Updated 2 years ago
- Put your data somewhere you can look at it ☆28 Updated 7 months ago