Aider-AI / refactor-benchmark
Aider's refactoring benchmark exercises, based on popular Python repos
☆64 Updated 6 months ago
Alternatives and similar repositories for refactor-benchmark:
Users who are interested in refactor-benchmark are comparing it to the repositories listed below:
- Coding problems used in aider's polyglot benchmark ☆93 Updated 3 months ago
- Harness used to benchmark aider against SWE Bench benchmarks ☆69 Updated 9 months ago
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks. ☆33 Updated last year
- ☆38 Updated last year
- ☆73 Updated last year
- Simple Graph Memory for AI applications ☆84 Updated 8 months ago
- proof-of-concept of Cursor's Instant Apply feature ☆76 Updated 7 months ago
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆33 Updated this week
- Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r… ☆60 Updated 8 months ago
- Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more. ☆145 Updated this week
- Code interpreter support for o1 ☆32 Updated 6 months ago
- LLM finetuning ☆42 Updated last year
- ☆14 Updated 2 months ago
- A framework for evaluating function calls made by LLMs ☆37 Updated 8 months ago
- Agent computer interface for AI software engineer. ☆58 Updated this week
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91 Updated 2 months ago
- Writing Blog Posts with Generative Feedback Loops! ☆47 Updated last year
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆77 Updated 6 months ago
- One Line To Build Zero-Data Classifiers in Minutes ☆36 Updated 6 months ago
- ☆32 Updated last year
- ☆22 Updated 9 months ago
- Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform ☆86 Updated 3 weeks ago
- A Ruby on Rails style framework for the DSPy (Demonstrate, Search, Predict) project for Language Models like GPT, BERT, and LLama. ☆122 Updated 5 months ago
- Just a bunch of benchmark logs for different LLMs ☆119 Updated 8 months ago
- ☆154 Updated 7 months ago
- ToK aka Tree of Knowledge for Large Language Models LLM. It's a novel dataset that inspires knowledge symbolic correlation in simple inpu… ☆51 Updated last year
- Track the progress of LLM context utilisation ☆54 Updated 8 months ago
- A Python library to orchestrate LLMs in a neural network-inspired structure ☆46 Updated 6 months ago
- ☆48 Updated last year
- auto fine tune of models with synthetic data ☆75 Updated last year