Aider-AI / refactor-benchmark
Aider's refactoring benchmark exercises based on popular Python repos
☆77 Updated 10 months ago
Alternatives and similar repositories for refactor-benchmark
Users who are interested in refactor-benchmark are comparing it to the libraries listed below.
- Harness used to benchmark aider against SWE Bench benchmarks ☆71 Updated last year
- Coding problems used in aider's polyglot benchmark ☆174 Updated 8 months ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆88 Updated 10 months ago
- Simple Graph Memory for AI applications ☆89 Updated 3 months ago
- proof-of-concept of Cursor's Instant Apply feature ☆83 Updated 11 months ago
- Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform ☆90 Updated this week
- Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more. ☆296 Updated this week
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆92 Updated 7 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆84 Updated 5 months ago
- ⚡️🧪 Fast LLM Tool Calling Experimentation, big and smol ☆150 Updated 11 months ago
- A framework for evaluating function calls made by LLMs ☆38 Updated last year
- ☆159 Updated last year
- A Python library to orchestrate LLMs in a neural network-inspired structure ☆50 Updated 10 months ago
- ☆74 Updated last year
- Anthropic Computer Use with Modal Sandboxes ☆37 Updated 10 months ago
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆102 Updated last year
- LLM plugin for clustering embeddings ☆81 Updated last year
- ☆161 Updated 3 weeks ago
- Replace expensive LLM calls with finetunes automatically ☆64 Updated last year
- Agent computer interface for AI software engineer. ☆104 Updated this week
- Synthetic Data for LLM Fine-Tuning ☆120 Updated last year
- LLM based agents with proactive interactions, long-term memory, external tool integration, and local deployment capabilities. ☆106 Updated last month
- auto fine tune of models with synthetic data ☆76 Updated last year
- ☆107 Updated 2 years ago
- Just a bunch of benchmark logs for different LLMs ☆120 Updated last year
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks. ☆37 Updated last year
- Helper functions to generate JSON schema dicts for OpenAI ChatGPT function calling requests. ☆81 Updated 5 months ago
- Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r… ☆68 Updated last year
- various experiments for scaling inference time compute with small reasoning models ☆17 Updated 7 months ago
- ☆23 Updated 6 months ago