princeton-nlp / SWE-bench
[ICLR 2024] SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
☆1,766Updated 2 weeks ago
Related projects:
- A project structure aware autonomous software engineer aiming for autonomous program improvement. Resolved 30.67% tasks (pass@1) in SWE-b…☆2,637Updated this week
- ☆1,517Updated last week
- Official implementation for the paper: "Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering"☆3,393Updated last month
- [ICML'24] Magicoder: Empowering Code Generation with OSS-Instruct☆1,958Updated 4 months ago
- A framework for serving and evaluating LLM routers - save LLM costs without compromising quality!☆2,884Updated last month
- [ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling☆1,378Updated 2 months ago
- To speed up LLM inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, which achieves up to 20x com…☆4,435Updated 3 weeks ago
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.☆2,739Updated last week
- AIOS: LLM Agent Operating System☆3,219Updated this week
- OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments☆1,120Updated 3 weeks ago
- A unified evaluation framework for large language models☆2,375Updated last week
- A framework for prompt tuning using Intent-based Prompt Calibration☆2,038Updated this week
- Tools for merging pretrained large language models.☆4,501Updated this week
- A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks.☆1,459Updated last week
- Llama-3 agents that can browse the web by following instructions and talking to you☆1,317Updated 2 months ago
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-…☆2,817Updated 2 weeks ago
- Agentless🐱: an agentless approach to automatically solve software development problems☆663Updated 3 weeks ago
- SGLang is a fast serving framework for large language models and vision language models.☆5,121Updated this week
- [COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild☆3,900Updated 2 months ago
- A Native-PyTorch Library for LLM Fine-tuning☆3,942Updated this week
- Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models☆2,531Updated last month
- ☆1,705Updated this week
- Build resilient language agents as graphs.☆5,662Updated this week
- HippoRAG is a novel RAG framework inspired by human long-term memory that enables LLMs to continuously integrate knowledge across externa…☆1,237Updated last month
- Official Implementation of "Graph of Thoughts: Solving Elaborate Problems with Large Language Models"☆2,059Updated 3 months ago
- ☆2,652Updated this week
- Automated Design of Agentic Systems☆846Updated 3 weeks ago
- Python SDK for agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks like CrewAI, L…☆1,692Updated this week
- Reaching LLaMA2 Performance with 0.1M Dollars☆955Updated last month
- Harness LLMs with Multi-Agent Programming☆2,293Updated this week