FSoft-AI4Code / HyperAgent
Generalist Software Agents to Solve Software Engineering Tasks
☆176 Updated this week
Related projects
Alternatives and complementary repositories for HyperAgent
- EasyCuda: A Comprehensive Guide to Accelerated and Parallel Programming with CUDA and C/C++ ☆9 Updated 9 months ago
- Language Model for Mainframe Modernization ☆42 Updated 2 months ago
- Graph-based method for end-to-end code completion with context awareness on repository ☆44 Updated 2 months ago
- Benchmark for Repository-Level Code Generation, focus on Executability, Correctness from Test Cases and Usage of Contexts from Cross-file… ☆19 Updated 4 months ago
- [ACL 2024] Novel reranking method to select the best solutions for code generation ☆14 Updated 5 months ago
- Predicting Program Behavior with Dynamic Dependencies Learning ☆23 Updated 2 months ago
- Open-source Self-Instruction Tuning Code LLM ☆168 Updated last year
- ☆255 Updated last month
- [EMNLP 2023] The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation ☆84 Updated 2 months ago
- ⚒️ Tree-sitter custom toolkit for extracting function and class from raw source file ☆39 Updated 4 months ago
- 🚀 CodeMMLU Evaluator: A framework for evaluating LM models on CodeMMLU MCQs benchmark. ☆13 Updated 2 weeks ago
- ☆152 Updated 2 months ago
- Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents ☆109 Updated 4 months ago
- Enhancing AI Software Engineering with Repository-level Code Graph ☆92 Updated 2 months ago
- AWM: Agent Workflow Memory ☆203 Updated last month
- Code and data for "Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs" ☆447 Updated 7 months ago
- ☆65 Updated 2 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆62 Updated this week
- ☆102 Updated 2 months ago
- Harness used to benchmark aider against SWE Bench benchmarks ☆52 Updated 4 months ago
- Official code for the paper "ADaPT: As-Needed Decomposition and Planning with Language Models" ☆72 Updated 10 months ago
- ☆116 Updated 5 months ago
- A new benchmark for measuring LLM's capability to detect bugs in large codebase. ☆27 Updated 5 months ago
- Experimental Code for StructuredRAG: Structured Outputs in Retrieval-Augmented Generation ☆93 Updated this week
- CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/ ☆187 Updated this week
- r2e: turn any github repository into a programming agent environment ☆88 Updated last week
- Building open version of OpenAI o1 via reasoning traces (Groq, ollama, Anthropic, Gemini, OpenAI, Azure supported) Demo: https://hugging… ☆138 Updated 3 weeks ago
- CodeSage: Code Representation Learning At Scale (ICLR 2024) ☆82 Updated 2 weeks ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆48 Updated 4 months ago
- [NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898 ☆194 Updated 6 months ago