NEUIR / INTERVENOR
Source code for paper: INTERVENOR: Prompt the Coding Ability of Large Language Models with the Interactive Chain of Repairing
☆26 Updated 4 months ago
Alternatives and similar repositories for INTERVENOR:
Users who are interested in INTERVENOR are comparing it to the repositories listed below:
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 Updated last year
- Astraios: Parameter-Efficient Instruction Tuning Code Language Models ☆57 Updated 11 months ago
- Code for ICLR 2024 paper "CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets" ☆52 Updated 9 months ago
- NaturalCodeBench (Findings of ACL 2024) ☆62 Updated 5 months ago
- ☆60 Updated 3 months ago
- InstructCoder: Instruction Tuning Large Language Models for Code Editing | Oral ACL-2024 srw ☆58 Updated 5 months ago
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation ☆47 Updated last year
- ☆45 Updated 10 months ago
- Training and Benchmarking LLMs for Code Preference. ☆33 Updated 4 months ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆137 Updated 5 months ago
- A distributed, extensible, secure solution for evaluating machine generated code with unit tests in multiple programming languages. ☆51 Updated 5 months ago
- ☆24 Updated 8 months ago
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval ☆91 Updated last month
- Code for Paper: Teaching Language Models to Critique via Reinforcement Learning ☆84 Updated last month
- ☆74 Updated last year
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback ☆64 Updated 6 months ago
- ☆124 Updated last year
- Code for Search-in-the-Chain: Towards Accurate, Credible and Traceable Large Language Models for Knowledge-intensive Tasks ☆55 Updated 11 months ago
- ☆68 Updated last year
- Large Language Models Meet NL2Code: A Survey ☆36 Updated 4 months ago
- A set of utilities for running few-shot prompting experiments on large-language models ☆118 Updated last year
- xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval ☆78 Updated 6 months ago
- ☆67 Updated last year
- [IJCAI 2024] FactCHD: Benchmarking Fact-Conflicting Hallucination Detection ☆86 Updated 10 months ago
- ToolBench, an evaluation suite for LLM tool manipulation capabilities. ☆150 Updated last year
- Repoformer: Selective Retrieval for Repository-Level Code Completion (ICML 2024) ☆54 Updated 8 months ago
- Advancing LLM with Diverse Coding Capabilities ☆69 Updated 8 months ago
- ☆49 Updated 8 months ago
- Scalable Meta-Evaluation of LLMs as Evaluators ☆42 Updated last year
- This is the code repo for our paper "Autonomously Knowledge Assimilation and Accommodation through Retrieval-Augmented Agents". ☆104 Updated 5 months ago