microsoft / TraceCodegen
☆27 Updated last year
Alternatives and similar repositories for TraceCodegen:
Users who are interested in TraceCodegen are comparing it to the repositories listed below:
- Fault-aware neural code rankers ☆28 Updated 2 years ago
- Code for our paper Resources and Evaluations for Multi-Distribution Dense Information Retrieval ☆14 Updated last year
- ☆75 Updated last month
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ☆64 Updated last year
- A unified benchmark for math reasoning ☆87 Updated 2 years ago
- ☆36 Updated 10 months ago
- [AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following ☆79 Updated 7 months ago
- This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Ca… ☆60 Updated last year
- Generating and validating natural-language explanations for the brain. ☆51 Updated 3 weeks ago
- ☆28 Updated 3 years ago
- ☆44 Updated 10 months ago
- Language Models of Code are Few-Shot Commonsense Learners (EMNLP 2022) ☆86 Updated 2 years ago
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation ☆47 Updated last year
- Repository for analysis and experiments in the BigCode project. ☆118 Updated last year
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆45 Updated last year
- Code of ICLR paper: https://openreview.net/forum?id=-cqvvvb-NkI ☆94 Updated 2 years ago
- ☆16 Updated 6 months ago
- Transformers at any scale ☆41 Updated last year
- Repo for "Smart Word Suggestions" (SWS) task and benchmark ☆20 Updated last year
- Official implementation for "Law of the Weakest Link: Cross capabilities of Large Language Models" ☆42 Updated 6 months ago
- RL algorithm: Advantage induced policy alignment ☆65 Updated last year
- InstructCoder: Instruction Tuning Large Language Models for Code Editing | Oral ACL-2024 srw ☆59 Updated 6 months ago
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 Updated last year
- ☆45 Updated last year
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs ☆53 Updated last year
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ☆27 Updated last year
- Code for paper "LEVER: Learning to Verify Language-to-Code Generation with Execution" (ICML'23) ☆86 Updated last year
- InstructIR, a novel benchmark specifically designed to evaluate the instruction following ability in information retrieval models. Our foc… ☆31 Updated 10 months ago
- Llemma formal2formal (tactic prediction) theorem proving experiments ☆20 Updated last year
- Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044 ☆32 Updated 6 months ago