zorazrw / odex
[EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation
☆42Updated 8 months ago
Related projects:
- The LM Contamination Index is a manually created database of contamination evidence for LMs.☆73Updated 5 months ago
- PyTorch code for the RetoMaton paper: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022)☆69Updated 2 years ago
- code for "Natural Language to Code Translation with Execution"☆39Updated last year
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆45Updated 6 months ago
- Language Models of Code are Few-Shot Commonsense Learners (EMNLP 2022)☆85Updated last year
- Code for paper "LEVER: Learning to Verify Language-to-Code Generation with Execution" (ICML'23)☆76Updated last year
- InstructCoder (formerly CodeInstruct) enables LLMs to edit code☆47Updated 6 months ago
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".☆81Updated 2 weeks ago
- [ICLR 2024] COLLIE: Systematic Construction of Constrained Text Generation Tasks☆51Updated last year
- [ICML'24] TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks☆20Updated 7 months ago
- Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond"☆58Updated 5 months ago
- Dataset and code for Findings of EMNLP'21 paper "CodeQA: A Question Answering Dataset for Source Code Comprehension".☆37Updated 8 months ago
- A unified benchmark for math reasoning☆87Updated last year
- Repository for the paper "Tools Are Instrumental for Language Agents in Complex Environments"☆32Updated 8 months ago
- Pseudo-code Instructions dataset☆23Updated 9 months ago
- [EACL'23] MCoNaLa: A Benchmark for Code Generation from Multiple Natural Languages☆21Updated last year
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval☆41Updated last month
- Code and data for paper "Context-faithful Prompting for Large Language Models".☆37Updated last year
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"☆56Updated last year
- A zero-shot neural semantic parser without using annotated parallel training data.☆8Updated 2 years ago
- Scalable Meta-Evaluation of LLMs as Evaluators☆39Updated 7 months ago
- Supporting code for ReCEval paper☆26Updated this week
- Source code and data for The Magic of IF: Investigating Causal Reasoning Abilities in Large Language Models of Code (Findings of ACL 2023…☆28Updated last year
- Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models"☆51Updated last year
- CodeUltraFeedback: aligning large language models to coding preferences☆62Updated 2 months ago