microsoft / JigsawDataset
Jigsaw Dataset: Natural language to Python Pandas code
☆53 Updated last year
Related projects
Alternatives and complementary repositories for JigsawDataset
- ☆75 Updated last year
- Semantic Code Search ☆34 Updated last year
- A large dataset of 4.2m Java source code and parallel data of their description from code search, and code summarization studies. ☆52 Updated 2 years ago
- Official code release for the paper Coder Reviewer Reranking for Code Generation. ☆42 Updated last year
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation ☆44 Updated 11 months ago
- RepoQA: Evaluating Long-Context Code Understanding ☆100 Updated 3 weeks ago
- This repository contains all the code for collecting large scale amounts of code from GitHub. ☆105 Updated last year
- [ICML 2023] "Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation", Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, … ☆37 Updated last year
- [EACL 2024] ICE-Score: Instructing Large Language Models to Evaluate Code ☆69 Updated 5 months ago
- Reasoning by Communicating with Agents ☆21 Updated last month
- Script for downloading GitHub. ☆88 Updated 4 months ago
- Code for paper "LEVER: Learning to Verify Language-to-Code Generation with Execution" (ICML'23) ☆79 Updated last year
- Two Automatic code completion IDE extensions for @JetBrains and @microsoft/vscode based on Transformer-based large language models for so… ☆55 Updated 8 months ago
- ☆54 Updated 6 months ago
- A large dataset of 4.2m Java source code and parallel data of their description from code search, and code summarization studies. ☆13 Updated 2 years ago
- This repo contains data and code for the paper "Reasoning over Public and Private Data in Retrieval-Based Systems." ☆46 Updated 4 months ago
- Models and datasets for annotated code search. ☆33 Updated last year
- Code Generator ☆23 Updated last year
- Graph-based method for end-to-end code completion with context awareness on repository ☆47 Updated 2 months ago
- Training language models to make programs faster ☆83 Updated 7 months ago
- ☆29 Updated last year
- PyTorch code for the RetoMaton paper: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022) ☆71 Updated 2 years ago
- This is the repository for the paper Static Prediction of Runtime Errors by Learning to Execute Programs with External Resource Descripti… ☆25 Updated 2 years ago
- VarCLR: Variable Semantic Representation Pre-training via Contrastive Learning ☆38 Updated last year
- ☆21 Updated 3 weeks ago
- PLUR (Programming-Language Understanding and Repair) is a collection of source code datasets suitable for graph-based machine learning. W… ☆87 Updated 2 years ago
- Learning to Program with Natural Language ☆5 Updated 11 months ago
- Official code for the paper "CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules" ☆36 Updated last year
- Generative model for code infilling and synthesis ☆296 Updated last year
- CodeMind is a generic framework for evaluating inductive code reasoning of LLMs. It is equipped with a static analysis component that ena… ☆33 Updated 3 months ago