microsoft / JigsawDataset

Jigsaw Dataset: Natural language to Python Pandas code

☆53

Alternatives and similar repositories for JigsawDataset

Users who are interested in JigsawDataset are comparing it to the repositories listed below.

dpfried / incoder
Generative model for code infilling and synthesis
☆304 Updated last year
nyu-mll / ILF-for-code-generation
☆78 Updated 4 months ago
neulab / code-bert-score
CodeBERTScore: an automatic metric for code generation, based on BERTScore
☆196 Updated last year
csebuetnlp / CoDesc
A large dataset of 4.2M Java source code samples with parallel natural-language descriptions, drawn from code search and code summarization studies.
☆53 Updated 3 years ago
agemagician / CodeTrans
Pretrained Language Models for Source code
☆255 Updated 4 years ago
terryyz / ice-score
[EACL 2024] ICE-Score: Instructing Large Language Models to Evaluate Code
☆76 Updated last year
EleutherAI / github-downloader
Script for downloading GitHub.
☆96 Updated last year
zorazrw / odex
[EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation
☆48 Updated last year
ntunlp / xCodeEval
xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval
☆86 Updated 10 months ago
facebookresearch / coder_reviewer_reranking
Official code release for the paper Coder Reviewer Reranking for Code Generation.
☆45 Updated 2 years ago
shuyanzhou / docprompting
Data and code for "DocPrompting: Generating Code by Retrieving the Docs" @ICLR 2023
☆248 Updated last year
FSoft-AI4Code / RepoHyper
[FORGE 2025] Graph-based method for end-to-end code completion with context awareness on repository
☆64 Updated 11 months ago
overwindows / SemanticCodeSearch
Semantic Code Search
☆35 Updated 2 years ago
openai / human-eval-infilling
Code for the paper "Efficient Training of Language Models to Fill in the Middle"
☆183 Updated 2 years ago
FSoft-AI4Code / TheVault
[EMNLP 2023] The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation
☆97 Updated 11 months ago
CarperAI / Code-Pile
This repository contains all the code for collecting large scale amounts of code from GitHub.
☆110 Updated 2 years ago
facebookresearch / cruxeval
CRUXEval: Code Reasoning, Understanding, and Execution Evaluation
☆151 Updated 9 months ago
reddy-lab-code-research / StructCoder
Code for "StructCoder: Structure-Aware Transformer for Code Generation"
☆76 Updated last year
gangiswag / cornstack
☆35 Updated last month
michiyasunaga / BIFI
[ICML 2021] Break-It-Fix-It: Unsupervised Learning for Program Repair
☆117 Updated 2 years ago
google-research / babelcode
☆52 Updated 5 months ago
code4me-me / code4me
Two Automatic code completion IDE extensions for @JetBrains and @microsoft/vscode based on Transformer-based large language models for so…
☆55 Updated last year
shrivastavadisha / repo_level_prompt_generation
☆124 Updated 2 years ago
reddy-lab-code-research / PPOCoder
Code for the TMLR 2023 paper "PPOCoder: Execution-based Code Generation using Deep Reinforcement Learning"
☆114 Updated last year
google-research / plur
PLUR (Programming-Language Understanding and Repair) is a collection of source code datasets suitable for graph-based machine learning. W…
☆87 Updated 3 years ago
Zyq-scut / RLTF
Accepted by Transactions on Machine Learning Research (TMLR)
☆130 Updated 10 months ago
kandluis / code-gen
Code Generator
☆23 Updated 2 years ago
madaan / pie-perf
Training language models to make programs faster
☆91 Updated last year
gonglinyuan / ast_t5
☆67 Updated last year
justinphan3110 / CoTexT
Code implementation for CoTexT: Multi-task Learning with Code-Text Transformer
☆36 Updated 3 years ago