Code for generating the JuICe dataset.
☆37 · Oct 27, 2021 · Updated 4 years ago
Alternatives and similar repositories for JuICe
Users that are interested in JuICe are comparing it to the libraries listed below.
- A repository containing the Jupyter notebook code generation benchmark. ☆59 · Feb 9, 2022 · Updated 4 years ago
- JEMMA: An Extensible Java dataset for Many ML4Code Applications ☆19 · Dec 12, 2022 · Updated 3 years ago
- Django Dataset for Code Translation Tasks ☆31 · Feb 21, 2018 · Updated 8 years ago
- Repository of the paper 'CodeQueries: A Dataset of Semantic Queries over Code' published in ISEC 2024 ☆13 · Apr 21, 2024 · Updated last year
- Code for "CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning" (WWW 2019) ☆37 · Apr 21, 2020 · Updated 5 years ago
- StaQC: a systematically mined dataset containing around 148K Python and 120K SQL domain question-code pairs, as described in "StaQC: A Sy… ☆172 · Aug 28, 2021 · Updated 4 years ago
- Code for the ICLR 2019 paper "Learning to Represent Edits" ☆13 · Dec 8, 2022 · Updated 3 years ago
- Code and data for ACL20 paper "Incorporating External Knowledge through Pre-training for Natural Language to Code Generation" ☆97 · Sep 22, 2025 · Updated 6 months ago
- Lyra: A Benchmark for Turducken-Style Code Generation ☆15 · Apr 22, 2022 · Updated 3 years ago
- ☆15 · Oct 26, 2021 · Updated 4 years ago
- playing with gpt4 ☆14 · Mar 17, 2023 · Updated 3 years ago
- Proof Object Transformation, Preserving Imp Embeddings: the first proof compiler to be formally proven correct ☆17 · Aug 19, 2024 · Updated last year
- PyTorch library for synthesizing programs from natural language ☆18 · Jul 25, 2024 · Updated last year
- Codes for "NAST: A Non-Autoregressive Generator with Word Alignment for Unsupervised Text Style Transfer" (ACL 2021 findings) ☆15 · Nov 3, 2021 · Updated 4 years ago
- NAACL 2018 Tutorial: Modelling Natural Language, Programs, and their Intersection ☆101 · May 31, 2018 · Updated 7 years ago
- ☆54 · Aug 25, 2023 · Updated 2 years ago
- Preprocessed Python functions and docstrings for automated code documentation (code2doc) and automated code generation (doc2code) tasks. ☆211 · Jul 13, 2020 · Updated 5 years ago
- ☆19 · Dec 8, 2022 · Updated 3 years ago
- Restoring Execution Environments of Jupyter Notebooks ☆21 · May 29, 2023 · Updated 2 years ago
- EVIL (Exploiting software VIa natural Language) is an approach to automatically generate software exploits in assembly/Python language fr… ☆29 · Mar 8, 2022 · Updated 4 years ago
- The TacTok automated Coq proof script synthesis tool ☆17 · Jan 9, 2024 · Updated 2 years ago
- EMNLP 2021: Single-dataset Experts for Multi-dataset Question-Answering ☆68 · Nov 26, 2021 · Updated 4 years ago
- RACE is a multi-dimensional benchmark for code generation that focuses on Readability, mAintainability, Correctness, and Efficiency. ☆12 · Oct 12, 2024 · Updated last year
- "Semantic Evaluation for Text-to-SQL with Distilled Test Suite", EMNLP2020 ☆42 · Dec 1, 2020 · Updated 5 years ago
- Pragmatic models for generating and following instructions ☆13 · Dec 22, 2019 · Updated 6 years ago
- A corpus of Python programs annotated with contracts ☆25 · Oct 16, 2025 · Updated 5 months ago
- Code for EMNLP2021 paper "Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training" ☆20 · Nov 12, 2021 · Updated 4 years ago
- [ICML 2023] Data and code release for the paper "DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation". ☆267 · Oct 30, 2024 · Updated last year
- Learning to Model Editing Processes ☆26 · Aug 3, 2025 · Updated 7 months ago
- Sketch Driven Regular Expression Generation. ☆16 · Apr 26, 2023 · Updated 2 years ago
- ☆25 · May 21, 2018 · Updated 7 years ago
- Official code of our work, AVATAR: A Parallel Corpus for Java-Python Program Translation. ☆59 · Jul 31, 2024 · Updated last year
- ☆13 · Jun 14, 2016 · Updated 9 years ago
- A toolkit for pre-processing large source code corpora ☆45 · Sep 30, 2022 · Updated 3 years ago
- ☆22 · Jan 22, 2026 · Updated 2 months ago
- Contains the code for our ICSE 2020 paper: Big Code != Big Vocabulary: Open-Vocabulary Language Models for Source Code and for its earlie… ☆84 · Mar 24, 2023 · Updated 3 years ago
- General Java utilities (options parser, logging, experiment management, probability/statistics) ☆36 · Mar 9, 2020 · Updated 6 years ago
- Author implementation of Global Reasoning over Database Structures for Text-to-SQL Parsing ☆70 · Jul 25, 2024 · Updated last year
- This repository contains additional reference translations for the WMT'14 En-De (newstest2014) and WMT'19 En-Ru (newstest2019) test sets … ☆15 · Aug 31, 2021 · Updated 4 years ago