A redistributable subset of the ETH Py150 corpus [https://www.sri.inf.ethz.ch/py150], introduced in the ICML 2020 paper 'Learning and Evaluating Contextual Embedding of Source Code' [https://proceedings.icml.cc/static/paper_files/icml/2020/5401-Paper.pdf].
☆32Aug 11, 2020Updated 5 years ago
Alternatives and similar repositories for eth_py150_open
Users that are interested in eth_py150_open are comparing it to the libraries listed below.
- Repository of the paper 'CodeQueries: A Dataset of Semantic Queries over Code' published in ISEC 2024☆13Apr 21, 2024Updated last year
- CD4Py: Code De-Duplication for Python☆23Dec 13, 2020Updated 5 years ago
- Code Generation as a Dual Task of Code Summarization.☆30Jun 28, 2021Updated 4 years ago
- Utilities used by the Deep Program Understanding team☆104Jun 12, 2023Updated 2 years ago
- Mining tool and large-scale datasets of single statement bug fixes in Python☆19Nov 29, 2023Updated 2 years ago
- Empirical Study of Transformers for Source Code & A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Sourc…☆66Dec 3, 2021Updated 4 years ago
- A collection of datasets for machine learning for big code☆62Oct 8, 2021Updated 4 years ago
- ☆14May 31, 2021Updated 4 years ago
- Extracting Concise Bug-Fixing Patches from Human-Written Patches in Version Control Systems☆16Feb 21, 2023Updated 3 years ago
- Library for preprocessing java source code into Augmented ASTs, as per the paper Open Vocabulary Learning on Source Code with a Graph-Str…☆21Oct 22, 2018Updated 7 years ago
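- SSRAG Knowledge Base: an open-source, LLM-based AI knowledge base platform covering the full range of applications, from content management (CMS) and knowledge base Q&A (RAG) to visual AI workflow orchestration (Workflow) and agents (Agent)☆20Nov 3, 2025Updated 4 months ago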
- I/O utilities and datasets for algebraic-graphs☆14Aug 29, 2022Updated 3 years ago
- ☆15Nov 12, 2025Updated 4 months ago
- Sliding fast Fourier transform using Haskell streaming☆13Feb 19, 2019Updated 7 years ago
- Artifacts and other data for "Code Vectors: Understanding Programs Through Embedded Abstracted Symbolic Traces"☆22Jun 5, 2020Updated 5 years ago
- Haskell implementation of Glumpy☆12Jun 21, 2021Updated 4 years ago
- ☆28Jan 20, 2026Updated 2 months ago
- Code for "Generative Code Modeling with Graphs" (ICLR'19)☆172Dec 8, 2022Updated 3 years ago
- Haskell numerical ODE solvers☆14Aug 21, 2017Updated 8 years ago
- Code to reproduce the experiments in the paper Open Vocabulary Learning on Source Code with a Graph-Structured Cache☆21Apr 15, 2019Updated 6 years ago
- Data and Code for Reproducing "Global Relational Models of Source Code"☆85May 10, 2021Updated 4 years ago
- Implementation of the paper "Language-agnostic representation learning of source code from structure and context".☆173Apr 6, 2022Updated 3 years ago
- Fine-grained lattice primitives for Haskell☆18Mar 8, 2018Updated 8 years ago
- Honeyquest is a cyber security game that asks humans to distinguish neutral, risky, and deceptive payloads. Honeyquest presents participa…☆14Jan 8, 2026Updated 2 months ago
- HaVSA (Have-Saa) is a Haskell implementation of the Version Space Algebra Machine Learning technique described by Tessa Lau.☆12Jul 8, 2017Updated 8 years ago
- A Haskell implementation of distributed hash tables with two-phase commit.☆10Dec 9, 2016Updated 9 years ago
- LibSSH2 FFI bindings for Haskell☆26Apr 3, 2025Updated 11 months ago
- The code for three models introduced in "Dynamic Neural Program Embeddings for Program Repair" (ICLR 2018)☆33Jun 30, 2018Updated 7 years ago
- The Elements of Statistical Learning in Haskell☆13Nov 29, 2017Updated 8 years ago
- Official implementation of our work, A Transformer-based Approach for Source Code Summarization [ACL 2020].☆195May 28, 2022Updated 3 years ago
- "Fail Fast" process management for Haskell; inspired by Erlang☆16Jan 19, 2017Updated 9 years ago
- This repo is a benchmark for source code summarization in the C language☆26Mar 18, 2021Updated 5 years ago
- Source Code for Paper "Large Language Models are Few-Shot Summarizers: Multi-Intent Comment Generation via In-Context Learning"☆18Jun 9, 2023Updated 2 years ago
- ☆13Jul 17, 2023Updated 2 years ago
- Code for the paper "Embedding Java Classes with code2vec: Improvements from Variable Obfuscation" in MSR 2020☆32Mar 24, 2023Updated 3 years ago
- A fuzzy string set implementation in Haskell.☆11Mar 8, 2024Updated 2 years ago
- Filling and manipulation of histograms☆17Mar 10, 2025Updated last year
- Code and dataset for EMNLP 2022 Findings paper "Benchmarking Language Models for Code Syntax Understanding"☆16Oct 24, 2022Updated 3 years ago
- Framework model for static analysis of Android☆46Jul 13, 2016Updated 9 years ago