csebuetnlp/CoDesc

A large dataset of 4.2M Java source code snippets paired with their descriptions, collected from code search and code summarization studies.

☆55

Alternatives and similar repositories for CoDesc

Users who are interested in CoDesc are comparing it to the repositories listed below.

code-desc / CoDesc
A large dataset of 4.2M Java source code snippets paired with their descriptions, collected from code search and code summarization studies.
☆15 · Feb 24, 2022 · Updated 4 years ago
BASE-LAB-SJTU / CosBench
A dataset for natural language code search.
☆14 · Feb 13, 2020 · Updated 6 years ago
nokia / codesearch
Models and datasets for annotated code search.
☆37 · May 22, 2023 · Updated 3 years ago
chengjunyan1 / GN-Transformer-AST
Official repository for the paper "GN-Transformer: Fusing AST and Source Code information in Graph Networks".
☆17 · May 25, 2025 · Updated last year
rizwan09 / REDCODER
☆44 · Jun 24, 2025 · Updated last year
Jun-jie-Huang / CoCLR
Source Code for ACL-21 main conference paper "CoSQA: 20,000+ Web Queries for Code Search and Question Answering".
☆47 · Nov 2, 2022 · Updated 3 years ago
v587su / NLQF
NLQF is a tool to filter query-appropriate comments for building high-quality code search datasets.
☆19 · Feb 15, 2022 · Updated 4 years ago
microsoft / CodeXGLUE
CodeXGLUE
☆1,832 · Apr 23, 2024 · Updated 2 years ago
sriniiyer / concode
Mapping Language to Code in a Programmatic Context
☆80 · Jan 27, 2021 · Updated 5 years ago
zetang94 / ASE2023_kNM-LM
This is the official implementation for the paper 'Domain Adaptive Code Completion via Language Models and Decoupled Domain Databases'
☆14 · Oct 4, 2023 · Updated 2 years ago
wasiahmad / PLBART
Official code of our work, Unified Pre-training for Program Understanding and Generation [NAACL 2021].
☆186 · Mar 1, 2022 · Updated 4 years ago
ise-uiuc / NablaFuzz
Fuzzing Automatic Differentiation in Deep-Learning Libraries (ICSE'23)
☆27 · Mar 2, 2024 · Updated 2 years ago
saketdingliwal / Few-Shot-DST
Source code for our "D-REPTILE" paper at EACL 2021
☆13 · Jan 19, 2021 · Updated 5 years ago
ddemszky / conversational-uptake
Code and data for the paper "Measuring Conversational Uptake: A Case-Study on Student-Teacher Interactions"
☆25 · Apr 24, 2025 · Updated last year
facebookresearch / Neural-Code-Search-Evaluation-Dataset
Evaluation dataset consisting of natural language query and code snippet pairs
☆125 · May 3, 2024 · Updated 2 years ago
LIANGQINGYUAN / Lyra
Lyra: A Benchmark for Turducken-Style Code Generation
☆15 · Apr 22, 2022 · Updated 4 years ago
sled-group / MindCraft
Official code for our EMNLP2021 Outstanding Paper MindCraft: Theory of Mind Modeling for Situated Dialogue in Collaborative Tasks
☆21 · May 18, 2023 · Updated 3 years ago
justinphan3110 / CoTexT
Code implementation for CoTexT: Multi-task Learning with Code-Text Transformer
☆36 · Sep 14, 2021 · Updated 4 years ago
microsoft / Search4Code
Web queries dataset for code search
☆32 · Jun 3, 2023 · Updated 3 years ago
csebuetnlp / xl-sum
This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 4…
☆277 · Mar 26, 2024 · Updated 2 years ago
waingram / code-embeddings
A Comparative Study of Various Code Embeddings in Software Semantic Matching
☆17 · Dec 8, 2022 · Updated 3 years ago
jianguda / mrncs
☆23 · Mar 25, 2023 · Updated 3 years ago
rajarshihaldar / codetextmatch
☆19 · Dec 8, 2022 · Updated 3 years ago
microsoft / ReACC
Source code for the paper "ReACC: A Retrieval-Augmented Code Completion Framework"
☆67 · Apr 18, 2022 · Updated 4 years ago
FalconLK / facoy
FaCoY Code-to-Code Search Engine
☆34 · Jan 18, 2019 · Updated 7 years ago
adf1178 / PT4Code
☆49 · Jul 24, 2022 · Updated 4 years ago
mooselab / SDLog
Sensitivity detector in software logs
☆17 · Apr 9, 2026 · Updated 3 months ago
SparkDevF19 / ai-program-translation
Program Translator AI built on PyTorch
☆15 · Dec 19, 2019 · Updated 6 years ago
EdinburghNLP / code-docstring-corpus
Preprocessed Python functions and docstrings for automated code documentation (code2doc) and automated code generation (doc2code) tasks.
☆211 · Jul 13, 2020 · Updated 6 years ago
LittleYUYU / StackOverflow-Question-Code-Dataset
StaQC: a systematically mined dataset containing around 148K Python and 120K SQL domain question-code pairs, as described in "StaQC: A Sy…
☆170 · Aug 28, 2021 · Updated 4 years ago
FSoft-AI4Code / DocChecker
DocChecker: Bootstrapping Code-Text Pretrained Language Model to Detect Inconsistency Between Code and Comment
☆15 · Jan 23, 2024 · Updated 2 years ago
hrk623 / record-hunter
A Lightning component bundle for searching records on the Salesforce platform.
☆15 · Jul 9, 2019 · Updated 7 years ago
cqu-isse / CARLCS-CNN
☆11 · Jul 25, 2020 · Updated 6 years ago
microsoft / JigsawDataset
Jigsaw Dataset: Natural language to Python Pandas code
☆55 · Dec 18, 2022 · Updated 3 years ago
sujunyan / tex-gallery
☆14 · Dec 4, 2023 · Updated 2 years ago
danhper / bigcode-tools
A set of tools to help with working on "Big Code"
☆42 · Apr 28, 2022 · Updated 4 years ago
LittleYUYU / CoaCor
Code for "CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning" (WWW 2019)
☆37 · Apr 21, 2020 · Updated 6 years ago
ypapanik / t5-for-code-generation
Semantic Parsing with text-to-text Transformers
☆21 · Jan 19, 2021 · Updated 5 years ago
nicbet / infozilla
The infoZilla unstructured software engineering data mining tool. It can find and extract source code regions, patches, stack traces, enu…
☆15 · Jan 24, 2019 · Updated 7 years ago