github / CodeSearchNet
Datasets, tools, and benchmarks for representation learning of code.
☆2,299 · Updated 3 years ago
Alternatives and similar repositories for CodeSearchNet
Users who are interested in CodeSearchNet are comparing it to the repositories listed below.
- CodeXGLUE ☆1,670 · Updated last year
- CodeBERT ☆2,503 · Updated last year
- TensorFlow code for the neural network presented in the paper: "code2vec: Learning Distributed Representations of Code" ☆1,132 · Updated last year
- DeepCS: Deep Code Search ☆279 · Updated 2 years ago
- Code for the model presented in the paper: "code2seq: Generating Sequences from Structured Representations of Code" ☆557 · Updated 9 months ago
- Conditional Transformer Language Model for Controllable Generation ☆1,884 · Updated 2 weeks ago
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators ☆2,353 · Updated last year
- Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the… ☆2,032 · Updated 9 months ago
- Papers & presentation materials from Hugging Face's internal science day ☆2,047 · Updated 4 years ago
- Website for "A Survey of Machine Learning for Big Code and Naturalness" ☆291 · Updated 3 months ago
- 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production ☆9,670 · Updated last month
- Public release of the TransCoder research project https://arxiv.org/pdf/2006.03511.pdf ☆1,709 · Updated 3 years ago
- Dataset of GPT-2 outputs for research in detection, biases, and more ☆1,981 · Updated last year
- A library for mining of path-based representations of code (and more) ☆287 · Updated last year
- Reference implementation of code generation projects from Facebook AI Research. General toolkit to apply machine learning to code, from d… ☆749 · Updated last year
- This dataset code generates mathematical question and answer pairs, from a range of question types at roughly school-level difficulty. ☆1,874 · Updated 4 months ago
- Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages ☆7,461 · Updated 2 weeks ago
- Code For Medium Article: "How To Create Natural Language Semantic Search for Arbitrary Objects With Deep Learning" ☆488 · Updated 2 years ago
- PyTorch original implementation of Cross-lingual Language Model Pretraining. ☆2,906 · Updated 2 years ago
- Phrase-Based & Neural Unsupervised Machine Translation ☆1,504 · Updated 3 years ago
- Code and model for the paper "Improving Language Understanding by Generative Pre-Training" ☆2,208 · Updated 6 years ago
- A natural language modeling framework based on PyTorch ☆6,327 · Updated 2 years ago
- Trax — Deep Learning with Clear Code and Speed ☆8,204 · Updated last month
- The Abstraction and Reasoning Corpus ☆4,402 · Updated last month
- NaturalCC: An Open-Source Toolkit for Code Intelligence ☆292 · Updated last month
- Shared repository for open-sourced projects from the Google AI Language team. ☆1,669 · Updated 2 weeks ago
- The Natural Language Decathlon: A Multitask Challenge for NLP ☆2,349 · Updated 2 weeks ago
- Home of CodeT5: Open Code LLMs for Code Understanding and Generation ☆2,991 · Updated last year
- This repository is to support contributions for tools for the Project CodeNet dataset hosted in DAX ☆1,587 · Updated last week
- Large datasets for conversational AI ☆1,336 · Updated 5 years ago