smola / language-dataset
Dataset for programming language identification.
☆24 Updated 2 years ago
Alternatives and similar repositories for language-dataset
Users who are interested in language-dataset are comparing it to the repositories listed below.
- Advanced similarity and duplicate source code at scale. ☆56 Updated 6 years ago
- a contextual search engine for software packages built on import2vec embeddings (https://www.code-compass.com) ☆38 Updated 6 years ago
- Extract statistics from Wikipedia Dump files. ☆26 Updated 4 years ago
- In-IDE Code Search ☆29 Updated 3 years ago
- Background materials for the article "Productivity Assessment of Neural Code Completion" ☆12 Updated 2 years ago
- 🐈 Code Annotation Tool ☆28 Updated 6 years ago
- Advanced similarity and duplicate source code proof of concept for our research efforts. ☆52 Updated 3 years ago
- Text similarity based on Word2Vec vectors. ☆10 Updated 8 years ago
- Neural Solr = Solr 9 + Mighty Inference + Node ☆18 Updated 3 years ago
- ☆31 Updated 2 years ago
- Interactive SQL analytics in your browser! ☆22 Updated 7 years ago
- A record and replay system for the browser (renamed Ringer) ☆30 Updated 8 years ago
- A library of examples showing how to use the Common Crawl corpus (2008-2012, ARC format) ☆65 Updated 9 years ago
- T5Patches is a set of tools for fast and targeted editing of generative language models built with T5X. ☆12 Updated last year
- A machine learning software for extracting information from scholarly documents ☆23 Updated 4 years ago
- Fixes Java syntax errors with LSTM neural networks! [proof-of-concept] ☆18 Updated 4 years ago
- Indri search implementation on top of Lucene search engine ☆35 Updated last year
- Neural Elastic Inference and Search ☆19 Updated 6 years ago
- source{d} MLonCode foundation - core algorithms and models. ☆14 Updated 6 years ago
- Experiments to help discussion on Wikipedia talk pages ☆68 Updated last week
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system. ☆46 Updated 8 years ago
- Launch NMT tasks on the cloud ☆13 Updated 2 years ago
- Deep learning spelling patterns with a recurrent neural network ☆12 Updated 8 years ago
- This module contains an implementation of the Nilsimsa locality-sensitive hashing algorithm in Java. ☆18 Updated 6 years ago
- An efficient and flexible token-based regular expression language and engine. ☆75 Updated 11 years ago
- MozoLM: A language model (LM) serving library ☆46 Updated last month
- The blog post about Kubeflow, including all materials ☆31 Updated 5 months ago
- Machine Learning for Information Retrieval ☆86 Updated 6 months ago
- Lightning Fast Language Prediction 🚀 ☆167 Updated 3 months ago
- Extract Data from Wikipedia Tables ☆34 Updated 8 years ago