gouwsmeister / TextCleanser
Normalizes lexically ill-formed text to its most likely clean text, e.g. "c u thr 2nite!" -> "see you there tonight!".
☆63Oct 1, 2015Updated 10 years ago
Alternatives and similar repositories for TextCleanser
Users who are interested in TextCleanser are comparing it to the libraries listed below.
- Open-source implementation of the BilBOWA (Bilingual Bag-of-Words without Alignments) word embedding model.☆69Jul 28, 2021Updated 4 years ago
- A real-time document recommendation system for speech streams☆19Jul 11, 2018Updated 7 years ago
- pronunciation LEXicons for Any Low-resource Language☆21Jul 14, 2020Updated 5 years ago
- ☆56Aug 21, 2018Updated 7 years ago
- ☆21Apr 4, 2015Updated 10 years ago
- A set of tools for performing Labeled Latent Dirichlet Allocation on textual datasets, with an emphasis on Twitter profiles. Contains too…☆42Dec 3, 2021Updated 4 years ago
- Train bilingual embeddings as described in our NAACL 2015 workshop paper "Bilingual Word Representations with Monolingual Quality in Mind…☆78Jun 15, 2019Updated 6 years ago
- Simple CORPORA list crawler☆10Dec 2, 2016Updated 9 years ago
- Automatic Entity Recognition and Typing for Domain-Specific Corpora (KDD'15)☆99Jul 7, 2017Updated 8 years ago
- An interactive map of English words, where words with similar meaning appear closer together.☆22Feb 12, 2015Updated 11 years ago
- ASR transcription and SLU annotation web interface for call logs collected at UFAL-DSG.☆11Dec 8, 2014Updated 11 years ago
- Lazy python recipes.☆10Apr 17, 2021Updated 4 years ago
- A Multilingual and Multilevel Representation Learning Toolkit for NLP☆117Feb 14, 2018Updated 8 years ago
- Python port of Mikolov's word2phrase.c from the word2vec toolkit☆111Apr 1, 2020Updated 5 years ago
- This repo contains the exercises of the Next.ML 2015 presentation☆24Jan 17, 2015Updated 11 years ago
- Workshop on Noisy User-generated Text (W-NUT)☆30May 7, 2025Updated 9 months ago
- Instructions for deploying Kubeflow on EKS and minikube☆15Jun 25, 2021Updated 4 years ago
- For FFL Blog☆10Sep 24, 2015Updated 10 years ago
- Deep learning spelling patterns with a recurrent neural network☆12Jun 5, 2017Updated 8 years ago
- 📄Neural Sentential Paraphrase Generation to Augment Chatbot Training Dataset☆21Dec 7, 2022Updated 3 years ago
- A framework to compare entity linking systems.☆38Jul 29, 2018Updated 7 years ago
- NYT Risk Semantics Project☆12Mar 5, 2016Updated 9 years ago
- Document exploration tool☆12Sep 6, 2016Updated 9 years ago
- Implementation of an algorithm computing the nearest "N" neighbours to a vector, using a collection of hyperplane hashers.☆30Jul 17, 2015Updated 10 years ago
- Natural Language Inference Dataset Generation☆29Jul 21, 2016Updated 9 years ago
- Scraps of random machine learning code☆15Oct 19, 2016Updated 9 years ago
- Links parts of input text to Wikipedia articles☆16Sep 9, 2012Updated 13 years ago
- ☆15Apr 15, 2016Updated 9 years ago
- Named Entity Extraction on Twitter Stream using Apache Spark Streaming and Stanford CoreNLP☆15Oct 12, 2016Updated 9 years ago
- PolEval 2021 Task 1☆15Jun 28, 2022Updated 3 years ago
- A LaTeX cheat sheet with IPython commands and shortcuts☆10Mar 10, 2014Updated 11 years ago
- An attentional NMT model in Dynet☆26Dec 5, 2018Updated 7 years ago
- ☆12Dec 9, 2015Updated 10 years ago
- Master's thesis project in collaboration with Rasa, focusing on knowledge distillation from BERT into different very small networks and a…☆13Sep 30, 2022Updated 3 years ago
- probabilistic language corrector based on google ngrams☆21May 31, 2011Updated 14 years ago
- Deep-Learning Model Exploration and Development for NLP☆245Oct 13, 2023Updated 2 years ago
- Code for "Bayesian Online Changepoint Detection" (Adams and MacKay, 2007).☆21Jul 9, 2012Updated 13 years ago
- Sense Disambiguation of Connectives for PDTB-Style Discourse Parsing☆14Jan 13, 2017Updated 9 years ago
- IXA pipes Part of Speech tagger and Lemmatizer (http://ixa2.si.ehu.es/ixa-pipes)☆18Nov 18, 2022Updated 3 years ago