LanguageMachines / CLIN28_ST_spelling_correction
Scripts used to prepare and convert the Wikipedia documents that are part of the CLIN28 shared task on spelling correction.
☆10 Updated 6 years ago
Related projects
Alternatives and complementary repositories for CLIN28_ST_spelling_correction
- CoNLL 2018 Shared Task Team UDPipe-Future ☆39 Updated 4 years ago
- Decoding platform for machine translation research ☆54 Updated 5 years ago
- Numeric fused-head identification and resolution ☆33 Updated 5 years ago
- A simple neural truecaser written in PyTorch and AllenNLP. ☆32 Updated 5 months ago
- Labeled examples from wiki dumps in Python ☆68 Updated 8 years ago
- CONLL-U to Pandas DataFrame ☆31 Updated 7 years ago
- ☆10 Updated 7 years ago
- The Attract-Repel algorithm presented in (Mrkšić et al., TACL 2017), with accompanying resources. ☆64 Updated 7 years ago
- Doing things with embeddings ☆64 Updated 2 years ago
- Python library for vector space models ☆13 Updated 6 years ago
- Uncovering divergent linguistic information in word embeddings with lessons for intrinsic and extrinsic evaluation ☆63 Updated 6 years ago
- A way to do annotations for NER. TALEN: Tool for Annotation of Low-resource ENtities ☆112 Updated 2 years ago
- Sume is an implementation of the concept-based ILP model for summarization. ☆38 Updated 6 years ago
- COMBO is a jointly trained tagger, lemmatizer and dependency parser. ☆36 Updated last year
- Wrapper to use syntaxnet with pre-trained model ☆29 Updated 6 years ago
- Python interface for converting Penn Treebank trees to Stanford Dependencies and Universal Dependencies ☆68 Updated 5 years ago
- ☆43 Updated 9 years ago
- CrowdTruth framework for crowdsourcing ground truth for training & evaluation of AI systems ☆57 Updated 7 months ago
- C++ implementation of Generalised Brown clustering and Python scripts for feature generation ☆41 Updated 8 years ago
- Deep-learning based sentence auto-segmentation from unstructured text without punctuation ☆37 Updated 7 years ago
- Exploring Neural Text Simplification ☆73 Updated 6 years ago
- Code and data for: Low Resource Grammatical Error Correction Using Wikipedia Edits (WNUT 2018) ☆12 Updated 4 months ago
- LSTM Language Model with Subword Units Input Representations ☆43 Updated 3 years ago
- State-of-the-art Supervised Sentence Simplification System from ACL 2014 ☆47 Updated 6 years ago
- Fast supervised sentence boundary detection using the averaged perceptron ☆90 Updated 5 years ago
- Learning BPE embeddings by first learning a segmentation model and then training word2vec ☆19 Updated last year
- Alignment and annotation for comparable documents. ☆22 Updated 6 years ago
- A compound splitter based on the semantic regularities in the vector space of word embeddings. ☆16 Updated 7 years ago
- Context Encoders (ConEc) as a simple but powerful extension of the word2vec model for learning word embeddings ☆20 Updated 4 years ago
- Code and data for segmentation experiments. ☆22 Updated 9 years ago