flexudy-pipe / sentence-doctor
Many Natural Language Processing tasks rely on sentence boundary detection (SBD). Although excellent libraries like spaCy provide state-of-the-art SBD, they often depend on text extractors (e.g., PDF text extractors or OCR). The quality of these extractors greatly influences the quality of SBD libraries and, as a consequence, the performance of downst…
☆61 Updated 4 years ago
Related projects
Alternatives and complementary repositories for sentence-doctor
- Zero-shot Transfer Learning from English to Arabic ☆29 Updated 2 years ago
- On Generating Extended Summaries of Long Documents ☆77 Updated 3 years ago
- SUPERT: Unsupervised multi-document summarization evaluation & generation ☆91 Updated last year
- Use BERT to Fill in the Blanks ☆82 Updated 2 years ago
- A simple neural truecaser written in PyTorch and AllenNLP. ☆32 Updated 5 months ago
- Summary Explorer is a tool to visually explore the state-of-the-art in text summarization. ☆43 Updated 6 months ago
- ☆73 Updated 3 years ago
- A repository for our AAAI-2020 Cross-lingual-NER paper. Code will be updated shortly. ☆46 Updated last year
- A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations ☆54 Updated 2 years ago
- ☆67 Updated 4 years ago
- A Word Sense Disambiguation system integrating implicit and explicit external knowledge. ☆67 Updated 3 years ago
- An embeddable annotation tool for end-to-end cross-document coreference ☆41 Updated last year
- Dynamic ensemble decoding with transformer-based models ☆29 Updated last year
- QED: A Framework and Dataset for Explanations in Question Answering ☆115 Updated 3 years ago
- Code and models used in "MUSS Multilingual Unsupervised Sentence Simplification by Mining Paraphrases". ☆97 Updated last year
- A lightweight but powerful library to build token indices for NLP tasks, compatible with major Deep Learning frameworks like PyTorch and … ☆49 Updated 4 years ago
- Dual Encoders for State-of-the-art Natural Language Processing. ☆61 Updated 2 years ago
- Dataset of ML and NLP papers ☆35 Updated 2 years ago
- Fine-tune transformers with pytorch-lightning ☆44 Updated 2 years ago
- ☆32 Updated 3 years ago
- Coreference resolution with different higher-order inference methods; implemented in PyTorch. ☆35 Updated last year
- Lexical Simplification with Pretrained Encoders ☆69 Updated 3 years ago
- Examples for aligning, padding and batching sequence labeling data (NER) for use with pre-trained transformer models ☆65 Updated last year
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP. ☆85 Updated last month
- This repository contains the code for "BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Representations". ☆63 Updated 4 years ago
- Topic Inference with Zeroshot models ☆61 Updated last year
- Massively Multilingual Transfer for NER ☆85 Updated 3 years ago
- This repository contains datasets and code for the paper "HINT3: Raising the bar for Intent Detection in the Wild" accepted at EMNLP-2020… ☆32 Updated 3 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆76 Updated 4 months ago
- The NewSHead dataset is a multi-doc headline dataset used in NHNet for training a headline summarization model. ☆36 Updated 2 years ago