kennethkenneth / AuthorExtractor
Source code for the Medium article "Extracting the author of news stories with DOM-based segmentation and BERT"
☆29 Updated 4 years ago
Related projects:
- Source code for the paper "Web2Text: Deep Structured Boilerplate Removal", full paper @ ECIR'18 ☆167 Updated 2 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 Updated 6 months ago
- Making BERT stretchy. Semantic Elasticsearch with Sentence Transformers ☆159 Updated 3 years ago
- Rank-based Unsupervised Keyword Extraction via Metavertex Aggregation ☆98 Updated 2 years ago
- Anonymization of legal cases (Fr) based on Flair embeddings ☆87 Updated 3 years ago
- Sentence transformers models for SpaCy ☆104 Updated last year
- A framework for applying topic models to a text corpus and then assigning topic labels to the generated topics. ☆36 Updated 4 months ago
- Exploring simple sentence similarity measurements using word embeddings ☆101 Updated last month
- This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with enti… ☆240 Updated last year
- No Teacher BART distillation experiment for NLI tasks ☆25 Updated 4 years ago
- In the wild extraction of entities that are found using Flair and displayed using a very elegant front-end. ☆69 Updated last year
- Entity Disambiguation as text extraction (ACL 2022) ☆173 Updated 2 years ago
- Twitter named entity extraction for WNUT 2016 http://noisy-text.github.io/2016/ner-shared-task.html ☆139 Updated 2 years ago
- A spaCy wrapper for DBpedia Spotlight ☆103 Updated last year
- Semantic search using Transformers and others ☆110 Updated 4 years ago
- ☆67 Updated 4 years ago
- ☆73 Updated 6 years ago
- An implementation of full named-entity evaluation metrics based on SemEval'13 Task 9 - not at tag/token level but considering all the t… ☆214 Updated 2 months ago
- Language Models for Zalando's flair library ☆62 Updated 4 years ago
- Dataset for the Emerging & Novel Entity NER task (WNUT '17) ☆110 Updated 2 years ago
- Google USE (Universal Sentence Encoder) for spaCy ☆176 Updated last year
- Boilerplate Removal using Deep Learning ☆80 Updated 2 years ago
- ☆91 Updated 8 years ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆149 Updated 3 months ago
- Fine-tune BERT to generate sentence embedding for cosine similarity ☆70 Updated 5 years ago
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata ☆151 Updated last year
- Few-shot Named Entity Recognition ☆121 Updated 2 years ago
- Named Entity Recognition based on dictionaries ☆242 Updated 5 years ago
- Self-Supervision for Named Entity Disambiguation at the Tail ☆212 Updated 2 years ago
- LASER multilingual sentence embeddings as a pip package ☆224 Updated last year