divkakwani / awesome-newspapers
A Directory of Online Newspaper Sources for 70+ Languages
☆33 · Updated 3 years ago
Alternatives and similar repositories for awesome-newspapers:
Users who are interested in awesome-newspapers compare it to the repositories listed below.
- Filter and format a newline-delimited JSON stream of Wikibase entities ☆97 · Updated 4 months ago
- A spaCy wrapper for DBpedia Spotlight ☆108 · Updated last year
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata ☆157 · Updated 2 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 · Updated 5 years ago
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages. ☆31 · Updated last year
- Information extraction from English and German texts based on predicate logic ☆135 · Updated last year
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata ☆93 · Updated last year
- In the wild extraction of entities that are found using Flair and displayed using a very elegant front-end. ☆71 · Updated 2 years ago
- Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine ☆242 · Updated last year
- Python tools for interacting with Wikidata ☆151 · Updated last year
- A machine learning tool for fishing entities ☆258 · Updated this week
- CrowdTruth framework for crowdsourcing ground truth for training & evaluation of AI systems ☆58 · Updated 10 months ago
- 🧪 Cutting-edge experimental spaCy components and features ☆96 · Updated 9 months ago
- A minimal, pure Python library to interface with CoNLL-U format files. ☆148 · Updated last year
- A Named-Entity Recogniser based on Grobid. ☆50 · Updated 5 months ago
- 📂 Additional lookup tables and data resources for spaCy ☆101 · Updated 3 weeks ago
- Code accompanying the submission "Structural Text Segmentation of Legal Documents" by Aumiller et al. ☆96 · Updated last year
- CorrectLy - Open Source Spelling & Grammar correction ☆40 · Updated 2 years ago
- Article extraction benchmark: dataset and evaluation scripts ☆301 · Updated 9 months ago
- Resources to go with the Indic NLP Library ☆73 · Updated 2 years ago
- Legal document classification with EuroVoc descriptors on 22 languages. ☆25 · Updated last year
- Polyglot is a language identifier for detecting text documents containing text written in more than one language, and for identifying the… ☆33 · Updated 8 years ago
- BERT and ELECTRA models trained on Europeana Newspapers ☆37 · Updated 3 years ago
- 🚀 GUI for training spaCy models ☆54 · Updated 3 years ago
- spaCy pipeline component for generating spaCy KnowledgeBase Alias Candidates for Entity Linking ☆85 · Updated 2 years ago
- spaCy + UDPipe ☆160 · Updated 2 years ago
- Language independent truecaser in Python. ☆160 · Updated 3 years ago
- Code and models for our CLEF-HIPE (Named Entity Processing on Historical Newspapers) submissions ☆19 · Updated last year
- tool for collectively summarizing large discussions ☆143 · Updated 2 years ago
- A multilingual lexicon of words to hurt. ☆83 · Updated 3 months ago