divkakwani / awesome-newspapers
A Directory of Online Newspaper Sources for 70+ Languages
☆31Updated 4 years ago
Alternatives and similar repositories for awesome-newspapers
Users who are interested in awesome-newspapers are comparing it to the repositories listed below:
- Anonymization of legal cases (Fr) based on Flair embeddings☆87Updated 4 years ago
- A spaCy wrapper for DBpedia Spotlight☆110Updated 2 years ago
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata☆164Updated 2 years ago
- AfroLID, a powerful neural toolkit for African language identification, covering 517 African languages.☆32Updated 7 months ago
- LanguageTool-style grammar handling with spaCy 2.0☆42Updated 7 years ago
- spacy-wordnet creates annotations that make it easy to use WordNet and WordNet Domains through the NLTK WordNet interface☆260Updated last month
- ☆64Updated 2 years ago
- A tokenizer and sentence splitter for German and English web and social media texts.☆147Updated 10 months ago
- In-the-wild extraction of entities, found using Flair and displayed in a very elegant front-end.☆71Updated 2 years ago
- BERT and ELECTRA models trained on Europeana Newspapers☆38Updated 3 years ago
- Implementation of the ClausIE information extraction system for python+spacy☆224Updated 3 years ago
- Plan and train German transformer models.☆23Updated 4 years ago
- PYthon Automated Term Extraction☆316Updated 2 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy.☆41Updated 6 years ago
- Extract dates from text☆65Updated 4 years ago
- A python module for English lemmatization and inflection.☆272Updated 2 years ago
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata☆94Updated 2 years ago
- Text-to-sentence splitter using a heuristic algorithm by Philipp Koehn and Josh Schroeder.☆254Updated 2 years ago
- Live survey of off-the-shelf language identification tools for python☆26Updated 3 years ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency☆176Updated 4 months ago
- A minimal, pure Python library to interface with CoNLL-U format files.☆152Updated last week
- Text tokenization and sentence segmentation (segtok v2)☆206Updated 3 years ago
- A Named-Entity Recogniser based on Grobid.☆54Updated 4 months ago
- Unreliable News Index (for Columbia Journalism Review)☆56Updated 3 years ago
- 🕸 GlotWeb: Web Indexing for Low-Resource Languages -- under construction.☆15Updated last month
- Information extraction from English and German texts based on predicate logic☆138Updated 2 years ago
- German Morphological Analyzer☆47Updated 3 years ago
- Filter and format a newline-delimited JSON stream of Wikibase entities☆103Updated last month
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate…☆52Updated 5 years ago
- Language independent truecaser in Python.☆160Updated 3 years ago