divkakwani / awesome-newspapers
A Directory of Online Newspaper Sources for 70+ Languages
☆32 Updated 4 years ago
Alternatives and similar repositories for awesome-newspapers
Users who are interested in awesome-newspapers are comparing it to the libraries listed below.
- Extract dates from text ☆64 Updated 4 years ago
- Anonymization of legal cases (Fr) based on Flair embeddings ☆88 Updated 4 years ago
- Fast and robust date extraction from web pages, with Python or on the command-line ☆126 Updated 4 months ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 Updated 5 years ago
- A spaCy wrapper for DBpedia Spotlight ☆109 Updated 2 years ago
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata ☆161 Updated 2 years ago
- Filter and format a newline-delimited JSON stream of Wikibase entities ☆97 Updated 7 months ago
- 🧪 Cutting-edge experimental spaCy components and features ☆98 Updated last year
- Language Tool style grammar handling with spaCy 2.0 ☆42 Updated 6 years ago
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata ☆94 Updated 2 years ago
- Sentiment Corpus for Swedish 🇸🇪 Norwegian 🇳🇴 Danish 🇩🇰 Finnish 🇫🇮 (and English 🏴) ☆15 Updated 4 years ago
- In the wild extraction of entities that are found using Flair and displayed using a very elegant front-end. ☆71 Updated 2 years ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate… ☆52 Updated 4 years ago
- Information extraction from English and German texts based on predicate logic ☆135 Updated last year
- Topic Inference with Zeroshot models ☆61 Updated last year
- An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline. ☆86 Updated 4 years ago
- Many Natural Language Processing tasks rely on sentence boundary detection (SBD). Although amazing libraries like spacy provide state of … ☆60 Updated 4 years ago
- ☆18 Updated 3 years ago
- A neural network that jointly part-of-speech tags and lemmatizes sentences, boosting accuracy for morphologically-rich languages (Czech, … ☆34 Updated 6 years ago
- 📂 Additional lookup tables and data resources for spaCy ☆105 Updated 3 months ago
- a python package for cleaning Gutenberg books and dataset ☆35 Updated 2 weeks ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆158 Updated this week
- Code for extracting parallel corpora from pmindia ☆16 Updated 5 years ago
- A tokenizer and sentence splitter for German and English web and social media texts. ☆142 Updated 5 months ago
- ☆64 Updated 2 years ago
- RaKUn 2.0 - A fast keyword detection algorithm ☆67 Updated 3 weeks ago
- Legal document classification with EuroVoc descriptors on 22 languages. ☆26 Updated last year
- 💫 SpaCy wrapper for ConceptNet 💫 ☆93 Updated last year
- A character-wise tokenizer for morphologically rich languages ☆27 Updated 2 months ago
- Open information and community for machine translation ☆77 Updated 3 weeks ago