parkervg / news-article-clustering
A document similarity project attempting to cluster news stories covering identical events.
☆25Updated 4 years ago
Alternatives and similar repositories for news-article-clustering:
Users who are interested in news-article-clustering are comparing it to the libraries listed below.
- This repository provides usage examples for the Python module Newspaper3k.☆146Updated last year
- Implementation of the ClausIE information extraction system for python+spacy☆222Updated 2 years ago
- Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?☆517Updated 5 months ago
- A Python Package which helps to scrape all news details from any news websites☆195Updated 5 months ago
- GetOldTweets-Python is a project written in Python to mine old and backdated tweets. It bypasses some limitations/restrictions of the Twi…☆126Updated last year
- A multithread Pushshift.io API Wrapper for reddit.com comment and submission searches.☆217Updated last year
- PYthon Automated Term Extraction☆311Updated 2 years ago
- An open-source text summarization toolkit for non-experts. EMNLP'2021 Demo☆276Updated last year
- SKILLSPAN: Competences as Spans for Skill Extraction from Job Postings☆60Updated last month
- A python utility for downloading Common Crawl data☆237Updated last year
- Google USE (Universal Sentence Encoder) for spaCy☆183Updated 2 years ago
- Cleans Reddit Text Data☆81Updated 4 years ago
- Keyword extraction using TextRank algorithm after pre-processing the text with lemmatization, filtering unwanted parts-of-speech and othe…☆114Updated 5 years ago
- A Python scraper for the Facebook Ad Library, using the official Facebook Ad Library API.☆119Updated 5 years ago
- ☆159Updated 2 years ago
- LexRank algorithm for text summarization☆231Updated 11 months ago
- A Python library for calculating a large variety of metrics from text☆334Updated 3 months ago
- Data Processing and Machine learning methods for the Open Skills Project☆170Updated 4 months ago
- A spaCy wrapper for DBpedia Spotlight☆109Updated 2 years ago
- a contextual, biasable, word-or-sentence-or-paragraph extractive summarizer powered by the latest in text embeddings (Bert, Universal Sen…☆230Updated 2 years ago
- Article extraction benchmark: dataset and evaluation scripts☆309Updated 11 months ago
- Steam review texting embedding analysis☆141Updated 2 years ago
- Get data about companies from advanced search without the use of API☆62Updated 5 years ago
- Repository for TweetEval☆367Updated 2 years ago
- The dataset used to evaluate JobBERT on the task of job title normalization.☆26Updated 2 years ago
- 🏖TagEditor - Annotation tool for spaCy☆192Updated 2 years ago
- A tool to automatically summarize documents abstractively using the BART or PreSumm Machine Learning Model.☆67Updated 4 years ago
- Target-dependent sentiment classification in news articles reporting on political events. Includes a high-quality data set of over 11k se…☆149Updated last year
- Scrape news articles and analyze them using NLP to quantify the gender gap in Canadian mainstream media☆42Updated 11 months ago
- A Dataset of German Legal Documents for Named Entity Recognition☆166Updated 2 years ago