mediacloud / nyt-news-labeler
Tag news stories based on models trained on the NYT corpus.
☆42Updated 2 years ago
Alternatives and similar repositories for nyt-news-labeler:
Users interested in nyt-news-labeler are comparing it to the libraries listed below.
- A library to extract a publication date from a web page, along with a measure of the accuracy.☆41Updated 5 years ago
- Find RSS, Atom, XML, and RDF feeds on webpages☆30Updated 6 months ago
- Extract networks of entities from journalistic reporting☆48Updated last year
- Cleans Reddit Text Data☆83Updated 5 years ago
- Examples for getting started using https://case.law☆65Updated 2 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends☆56Updated last year
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata☆94Updated 2 years ago
- Extract dates from text☆64Updated 4 years ago
- Target-dependent sentiment classification in news articles reporting on political events. Includes a high-quality data set of over 11k se…☆151Updated last year
- Trying to generate name synonyms from Wikidata☆32Updated 4 years ago
- Make it easier to compare and cross-reference the names of companies and people by applying strong normalisation.☆151Updated 3 months ago
- NSS Capstone project to use natural language modeling, classification, and information extraction to get the exact employee count values …☆15Updated 6 years ago
- ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of diff…☆88Updated 3 years ago
- Dataset: BuzzFeed News “Trending” Strip, 2018–2023☆19Updated last year
- 🚀GUI for training spaCy models☆54Updated 3 years ago
- A Python library for topic modeling and visualization☆65Updated 4 years ago
- Public client for consuming content from the Media Cloud Online News Archive & Directory.☆72Updated 4 months ago
- A simple web application for searching Word2Vec embeddings derived from approximately 2,000 law reports published by The Incorporated…☆26Updated 2 years ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t…☆33Updated 2 years ago
- A classifier that distinguishes political from non-political news articles.☆30Updated last year
- ☆28Updated 4 years ago
- Measure the readability of a given text using surface characteristics☆79Updated 3 months ago
- This repository provides usage examples for the Python module Newspaper3k.☆147Updated last year
- Guess gender from first name in Python 2 and 3☆133Updated 2 years ago
- A Python library for generating word tree diagrams☆25Updated 4 years ago
- An automated, programming-free web scraper for interactive sites☆110Updated last year
- A spaCy wrapper for DBpedia Spotlight☆109Updated 2 years ago
- Searching large heterogeneous data dumps with Universal Sentence Encoder☆62Updated 3 years ago
- A Python scraper for the Facebook Ad Library, using the official Facebook Ad Library API.☆119Updated 5 years ago
- Package for performing Reddit-based text analysis☆22Updated 6 years ago