copiesofcopies / youtube-transcription
☆72 Updated 11 years ago
Related projects:
- Code and slides for my PyGotham 2016 talk, "Higher-level Natural Language Processing with textacy" ☆15 Updated 8 years ago
- December 14th Python Meetup Files ☆37 Updated 11 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆42 Updated 5 years ago
- Python library with common functionality for writing web scrapers ☆102 Updated 9 years ago
- Inspect a URL and estimate if it contains a news story ☆39 Updated 3 weeks ago
- A set of services that provide NLP facilities ☆25 Updated 3 years ago
- Set of scripts to aid in the download of the GDELT data files from www.gdeltproject.org ☆11 Updated 10 years ago
- Jupyter notebook and code for reproducing Reddit subreddit graphs ☆16 Updated 8 years ago
- Scraping Assisted by Learning ☆35 Updated last week
- Search 'from' and 'to' strings to learn a text-cleaning mapping ☆17 Updated 9 years ago
- ☆23 Updated this week
- A web application that identifies party in political discourse and an example of operationalized machine learning. ☆27 Updated 6 years ago
- Scrapy project with spiders to extract article content from various German news sites ☆21 Updated 11 years ago
- Pollster polls for share counts of URLs at regular intervals. ☆47 Updated 8 years ago
- WebAnnotator is a tool for annotating Web pages. WebAnnotator is implemented as a Firefox extension (https://addons.mozilla.org/en-US/fi… ☆48 Updated 2 years ago
- ☆29 Updated this week
- A space for code and projects around analysing news content ☆23 Updated 6 years ago
- A simple command line interface to the datamade/dedupe library. ☆42 Updated last year
- Scrapes sites. Gets news. Eventually events. ☆80 Updated 8 years ago
- Easy extraction of keywords and engines from search engine results pages (SERPs). ☆90 Updated 2 years ago
- Find which links on a web page are pagination links ☆29 Updated 7 years ago
- An implementation of the multi-armed bandit optimization pattern as a Flask extension ☆80 Updated this week
- R code needed to reproduce the "Relationship between Reddit Comment Score and Comment Length for 1.66 Billion Comments" visualization ☆17 Updated 9 years ago
- Traptor -- A distributed Twitter feed ☆26 Updated 2 years ago
- Demo code for learning_text_transformer ☆25 Updated 9 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends ☆57 Updated 7 months ago
- Multidimensional data explorer and visualization tool. ☆52 Updated 7 years ago
- Python package to detect and return RSS / Atom feeds for a given website. The tool supports major blogging platforms including WordPress, … ☆21 Updated 2 years ago
- Python binding for gumbo-parser using Cython ☆14 Updated 8 years ago
- Concept discovery and recommendation library built on top of the IBM Watson cognitive API. ☆24 Updated 7 years ago