nytlabs / pageinfo
Python module for extracting information from web pages
☆41 Updated 10 years ago
Alternatives and similar repositories for pageinfo:
Users who are interested in pageinfo are comparing it to the libraries listed below:
- Helper methods for generating text that conforms to "The New York Times Manual of Style and Usage" ☆27 Updated 10 years ago
- Know more with less ☆50 Updated 10 years ago
- JSON export and uploading extension for Google Refine ☆29 Updated 13 years ago
- a set of services that provide NLP facilities ☆25 Updated 4 years ago
- Data Pipes for CSV ☆116 Updated 2 years ago
- python-readability, but faster (mirror-ish) ☆84 Updated 13 years ago
- A Python version (almost a port) of ProPublica's TableFu ☆233 Updated 11 years ago
- A Python module to access Pinboard.in via its API. This is a fork/modification of mudge/python-delicious ☆168 Updated 10 years ago
- A simple transformation/data processing pipeline for CrisisNET ☆15 Updated 10 years ago
- A command-line and programmatic interface to various social sharecount endpoints. ☆30 Updated 6 years ago
- Little JSON objects want to be graphs, too! ☆17 Updated 9 years ago
- A reverse part-of-speech tagger. Give it a list of tags and it spews out matching language. ☆23 Updated 10 years ago
- Ultra simple API for geocoding a single string against various web services. ☆183 Updated 11 years ago
- A statistics extension for Google Refine. ☆33 Updated 13 years ago
- TweeQL is a Query Language for Tweets: SELECT brand(text) AS brand, sentiment(text) AS sentiment FROM twitter_sample; ☆193 Updated 10 years ago
- Open-source fork of code behind http://everyblock.com/ ☆99 Updated 12 years ago
- Easy link blogging to a GitHub-hosted Jekyll blog ☆12 Updated 8 years ago
- [unmaintained] Python version of arc90's *older* readability.js ☆47 Updated 13 years ago
- Parser and standardizer for politician, individual and organization names. ☆129 Updated 7 years ago
- This is a module to extract RDF from an HTML5 page annotated with microdata. The module implements the algorithm defined and published by th… ☆44 Updated 2 years ago
- A tool for bulk text comparison and analysis ☆119 Updated 11 years ago
- PANDA: A Newsroom Data Appliance ☆205 Updated 2 years ago
- Like Tabletop.js — but for Google Docs! ☆66 Updated 8 years ago
- Turns legal citations in the DOM into links ☆20 Updated 8 years ago
- backchan.nl is a tool for involving audiences in presentations by letting them suggest questions and vote on each other's questions. ☆30 Updated 12 years ago
- A server for your markdown files. Give it a directory, and Commonplace gives you a url, pretty pages, and quick editing. ☆170 Updated 4 years ago
- NPR Visual's Carebot (deprecated, now in: https://github.com/thecarebot/carebot) ☆15 Updated 9 years ago
- An interactive infographic showing how HBO loves to reuse actors. ☆36 Updated last year
- You keep personal data in all sorts of places on the internets. Jellyroll brings them together onto your own site. ☆132 Updated 13 years ago
- Python library with common functionality for writing web scrapers ☆102 Updated 9 years ago