nytlabs / pageinfo
Python module for extracting information from web pages
☆41 · Updated 10 years ago
Alternatives and similar repositories for pageinfo:
Users who are interested in pageinfo are comparing it to the libraries listed below.
- Helper methods for generating text that conforms to "The New York Times Manual of Style and Usage" ☆27 · Updated 10 years ago
- a set of services that provide NLP facilities ☆25 · Updated 4 years ago
- Know more with less ☆50 · Updated 10 years ago
- A library for accessing a spreadsheet as a native Python object suitable for templating. ☆225 · Updated 6 years ago
- Like Tabletop.js — but for Google Docs! ☆66 · Updated 8 years ago
- An attempt at creating a silver/gold standard dataset for backtesting yesterday & today's content-extractors ☆34 · Updated 10 years ago
- A reverse part-of-speech tagger. Give it a list of tags and it spews out matching language. ☆23 · Updated 10 years ago
- ArchiveKit manages data and documents during ETL processes, either on a local file system or on S3. ☆15 · Updated 10 years ago
- JSON export and uploading extension for Google Refine ☆29 · Updated 13 years ago
- A simple transformation/data processing pipeline for CrisisNET ☆15 · Updated 10 years ago
- Data Pipes for CSV ☆116 · Updated 2 years ago
- Publish spreadsheets as interactive tables. And do it on deadline. ☆74 · Updated 8 years ago
- Whippersnapper is an automated screenshot tool to keep a visual history of content on the web. ☆55 · Updated 9 years ago
- A Python version (almost a port) of ProPublica's TableFu ☆231 · Updated 11 years ago
- Chrome extension that highlights anonymous sources in news articles ☆33 · Updated 8 years ago
- A handy template for building a django prep sports site. ☆14 · Updated 13 years ago
- Tools for working with Optical Character Recognition output ☆16 · Updated 11 years ago
- ☆36 · Updated 7 years ago
- A polite, minimal interface for sending python objects to and from Amazon S3. ☆57 · Updated 9 years ago
- A ready-to-deploy system for aggregating regional boundary data (from shapefiles) and republishing that data via a RESTful JSON API. ☆82 · Updated 3 years ago
- presentation for nicar 2011 (an exploration into the concepts behind backbone.js) ☆12 · Updated 14 years ago
- A utility for spreadsheet-style handling of arrays (e.g. filtering, formatting, and sorting) ☆36 · Updated 14 years ago
- a web based tool to monitor how your website content is used in wikipedia ☆37 · Updated 4 years ago
- moxie ☆28 · Updated 9 years ago
- Bash-style pipelining for Python generators. ☆17 · Updated 14 years ago
- Content and Ideas for the MediaPublic.io website. ☆32 · Updated 8 years ago
- Ultra simple API for geocoding a single string against various web services. ☆183 · Updated 11 years ago
- RiTaJS: A generative language toolkit for JavaScript ☆43 · Updated 4 years ago
- Analysis for a blog post on cartograms. ☆29 · Updated 10 months ago
- A new version of the software used in the Cluetrain listicle ☆19 · Updated 10 years ago