nytlabs / pageinfo
Python module for extracting information from web pages
☆41 Updated 11 years ago
Alternatives and similar repositories for pageinfo
Users who are interested in pageinfo are comparing it to the libraries listed below.
- Helper methods for generating text that conforms to "The New York Times Manual of Style and Usage" ☆27 Updated 11 years ago
- a set of services that provide NLP facilities ☆25 Updated 4 years ago
- A library for accessing a spreadsheet as a native Python object suitable for templating. ☆225 Updated 7 years ago
- A reverse part-of-speech tagger. Give it a list of tags and it spews out matching language. ☆23 Updated 10 years ago
- Know more with less ☆50 Updated 10 years ago
- A simple transformation/data processing pipeline for CrisisNET ☆15 Updated 10 years ago
- TweeQL is a Query Language for Tweets: SELECT brand(text) AS brand, sentiment(text) AS sentiment FROM twitter_sample; ☆193 Updated 11 years ago
- A Python version (almost a port) of ProPublica's TableFu ☆231 Updated 11 years ago
- Plots various graphs for a series of plaintext files in a directory ☆19 Updated 9 years ago
- ArchiveKit manages data and documents during ETL processes, either on a local file system or on S3. ☆15 Updated 10 years ago
- A polite, minimal interface for sending python objects to and from Amazon S3. ☆57 Updated 9 years ago
- A command line utility for generating Google Analytics reports that are straightforward to compare across domains, projects or pages. ☆41 Updated 4 years ago
- Bash-style pipelining for Python generators. ☆17 Updated 14 years ago
- Little JSON objects want to be graphs, too! ☆17 Updated 9 years ago
- Code for Newslynx App ☆22 Updated 9 years ago
- Publish spreadsheets as interactive tables. And do it on deadline. ☆74 Updated 8 years ago
- Utilities for working with data. ☆20 Updated 10 years ago
- Neddick: Open Source Information Discovery Platform ☆36 Updated 2 years ago
- Modularly extensible semantic metadata validator ☆84 Updated 9 years ago
- An attempt at creating a silver/gold standard dataset for backtesting yesterday & today's content-extractors ☆35 Updated 10 years ago
- An AIML alternative, YAML based. Aerolito works like a simulation of natural language processing. ☆20 Updated 13 years ago
- LoadKit supports Extract, Transform, Load processes based on ArchiveKit buckets. ☆11 Updated 10 years ago
- A new version of the software used in the Cluetrain listicle ☆19 Updated 10 years ago
- A statistics extension for Google Refine. ☆33 Updated 13 years ago
- Chrome extension that highlights anonymous sources in news articles ☆33 Updated 8 years ago
- Compile Yahoo! Pipes to Javascript (Node.js) ☆44 Updated 12 years ago
- A lightweight Python framework for building cli-inspired Slack bots. ☆71 Updated 2 years ago
- A ready-to-deploy system for aggregating regional boundary data (from shapefiles) and republishing that data via a RESTful JSON API. ☆82 Updated 3 years ago
- Personal nouns assembled from the 1890 Webster's Unabridged Dictionary. ☆9 Updated 8 years ago
- A command-line and programmatic interface to various social sharecount endpoints. ☆30 Updated 6 years ago