ContinuumIO / scrapy_scrapers
Scraper built with Scrapy.
☆17Updated 7 months ago
Alternatives and similar repositories for scrapy_scrapers:
Users who are interested in scrapy_scrapers compare it to the libraries listed below.
- An online reference for data journalism☆25Updated 11 years ago
- ArchiveKit manages data and documents during ETL processes, either on a local file system or on S3.☆15Updated 9 years ago
- ☆21Updated 9 years ago
- Topic modeling web application☆40Updated 9 years ago
- A semantic analysis tool to generate synonym.txt files for Solr. [RETIRED]☆24Updated 8 years ago
- Elwha is a Java application for monitoring topics, sentiment and events on Twitter streams with the ability to generate notification mess…☆16Updated 9 years ago
- [DEPRECATED] Please use https://github.com/frictionlessdata/specs☆17Updated 7 years ago
- General Architecture for Text Engineering☆49Updated 9 years ago
- Plots various graphs for a series of plaintext files in a directory☆19Updated 8 years ago
- JSON schemas for OpenCorporates data☆20Updated 10 months ago
- Common data interchange format for document processing pipelines that apply natural language processing tools to large streams of text☆35Updated 8 years ago
- Tools for tracking stories on news homepages☆48Updated 5 years ago
- C-SPAN opened captions Node buffer server + Google Docs Apps Script☆8Updated 5 years ago
- Python and pandas tools to perform various analyses on different types of word lists☆16Updated 10 years ago
- Tools for working with Optical Character Recognition output☆16Updated 11 years ago
- Tool to cleanse and semantify datasets from CKAN repositories. Based on OpenRefine.☆23Updated 9 years ago
- Vizlinc☆14Updated 9 years ago
- Open Knowledge coding standards and style guide.☆35Updated 5 years ago
- A project to demonstrate maximum entropy models for extracting quotes from news articles in Python.☆25Updated 12 years ago
- Examples of bad data, especially from government.☆23Updated 8 months ago
- Browser add-on and web server to support collection and analysis of web browsing data.☆13Updated 9 years ago
- A pastebin for tables.☆34Updated 11 years ago
- Whit is an open source SMS service, which allows you to query CrunchBase, Wikipedia, and several other data APIs.☆199Updated 11 years ago
- ☆13Updated 9 years ago
- Big GeoSpatial Data Points Visualization Tool☆19Updated 8 years ago
- Python library and command line tool for converting data from one format to another☆99Updated 4 years ago
- ☆20Updated 8 years ago
- Pattern-of-Behavior Search Tool☆11Updated 2 years ago
- Manage and load dataprotocols.org Data Packages☆27Updated 9 years ago
- Data notification service: subscribe to keywords and get notified whenever an open data source mentions that keyword.☆24Updated 11 years ago