A data processing pipeline that schedules and runs content harvesters, normalizes their data, and outputs that normalized data to a variety of output streams. This is part of the SHARE project, and will be used to create a free and open dataset of research (meta)data. Data collected can be explored at https://osf.io/share/, and viewed at https:/…
☆42 · Jun 22, 2016 · Updated 9 years ago
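The harvest, normalize, and output flow described above can be sketched roughly as follows. This is a minimal illustration, not scrapi's actual code: every name here (`example_harvester`, `normalize`, `run_pipeline`, `NormalizedDoc`, and the field names) is hypothetical, and scheduling is left out so that each harvester simply runs once.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

# Hypothetical types for illustration only; not scrapi's real classes.
RawDoc = Dict[str, str]


@dataclass
class NormalizedDoc:
    title: str
    source: str


def example_harvester() -> List[RawDoc]:
    """Stand-in harvester that returns raw records from an imaginary source."""
    return [{"Title": "  Open Data  ", "src": "example"}]


def normalize(raw: RawDoc) -> NormalizedDoc:
    """Map a raw record onto the shared schema (assumed field names)."""
    return NormalizedDoc(title=raw["Title"].strip(), source=raw["src"])


def run_pipeline(
    harvesters: Iterable[Callable[[], List[RawDoc]]],
    outputs: Iterable[Callable[[NormalizedDoc], None]],
) -> int:
    """Run each harvester once, normalize its records, and fan out to every output."""
    sinks = list(outputs)
    count = 0
    for harvest in harvesters:
        for raw in harvest():
            doc = normalize(raw)
            for emit in sinks:
                emit(doc)
            count += 1
    return count


# Two in-memory "output streams" standing in for real destinations.
store: List[NormalizedDoc] = []
titles: List[str] = []
processed = run_pipeline(
    [example_harvester],
    [store.append, lambda doc: titles.append(doc.title)],
)
```

Passing outputs as plain callables keeps the sketch small; a real pipeline would presumably add scheduling, retries, and persistent destinations.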
Alternatives and similar repositories for scrapi
Users who are interested in scrapi compare it to the libraries listed below.
- Using social media to steer web archiving and curation. ☆18 · Nov 20, 2015 · Updated 10 years ago
- A semantic analysis tool to generate synonym.txt files for Solr. [RETIRED] ☆25 · Sep 14, 2016 · Updated 9 years ago
- ArchiveKit manages data and documents during ETL processes, either on a local file system or on S3. ☆15 · May 2, 2015 · Updated 10 years ago
- mltk - Moz Language Tool Kit ☆12 · Mar 6, 2015 · Updated 10 years ago
- A contextual news development environment. ☆49 · Dec 19, 2014 · Updated 11 years ago
- Place Pulse code repository ☆15 · Mar 6, 2013 · Updated 12 years ago
- ☆21 · Jan 23, 2016 · Updated 10 years ago
- Discover, analyze and present data from the web and mobile in meaningful ways ☆83 · Jul 16, 2013 · Updated 12 years ago
- Navigating the sea of publications ☆13 · Jan 3, 2016 · Updated 10 years ago
- Manage and load dataprotocols.org Data Packages ☆27 · Sep 17, 2015 · Updated 10 years ago
- ☆10 · Apr 26, 2016 · Updated 9 years ago
- Newsclipse: The IDE for news production. ☆91 · Dec 11, 2014 · Updated 11 years ago
- A command line and Python client for Open-Spending ☆10 · Nov 24, 2017 · Updated 8 years ago
- Collaborative Innovation Class Project ☆14 · Jun 12, 2015 · Updated 10 years ago
- ☆13 · Jul 2, 2017 · Updated 8 years ago
- Crowd Based Coding and Harmonization using Linked Data ☆12 · Jan 22, 2018 · Updated 8 years ago
- Basic linked data fragments endpoint. ☆15 · Apr 20, 2017 · Updated 8 years ago
- Collects multimedia content shared through social networks. ☆19 · Feb 18, 2015 · Updated 11 years ago
- ☆23 · Mar 7, 2015 · Updated 10 years ago
- Linked Data tools for SMEs ☆16 · Oct 3, 2016 · Updated 9 years ago
- Contains the implementation of algorithms that estimate the geographic location of media content based on their content and metadata. It … ☆15 · Oct 15, 2016 · Updated 9 years ago
- A crawler, indexer, and query interface all in Python with distributed processing via Pyro4. ☆23 · Mar 16, 2012 · Updated 13 years ago
- ☆26 · Feb 18, 2022 · Updated 4 years ago
- Python ETL and Data Warehouse ☆34 · Oct 5, 2015 · Updated 10 years ago
- SDS 385: Statistical Models for Big Data ☆17 · Oct 25, 2017 · Updated 8 years ago
- Re-usable wrapper scripts for text document extractors. ☆37 · Jun 18, 2016 · Updated 9 years ago
- Software for preprocessing textual data in multiple languages for textual analysis. ☆23 · Feb 28, 2016 · Updated 10 years ago
- Little JSON objects want to be graphs, too! ☆17 · Oct 2, 2015 · Updated 10 years ago
- Boilerplate to help speed up d3.js development ☆86 · Feb 23, 2013 · Updated 13 years ago
- Tracking events around scholarly content ☆104 · Dec 14, 2022 · Updated 3 years ago
- Want to learn more about Free Law Project technologies, policies and thinking? Get the literature here. ☆25 · Jul 6, 2021 · Updated 4 years ago
- KnowledgeStore ☆21 · Feb 1, 2018 · Updated 8 years ago
- Pure python script that takes user query and summarizes news related to it. ☆25 · Jul 6, 2022 · Updated 3 years ago
- Introduction to GitHub for Collaborating and Academic Publishing ☆39 · Mar 30, 2019 · Updated 6 years ago
- Pratt Institute course on open data ☆20 · May 8, 2017 · Updated 8 years ago
- ☆25 · Mar 28, 2019 · Updated 6 years ago
- ☆20 · Aug 8, 2015 · Updated 10 years ago
- Python notebooks analyzing campaign finance and lobbying activity data from California Secretary of State’s CAL-ACCESS database ☆22 · Mar 3, 2018 · Updated 8 years ago
- Repository for developing visuals in accordance with the Sunlight style guide ☆27 · Jul 8, 2015 · Updated 10 years ago