ContentMine / quickscrape
A scraping command line tool for the modern web
☆260 Updated 8 years ago
Alternatives and similar repositories for quickscrape:
Users who are interested in quickscrape are comparing it to the repositories listed below.
- Journal scraper definitions for the ContentMine framework ☆66 Updated 6 years ago
- Get metadata, fulltexts or fulltext URLs of papers matching a search query ☆197 Updated 4 years ago
- Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head ☆170 Updated 4 years ago
- View, visualize, clean and process data in the browser. ☆148 Updated 6 years ago
- We introduce TACIT: An Open-Source Text Analysis, Crawling and Interpretation Tool. TACIT's plugin architecture has three main components… ☆107 Updated 6 years ago
- Convert XML/SVG/PDF into normalised, sectioned, scholarly HTML ☆37 Updated last year
- An online annotation platform for teaching and learning in the humanities. ☆107 Updated 2 months ago
- Data Pipes for CSV ☆116 Updated 2 years ago
- "Old SFM" -- manage rules and streams from social data sources, starting with twitter. ☆86 Updated last year
- Python scripts for interacting with the hypothes.is API ☆48 Updated 7 years ago
- Documentation and project-wide issues for the Website Monitoring project (a.k.a. "Scanner") ☆108 Updated 2 months ago
- A full-stack publishing solution involving different technologies to power digital archives ☆157 Updated 4 years ago
- One-Click User Instigated Preservation ☆126 Updated 6 years ago
- Palladio Application ☆41 Updated 3 years ago
- BibServer is open-source software that makes it easy to publish, manage and find bibliographies. BibServer is RESTful and web-friendly. ☆126 Updated 6 years ago
- A network graph exploration tool ☆63 Updated 2 years ago
- Superfeedr powered pipes! ☆131 Updated 9 years ago
- Expose Spacy nlp text parsing to Nodejs (and other languages) via socketIO ☆225 Updated 2 years ago
- Convert an XML input to a JSON output, using xml-mapping ☆162 Updated 8 years ago
- MarkItDown: retro-convert rich text to Markdown ☆70 Updated 12 years ago
- tool for collectively summarizing large discussions ☆143 Updated 2 years ago
- WARC and ARC indexing and discovery tools. ☆123 Updated last month
- A simple OpenRefine reconciliation service that runs on top of a CSV file ☆120 Updated 9 years ago
- Solrstrap is a Query-Result interface for Solr written in JavaScript, HTML and CSS ☆86 Updated 8 years ago
- Facilitating the global conversation on academic literature ☆266 Updated 7 years ago
- Data Store for Annotation Studio ☆46 Updated 2 years ago
- Create a git repository from the revision history of a document in Google Drive. ☆134 Updated 7 years ago
- Highlight and select phrases in HTML pages. ☆24 Updated 5 years ago
- Social Feed Manager user interface application. ☆155 Updated 10 months ago
- Open source large document set visualization platform ☆268 Updated 2 years ago