rgriffogoes / scraper-notebook
Jupyter Docker stack image with pre-installed scraper tools and libraries
★26, updated 3 years ago
Alternatives and similar repositories for scraper-notebook
Users who are interested in scraper-notebook are comparing it to the libraries listed below.
- A Python library for accessing the ClickUp API (★75, updated last year)
- Scripts to repair, verify, OCR, compress, wrangle, crop (etc.) PDFs (★70, updated last year)
- Hook toolkit for Paperless-ngx with a REST API client written in Go (★13, updated 2 weeks ago)
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. (★50, updated last week)
- Easily back up your system with Bitwarden, BorgBase and Docker (★19, updated 4 months ago)
- Jurisdiction ID and abbreviation data files for using with Jurism and other projects. (★38, updated last year)
- Javascript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a oneshot CLI command to extract each page… (★40, updated last year)
- A Firefox and Google Chrome extension to clip websites and download them into a readable markdown file. (★39, updated 6 years ago)
- Python Module to use the Readwise API (★19, updated last week)
- A financial disclosure data extraction tool. (★18, updated 2 years ago)
- Daily TV News Summary using GPT (★23, updated 5 months ago)
- A microservice for document conversion at scale (★79, updated this week)
- A collection of PDF command line tools and wrappers for Linux (★111, updated 2 years ago)
- (★21, updated 2 years ago)
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. (★63, updated this week)
- (★21, updated 3 months ago)
- A library/CLI tool to parse data out of your Google Takeout (History, Activity, Youtube, Locations, etc...) (★111, updated last month)
- ReadablePDF streamlines the effort of turning a not so great PDF into a more easily readable PDF (or of course a pretty decent PDF into a… (★33, updated 4 years ago)
- ScrapingAnt API client for Python. (★43, updated last year)
- Tool to index and serve HTML files. Powered by Datasette. (★107, updated 3 years ago)
- Heal your Markdown files: convert to outline, list tasks and more tools to come (★52, updated last week)
- Automatically sync Omnivore pages to Raindrop.io (★21, updated 10 months ago)
- Marp Editor for @standardnotes. Create presentations with Marp and Marpit Markdown | https://marpeditor.com (★34, updated 4 years ago)
- (★14, updated 3 months ago)
- Terraform project that deploys VSCode Server on Oracle Cloud Infrastructure (free tier) and protect the access with Cloudflare Zero Trust… (★27, updated 3 weeks ago)
- SingleFile docker implementation providing access via CLI and WEB service (★49, updated last year)
- Scrape various open data directories to create an index of what's available out there (★37, updated 8 months ago)
- Web page archive tool (★27, updated last month)
- Unofficial Otter.ai Python API (★74, updated last year)
- Public Neo4j Knowledge Base (★23, updated 2 months ago)