rgriffogoes / scraper-notebook
Jupyter Docker stack image with pre-installed scraper tools and libraries
☆29Updated 3 years ago
Alternatives and similar repositories for scraper-notebook
Users that are interested in scraper-notebook are comparing it to the libraries listed below
- A Python library for accessing the ClickUp API☆75Updated last year
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler.☆55Updated this week
- A library/CLI tool to parse data out of your Google Takeout (History, Activity, Youtube, Locations, etc...)☆123Updated 2 weeks ago
- 🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based☆328Updated 2 years ago
- Simple bash script to shorten URLs with YOURLS☆13Updated 4 years ago
- ☆23Updated 3 years ago
- 💡✏️️ ⬇️️ JSON to Markdown converter - Generate Markdown from format independent JSON☆78Updated 6 years ago
- Hook toolkit for Paperless-ngx with a REST API client written in Go☆13Updated last week
- A Firefox and Google Chrome extension to clip websites and download them into a readable markdown file.☆41Updated 7 years ago
- Top 18K of GitHub's finest.☆68Updated this week
- Parse a Markdown article, download its images, and replace image URLs with local paths☆125Updated 3 weeks ago
- Telegram > OpenAI > Read Later [instapaper/pocket/omnivore]☆16Updated 2 years ago
- Generate a list of your GitHub stars by topic - automatically!☆103Updated 3 years ago
- Web page archive tool☆26Updated 4 months ago
- A collection of PDF command line tools and wrappers for Linux☆117Updated 2 years ago
- Jurisdiction ID and abbreviation data files for using with Jurism and other projects.☆41Updated 2 years ago
- Unofficial Otter.ai Python API☆82Updated 2 months ago
- 📑 Scripts to repair, verify, OCR, compress, wrangle, crop (etc.) PDFs☆70Updated last year
- Marp Editor for @standardnotes. Create presentations with Marp and Marpit Markdown | https://marpeditor.com☆34Updated 5 years ago
- The GitBook documentation site for OpenAlex☆26Updated 3 weeks ago
- Awesome list dedicated to digital and data preservation tools, sources, services and so on.☆31Updated 3 weeks ago
- Clean, filter and sample URLs to optimize data collection – Python & command-line – Deduplication, spam, content and language filters☆158Updated last month
- Tool to OCR PDFs using Google Cloud Vision☆42Updated 3 years ago
- Python Wrapper on top of Unofficial Medium API to quickly extract data from Medium's website.☆61Updated 6 months ago
- ScrapingAnt API client for Python.☆43Updated last year
- This is a proof-of-concept of using an LLM to find and extract meaningful data without parsing the html too much.☆30Updated 2 years ago
- Simple wrapper script for Joplin CLI, for those that want to speedily make a note from terminal☆23Updated 7 years ago
- NocoDB Python API Client☆76Updated 2 years ago
- SingleFile docker implementation providing access via CLI and WEB service☆53Updated last year
- Abbreviations for use with the Abbreviation Filter developed for use with Multilingual Zotero.☆18Updated 2 years ago