rgriffogoes / scraper-notebook
Jupyter Docker stack image with pre-installed scraper tools and libraries
☆26Updated 2 years ago
Alternatives and similar repositories for scraper-notebook:
Users that are interested in scraper-notebook are comparing it to the libraries listed below
- ☆25Updated 4 years ago
- Simple wrapper script for Joplin CLI, for those who want to quickly make a note from the terminal☆23Updated 6 years ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution.☆55Updated 2 months ago
- Generating ebooks in MOBI and EPUB formats from articles you saved to your Pocket account or which showed up in your favourite RSS feeds.☆12Updated 3 years ago
- Crawl a website to generate knowledge file for RAG☆33Updated 6 months ago
- You can use this act to monitor any page's content and get a notification when content changes.☆19Updated 2 years ago
- A Python3, async interface to the linkding REST API☆20Updated 2 weeks ago
- Telegram > OpenAI > Read Later [instapaper/pocket/omnivore]☆17Updated last year
- LLM plugin for embeddings using sentence-transformers☆48Updated last week
- A News Article Collection Library☆22Updated last year
- A financial disclosure data extraction tool.☆13Updated last year
- Scrape HN to track links from specific domains☆52Updated this week
- A CLI to interface with an instance of linkding☆31Updated this week
- KonMari your Pocket tsundoku from the command line☆16Updated last year
- A Firefox and Google Chrome extension to clip websites and download them into a readable markdown file.☆24Updated 6 years ago
- advertools visualizations☆18Updated 7 months ago
- A CLI for the API of LinkAce (https://github.com/Kovah/LinkAce)☆10Updated last month
- Scripts and ideas to manage tons and tons of images and movies☆16Updated last week
- Spider templates for automatic crawlers.☆27Updated 2 weeks ago
- Find rss, atom, xml, and rdf feeds on webpages☆30Updated 4 months ago
- List of tools for dealing with the wonderful PDF format.☆48Updated 4 years ago
- Tool to index and serve HTML files. Powered by Datasette.☆95Updated 2 years ago
- This is a collection of good Docker containers I've collected over time; if you have Portainer installed, I have an app template that …☆13Updated last year
- Add website scraping abilities to Datasette☆62Updated last year
- Scrape and parse Google search results in Python☆32Updated last year
- 📑 Scripts to repair, verify, OCR, compress, wrangle, crop (etc.) PDFs☆66Updated 9 months ago
- Reads HTML files, converting tables into CSV files☆31Updated 4 years ago
- 📑 Python Package to reconstruct the original continuous text from PDFs with language models☆32Updated last year
- This is the HeadQuarters of my digital info. HPI library got me inspired and I'm trying to play with the idea on a smaller scale for myse…☆20Updated last year
- A utility that extracts tables from HTML documents and converts them to CSV format☆42Updated last year