DBeath / feedsearch-crawler
Crawl sites for RSS, Atom, and JSON feeds.
☆79 · Updated last month
Alternatives and similar repositories for feedsearch-crawler
Users who are interested in feedsearch-crawler are comparing it to the libraries listed below.
- Find rss, atom, xml, and rdf feeds on webpages ☆30 · Updated 11 months ago
- LLM plugin for embeddings using sentence-transformers ☆71 · Updated 4 months ago
- A Collection of Awesome Personal Search Engines and Related Projects ☆19 · Updated 2 years ago
- Add website scraping abilities to Datasette ☆64 · Updated 2 years ago
- A Firefox and Google Chrome extension to clip websites and download them into a readable markdown file. ☆37 · Updated 6 years ago
- Generate a list of your GitHub stars by topic - automatically! ☆83 · Updated 2 years ago
- A helper library full of URL-related heuristics. ☆70 · Updated 2 weeks ago
- Yet another tool to search through your (exported) ChatGPT conversations ☆12 · Updated 11 months ago
- API interface to the Raindrop Bookmark Manager. ☆43 · Updated 2 months ago
- Tool to extract the text from web article URLs and get word frequencies, entity recognition, automatic summaries, and more ☆20 · Updated 6 years ago
- Inspect a URL and estimate if it contains a news story ☆39 · Updated 10 months ago
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆50 · Updated this week
- Presentations on Quantified Self and Self-Tracking with Python ☆31 · Updated 2 years ago
- This is the HeadQuarters of my digital info. HPI library got me inspired and I'm trying to play with the idea on a smaller scale for myse… ☆21 · Updated last year
- Datasette plugin for rendering HTML based on JSON values ☆28 · Updated 3 years ago
- Semanlink is a personal information management system based on RDF. It lets you add tags, as well as other RDF metadata, to files, bookma… ☆18 · Updated 8 months ago
- Gets your upvoted posts from Hacker News and imports them to raindrop.io ☆26 · Updated 2 years ago
- Scrape HN to track links from specific domains ☆63 · Updated this week
- Python Module to use the Readwise API ☆19 · Updated last week
- Spider templates for automatic crawlers. ☆31 · Updated 2 months ago
- The little things give you away... A collection of various small helper stuff – Mirror repo only, no longer kept in sync, refer to gitea.… ☆25 · Updated 5 years ago
- Javascript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a oneshot CLI command to extract each page… ☆40 · Updated last year
- Scraping assistant tool. Editing and maintaining CSS/XPath selectors across webpages. ☆105 · Updated 7 years ago
- A natural language date parser. (Python version of chrono.js) ☆25 · Updated 3 months ago
- A Python utility for moving bookmarks/reading lists between services ☆205 · Updated 9 years ago
- iOS Safari Extension to convert web pages to Markdown text ☆38 · Updated 2 years ago
- A set of scripts that connect various apps to Raindrop.io ☆19 · Updated 5 months ago
- 💡✏️️ ⬇️️ JSON to Markdown converter - Generate Markdown from format independent JSON ☆74 · Updated 6 years ago
- Tools to easily generate an RSS feed that contains each scraped item using the Scrapy framework. ☆33 · Updated last week
- backup and parse your browser history databases (chrome, firefox, safari, and other chrome/firefox derivatives) ☆143 · Updated last week