A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file for each page, designed for LLM RAG
★438 · Aug 13, 2024 · Updated last year
Alternatives and similar repositories for markdown-crawler
Users that are interested in markdown-crawler are comparing it to the libraries listed below.
- A simple and streamlined Python script to extract and filter links from a remote HTML resource. ★24 · Jan 12, 2025 · Updated last year
- Browser automation for creating new pages in WordPress ★13 · Jun 7, 2025 · Updated 10 months ago
- A fast tool to convert any website into LLM-ready markdown data. Built by https://supermemory.ai ★1,899 · Jul 21, 2024 · Updated last year
- An open source framework for Retrieval-Augmented System (RAG) uses semantic search helps to retrieve the expected results and generate h… ★22 · Nov 21, 2025 · Updated 4 months ago
- A tool to automatically create and run your Python scripts in a virtual environment with installed dependencies ★19 · Apr 9, 2026 · Updated last week
- Empower your script with auto_venv: Say Goodbye to Manual Setup or Install! ★21 · Jun 20, 2024 · Updated last year
- Cookbook for Crafting Good Code ★57 · Mar 19, 2024 · Updated 2 years ago
- word4num is a versatile tool for encoding numbers into words, applicable for geolocation, phone numbers, postcodes, IPv4 addresses, and m… ★12 · Oct 9, 2024 · Updated last year
- Plugin for Obsidian.md - Thesaurus, dictionary and more using the Datamuse API ★54 · May 22, 2024 · Updated last year
- Creating Intelligent Terminal Apps with ChatGPT and LLM Models ★30 · Jul 9, 2023 · Updated 2 years ago
- Search, modify, and parse messy HTML with ease. ★41 · Jan 17, 2026 · Updated 2 months ago
- ScrapeGPT is a RAG-based Telegram bot designed to scrape and analyze websites, then answer questions based on the scraped content. The bo… ★87 · Feb 17, 2024 · Updated 2 years ago
- Containerized workflow automation tool ★22 · Apr 7, 2026 · Updated last week
- ★12 · Nov 5, 2024 · Updated last year
- Wave Partial Differential Equation Solver in Python ★14 · Jun 5, 2024 · Updated last year
- Extract structured text from PDFs quickly ★683 · Jun 11, 2025 · Updated 10 months ago
- Examples of vector DB indexing and query with various vector databases. ★13 · Feb 12, 2025 · Updated last year
- An Obsidian plugin to set the Link Text using the document title ★22 · Jan 3, 2026 · Updated 3 months ago
- This is a Telegram Bot 🤖 using Flowise API calls, giving a lot of possibilities with LangChain technology. ★23 · Jun 27, 2024 · Updated last year
- Use NavamAI to supercharge your productivity and workflow with personal, fast, and quality AI. Turn your Terminal into a configurable, in… ★26 · Oct 15, 2024 · Updated last year
- A modern shell ★355 · Nov 14, 2025 · Updated 5 months ago
- Incredibly descriptive audiovisual summaries for videos ★41 · Aug 2, 2024 · Updated last year
- Rust implementation of Surya ★66 · Mar 1, 2025 · Updated last year
- Python scraper based on AI ★23,305 · Apr 9, 2026 · Updated last week
- Convert HTML to Markdown ★2,143 · Nov 16, 2025 · Updated 5 months ago
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/ ★10,518 · May 8, 2025 · Updated 11 months ago
- FinGPT is an AI language model designed to understand and generate financial content. Built upon the GPT (Generative Pre-trained Transfor… ★12 · Nov 14, 2025 · Updated 5 months ago
- Obsidian plugin to toggle between `lowercase` `UPPERCASE` and `Title Case` ★10 · Sep 10, 2024 · Updated last year
- Natural language browser automation ★630 · Dec 21, 2024 · Updated last year
- The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF. ★69 · May 9, 2023 · Updated 2 years ago
- 🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN ★63,955 · Updated this week
- Personal AI search copilot, open-source Perplexity ★784 · Aug 7, 2025 · Updated 8 months ago
- Awesome list of awesome websites from my bookmarks. Download bookmarks also. ★11 · Jul 29, 2023 · Updated 2 years ago
- 🔥 The Web Data API for AI - Power AI agents with clean web data ★107,713 · Updated this week
- Add Google and Python documentation links to the bottom of exceptions. ★28 · Nov 4, 2023 · Updated 2 years ago
- HTML to Markdown converter and crawler. ★617 · Jan 9, 2024 · Updated 2 years ago
- Turn any webpage into structured data using LLMs ★6,258 · Apr 6, 2026 · Updated last week
- Simple frontend for Google Custom Search Engine ★13 · Apr 17, 2024 · Updated last year
- Integrated LLM-based document and data Q&A with knowledge graph visualization ★24 · Dec 9, 2023 · Updated 2 years ago