paulpierre / markdown-crawler
A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file for each page, designed for LLM RAG
★ 311 · Updated 3 months ago
Related projects
Alternatives and complementary repositories for markdown-crawler
- ★ 182 · Updated this week
- Visualize Different Text Splitting Methods · ★ 199 · Updated this week
- ★ 111 · Updated 4 months ago
- Extract structured text from pdfs quickly · ★ 342 · Updated this week
- SearchGPT / Perplexity Pages clone, but personalised for you. · ★ 219 · Updated 2 months ago
- HTML to Markdown converter and crawler. · ★ 492 · Updated 10 months ago
- This is an adapted version of Jina AI's Reader for local deployment using Docker. Convert any URL to an LLM-friendly input with a simp… · ★ 58 · Updated last month
- Prompt optimization scratch · ★ 436 · Updated this week
- Yet another open source Perplexity · ★ 372 · Updated last month
- ★ 80 · Updated 10 months ago
- An enterprise-grade AI retriever designed to streamline AI integration into your applications, ensuring cutting-edge accuracy. · ★ 265 · Updated 2 weeks ago
- The full experience of chatting with your favourite news website. · ★ 109 · Updated 11 months ago
- 90% of what you need for LLM app development. Nothing you don't. · ★ 79 · Updated this week
- Super performant RAG pipelines for AI apps. Summarization, Retrieve/Rerank and Code Interpreters in one simple API. · ★ 342 · Updated 6 months ago
- Awesome Devin-inspired AI agents · ★ 145 · Updated 6 months ago
- Improve your questions! The AI for Inquiry - QuestionImprover Agent is an LLM-driven “tool for thought” designed to enhance the depth and… · ★ 138 · Updated last month
- ★ 240 · Updated last year
- ★ 253 · Updated 5 months ago
- Your first AI prompt engineer · ★ 343 · Updated 2 weeks ago
- Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages. · ★ 279 · Updated this week
- Open-source RAG evaluation through users' feedback · ★ 161 · Updated 7 months ago
- Structured information extraction from documents · ★ 283 · Updated last month
- openperplex is an opensource AI search engine · ★ 160 · Updated 3 months ago
- Tuning and Evaluation of RAG pipeline. (Automated optimization to be added soon) · ★ 262 · Updated 8 months ago
- A memory framework for Large Language Models and Agents. · ★ 162 · Updated 3 months ago
- Excel spreadsheet crawler and table parser for data extraction and querying · ★ 116 · Updated last month
- openperplex is an opensource AI search engine · ★ 762 · Updated 3 months ago
- This project enhances the construction of RAG applications by addressing challenges, improving accessibility, scalability, and managing d… · ★ 137 · Updated 7 months ago
- Multi-agent that helps you organize and write documents. · ★ 283 · Updated last week
- ★ 251 · Updated 3 months ago