paulpierre / markdown-crawler
A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file for each page, designed for LLM RAG
☆399 Updated last year
Alternatives and similar repositories for markdown-crawler
Users who are interested in markdown-crawler are comparing it to the libraries listed below.
- Parse PDFs into markdown using Vision LLMs ☆417 Updated 6 months ago
- HTML to Markdown converter and crawler. ☆588 Updated last year
- Yet another open source Perplexity ☆449 Updated 10 months ago
- 90% of what you need for LLM app development. Nothing you don't. ☆265 Updated this week
- Visualize Different Text Splitting Methods ☆285 Updated 7 months ago
- SearchGPT / Perplexity Pages clone, but personalised for you. ☆244 Updated last year
- This is an adapted version of Jina AI's Reader for local deployment using Docker. Convert any URL to an LLM-friendly input with a simp… ☆244 Updated last month
- Structured information extraction from documents ☆317 Updated 11 months ago
- Clone of https://r.jina.ai which is deployable locally ☆48 Updated 11 months ago
- ⚡Chat with GitHub Repo Using 200k context window of Claude instead of RAG!⚡ ☆168 Updated last year
- Easily deployable API to convert PDF to markdown quickly with high accuracy. ☆889 Updated 10 months ago
- Excel spreadsheet crawler and table parser for data extraction and querying ☆153 Updated 5 months ago
- ☆234 Updated 2 months ago
- A Function Calls Proxy for Groq, the fastest AI alive! ☆202 Updated last year
- ScribeWizard: Generate organized notes from audio using Groq, Whisper, and Llama3 ☆494 Updated 2 weeks ago
- Local semantic search. Stupidly simple. ☆434 Updated last year
- A simple Python program to implement the search-extract-summarize flow. ☆270 Updated 2 months ago
- ☆149 Updated last year
- Your first AI prompt engineer ☆406 Updated last month
- Connect and chat with your multiple documents (pdf and txt) through GPT 3.5, GPT-4 Turbo, Claude and Local Open-Source LLMs ☆798 Updated 6 months ago
- Official implement of paper "AutoScraper: A Progressive Understanding Web Agent for Web Scraper Generation" [EMNLP 24'] ☆475 Updated 7 months ago
- ☆89 Updated last year
- A cool AI Diagram generator from a given topic, that streams the partial diagrams from the incomplete JSONs during generation. Built usin… ☆213 Updated last year
- No-code ETL and data pipelines with AI and NLP ☆315 Updated 6 months ago
- For LLMs to better code with Jina API ☆165 Updated last month
- An experimental UI for text-to-knowledge-graph generation ☆778 Updated last year
- This project enhances the construction of RAG applications by addressing challenges, improving accessibility, scalability, and managing d… ☆146 Updated last year
- Super performant RAG pipelines for AI apps. Summarization, Retrieve/Rerank and Code Interpreters in one simple API. ☆382 Updated last year
- ☆215 Updated last year
- Use LLMs to draw concept maps from web pages. ☆96 Updated last year