paulpierre / markdown-crawler
A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file for each page, designed for LLM RAG
☆358 · Updated 6 months ago
Alternatives and similar repositories for markdown-crawler:
Users who are interested in markdown-crawler are comparing it to the libraries listed below.
- Extract structured text from PDFs quickly · ☆418 · Updated this week
- HTML to Markdown converter and crawler. · ☆522 · Updated last year
- ☆207 · Updated 2 months ago
- Parse PDFs into markdown using Vision LLMs · ☆273 · Updated 2 weeks ago
- Visualize Different Text Splitting Methods · ☆223 · Updated last month
- An enterprise-grade AI retriever designed to streamline AI integration into your applications, ensuring cutting-edge accuracy. · ☆279 · Updated this week
- Yet another open source Perplexity · ☆423 · Updated 4 months ago
- ☆648 · Updated last week
- Super performant RAG pipelines for AI apps. Summarization, Retrieve/Rerank and Code Interpreters in one simple API. · ☆356 · Updated 9 months ago
- ☆173 · Updated this week
- Improve your questions! The AI for Inquiry - QuestionImprover Agent is an LLM-driven “tool for thought” designed to enhance the depth and… · ☆141 · Updated this week
- SearchGPT / Perplexity Pages clone, but personalised for you. · ☆235 · Updated 5 months ago
- Structured information extraction from documents · ☆306 · Updated 4 months ago
- An innovative open-source Code Interpreter with (GPT, Gemini, Claude, LLaMa) models. · ☆248 · Updated last week
- Tuning and Evaluation of RAG pipeline. (Automated optimization to be added soon) · ☆262 · Updated 11 months ago
- Human-AI collaboration to produce a news story about a meeting from minutes or transcript · ☆179 · Updated 2 months ago
- Python & JS/TS SDK for running AI-generated code/code interpreting in your AI app · ☆1,485 · Updated this week
- Generic RAG framework to apply the power of LLMs on any given dataset · ☆525 · Updated this week
- Action library for AI Agent · ☆209 · Updated this week
- A simple Python sandbox for helpful LLM data agents · ☆222 · Updated 8 months ago
- ☆265 · Updated 6 months ago
- A Python library powered by Language Models (LLMs) for conversational data discovery and analysis. · ☆521 · Updated this week
- A simple Python program to implement the search-extract-summarize flow. · ☆254 · Updated last month
- 90% of what you need for LLM app development. Nothing you don't. · ☆242 · Updated this week
- Official implementation of the paper "AutoScraper: A Progressive Understanding Web Agent for Web Scraper Generation" [EMNLP '24] · ☆451 · Updated last month
- Detect and extract tables to markdown and csv · ☆726 · Updated 3 weeks ago
- A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks. · ☆244 · Updated this week
- Prompt optimization scratch · ☆619 · Updated last week
- ☆385 · Updated 3 months ago
- RESTai is an AIaaS (AI as a Service) open-source platform. Built on top of LlamaIndex & Langchain. Supports any public LLM supported by L… · ☆411 · Updated this week