paulpierre / markdown-crawler
A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file for each page, designed for LLM RAG
★391 · Updated 10 months ago
Alternatives and similar repositories for markdown-crawler
Users who are interested in markdown-crawler are comparing it to the libraries listed below.
- Parse PDFs into markdown using Vision LLMs. ★393 · Updated 4 months ago
- An experiment in meeting transcription and diarization with just an LLM. Maybe I went a little overboard though. ★553 · Updated 3 weeks ago
- HTML to Markdown converter and crawler. ★571 · Updated last year
- Structured information extraction from documents. ★315 · Updated 9 months ago
- SearchGPT / Perplexity Pages clone, but personalised for you. ★243 · Updated 9 months ago
- 🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with DuckDB or PostgreSQL. ★1,021 · Updated last week
- ★268 · Updated last year
- Extract structured text from PDFs quickly. ★497 · Updated 2 weeks ago
- A cool AI Diagram generator from a given topic, that streams the partial diagrams from the incomplete JSONs during generation. Built usin… ★211 · Updated last year
- Yet another open source Perplexity. ★447 · Updated 8 months ago
- Generic RAG framework to apply the power of LLMs on any given dataset. ★629 · Updated last week
- openperplex is an open-source AI search engine. ★864 · Updated 10 months ago
- Super performant RAG pipelines for AI apps. Summarization, Retrieve/Rerank and Code Interpreters in one simple API. ★377 · Updated last year
- The simplest open-source implementation of perplexity.ai. ★313 · Updated 5 months ago
- Visualize Different Text Splitting Methods. ★263 · Updated 5 months ago
- Easily deployable API to convert PDF to markdown quickly with high accuracy. ★870 · Updated 8 months ago
- Detect and extract tables to markdown and CSV. ★749 · Updated 5 months ago
- ScribeWizard: Generate organized notes from audio using Groq, Whisper, and Llama3. ★491 · Updated 5 months ago
- High-performance retrieval engine for unstructured data. ★1,419 · Updated last week
- Infinite Bookshelf: Generate entire books in seconds using Groq and Llama3. ★1,310 · Updated 6 months ago
- ★225 · Updated 2 weeks ago
- Local semantic search. Stupidly simple. ★432 · Updated 11 months ago
- A simple Python program to implement the search-extract-summarize flow. ★269 · Updated last week
- A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks. ★328 · Updated 2 weeks ago
- An enterprise-grade AI retriever designed to streamline AI integration into your applications, ensuring cutting-edge accuracy. ★285 · Updated last month
- ★435 · Updated 8 months ago
- Multi-agent that helps you organize and write documents. ★333 · Updated 7 months ago
- An autoagentic AGI that is self-evolving and modular. ★955 · Updated 9 months ago
- Home of the AI workforce - Multi-agent system, AI agents & tools. ★239 · Updated last month
- HawkinsDB is our take on giving AI systems a more human-like way to store and recall information, inspired by how our own brains work. Ba… ★294 · Updated 6 months ago