iamarunbrahma / pdf-to-markdown
Conversion of PDF documents to structured Markdown, optimized for Retrieval Augmented Generation (RAG) and other NLP tasks. Extract text, tables, and images with preserved formatting for enhanced information retrieval and processing.
☆105 Updated last year
Alternatives and similar repositories for pdf-to-markdown
Users who are interested in pdf-to-markdown are comparing it to the libraries listed below.
- Collection of PDF parsing libraries like AI-based docling, claude, openai, gemini, Meta's llama-vision, unstructured-io, and pdfminer, py… ☆159 Updated 4 months ago
- A Python package for converting PDFs to markdown while extracting images and tables, generate descriptive text descriptions for extracted… ☆180 Updated 6 months ago
- ☆74 Updated last year
- A document analysis tool built with Streamlit and Microsoft MarkItDown. Extract and analyze content from multiple document formats with o… ☆64 Updated last year
- An AI interaction tool with RAG hybrid search, conversation context, web content processing and structured data analysis with LLM / GPT ☆210 Updated 6 months ago
- PDF intelligence platform combining IBM Docling for document processing, LlamaIndex for data structuring, and Streamlit for a powerful UI… ☆51 Updated last year
- Chat with PDF files with source highlights ☆150 Updated last year
- A set of reusable AI agents for document processing ☆97 Updated last year
- An Automated AI-Powered Prompt Optimization Framework ☆208 Updated last year
- A Multi-Agent AI Tool that creates beautiful presentations with voice-overs 🎦🔥 ☆184 Updated 10 months ago
- A fun project where I use the power of AI to analyze a PDF. The AI extracts key information based on the user's instructions and selectio… ☆85 Updated last year
- Automatically generate engaging AI podcasts from nothing but an episode title. ☆142 Updated 5 months ago
- Groqqle is a powerful web search and content summarization tool built with Python, leveraging Groq's LLM API for advanced natural languag… ☆148 Updated 10 months ago
- Corrective RAG demo powered by Ollama ☆109 Updated last year
- A simple script that can run in the background, uses the Whisper API to transcribe text into ANY application ☆98 Updated last year
- Graphy v1: A Realtime GraphRAG App using Langchain, Neo4j, GPT-4o, and Streamlit. ☆71 Updated last year
- Like firecrawl.dev but free ☆50 Updated 10 months ago
- Find your files with natural language and ask questions. ☆58 Updated last month
- Reliable RAG setup that uses Semantic Double Merging Chunking from llamaindex, Qdrant Hybrid Search, colBERT for reranking and Google Gem… ☆42 Updated last year
- An advanced retrieval system that combines semantic vector search with token-based search, using contextual chunking and knowledge graphs… ☆45 Updated last year
- Example Pipelines for Open-WebUI ☆83 Updated 10 months ago
- Turn webpages into LLM-friendly input text. Similar to Firecrawl and Jina Reader API. Makes RAG, AI web scraping, image & webpage links extr… ☆270 Updated last month
- Co-create PowerPoint slide decks with AI ☆309 Updated 2 weeks ago
- ☆21 Updated last year
- LangGraph-GUI backend with fastapi ☆61 Updated 2 months ago
- Extract what matters from any media source ☆112 Updated last month
- Workflows are an event-driven, async-first, step-based way to control the execution flow of AI applications like agents. ☆302 Updated this week
- ✨ AI interface for tinkerers (Ollama, Haystack RAG, Python) ☆473 Updated 4 months ago
- Contextual Doc Retrieval is a Python-based system leveraging OpenAI GPT-4o and Cohere for re-ranking and query expansion, combined with B… ☆49 Updated last year
- LLM powered local Search Engine ☆29 Updated last year