iamarunbrahma / pdf-to-markdown
Conversion of PDF documents to structured Markdown, optimized for Retrieval Augmented Generation (RAG) and other NLP tasks. Extract text, tables, and images with preserved formatting for enhanced information retrieval and processing.
☆107 Updated last year
Alternatives and similar repositories for pdf-to-markdown
Users who are interested in pdf-to-markdown are comparing it to the libraries listed below.
- An alternative AI assistant for Microsoft Office that works with your favorite LLM API ☆87 Updated this week
- Collection of PDF parsing libraries like AI-based docling, claude, openai, gemini, meta's llama-vision, unstructured-io, and pdfminer, py… ☆165 Updated 5 months ago
- AI Document Assistant ☆89 Updated 7 months ago
- Chat with PDF files with source highlights ☆149 Updated last year
- Turn webpages into LLM-friendly input text. Similar to Firecrawl and Jina Reader API. Makes RAG, AI web scraping, image & webpage links extr… ☆275 Updated 2 weeks ago
- MCP server for enabling LLM applications to perform deep research via the MCP protocol ☆311 Updated 2 months ago
- A Python package for converting PDFs to markdown while extracting images and tables, generating descriptive text descriptions for extracted… ☆188 Updated 6 months ago
- Parse PDFs into markdown using Vision LLMs ☆456 Updated 3 months ago
- A set of reusable AI agents for document processing ☆98 Updated last year
- An advanced retrieval system that combines semantic vector search with token-based search, using contextual chunking and knowledge graphs… ☆45 Updated last year
- PDF intelligence platform combining IBM Docling for document processing, LlamaIndex for data structuring, and Streamlit for a powerful UI… ☆51 Updated last year
- An AI interaction tool with RAG hybrid search, conversation context, web content processing and structured data analysis with LLM / GPT ☆211 Updated 7 months ago
- Reliable RAG setup that uses Semantic Double Merging Chunking from llamaindex, Qdrant Hybrid Search, colBERT for reranking and Google Gem… ☆42 Updated last year
- An agentic AI application that allows you to chat with your papers and also gather information from papers on arXiv and PubMed ☆152 Updated 8 months ago
- Effortlessly extract information from unstructured data with this library, utilizing advanced AI techniques. Compose AI in customizable p… ☆87 Updated last year
- A framework for agentic workflow creation and deployment ☆258 Updated last year
- An Automated AI-Powered Prompt Optimization Framework ☆210 Updated last year
- ☆24 Updated last year
- ☆125 Updated 11 months ago
- A simple script that can run in the background, uses the whisper API to transcribe text into ANY application ☆98 Updated last year
- This is an advanced Python tool that empowers you to effortlessly draft customizable PowerPoint slides using the Generative Pre-trained T… ☆147 Updated last year
- Open Deep Researcher with openai compatible endpoint, now completely local with ollama, local playwright via searxng with citations and p… ☆153 Updated 10 months ago
- A document analysis tool built with Streamlit and Microsoft MarkItDown. Extract and analyze content from multiple document formats with o… ☆64 Updated last year
- Find your files with natural language and ask questions. ☆57 Updated last week
- ☆69 Updated last year
- Groqqle is a powerful web search and content summarization tool built with Python, leveraging Groq's LLM API for advanced natural languag… ☆150 Updated 10 months ago
- ☆106 Updated last week
- A fun project where I use the power of AI to analyze a PDF. The AI extracts key information based on the user's instructions and selectio… ☆85 Updated last year
- This repository contains custom pipelines developed for the OpenWebUI framework, including advanced workflows such as long-term memory fi… ☆84 Updated 8 months ago
- Automatically generate engaging AI podcasts from nothing but an episode title. ☆142 Updated 6 months ago