shoryasethia / markdrop
A Python package for converting PDFs to Markdown while extracting images and tables, and for generating text descriptions of the extracted tables and images using several LLM clients, among other features. Markdrop is available on PyPI.
☆121 Updated 3 months ago
Alternatives and similar repositories for markdrop
Users who are interested in markdrop are comparing it to the libraries listed below.
- Reliable RAG setup that uses Semantic Double Merging Chunking from LlamaIndex, Qdrant Hybrid Search, ColBERT for reranking and Google Gem… ☆39 Updated 6 months ago
- WebRAgent is a retrieval-augmented generation (RAG) web application featuring agent-based query decomposition, vector search with Qdrant,… ☆44 Updated 3 months ago
- Open-source AI-powered data science platform. ☆153 Updated this week
- Chat with PDF files with source highlights ☆142 Updated 6 months ago
- An advanced retrieval system that combines semantic vector search with token-based search, using contextual chunking and knowledge graphs… ☆38 Updated 8 months ago
- A discovery and compression tool for your Python codebase. Creates a knowledge graph for an LLM context window, efficiently outlining your… ☆96 Updated 6 months ago
- AI Document Assistant ☆79 Updated 2 weeks ago
- Convert PowerPoint files into semantically rich text using vision language models ☆99 Updated 4 months ago
- Archive Agent is an open-source semantic file tracker with OCR + AI search. ☆26 Updated last month
- An assistant for Slack built with Arcade and LangGraph. Interact with Google Calendar, Mail, GitHub, Search Engines, Firecrawl and more a… ☆94 Updated 2 weeks ago
- An Open Source, Claude Code Like Tool, With RAG + Graph RAG + MCP Integration, and Supports Most LLMs (Incomplete But Functional & Usable… ☆86 Updated this week
- Dabarqus is incredibly fast RAG that runs everywhere. ☆60 Updated 4 months ago
- FACT – Fast Augmented Context Tools: FACT is a lean retrieval pattern that skips vector search. We cache every static token inside Claude… ☆63 Updated 3 weeks ago
- Long-Term Memory & Context Management for LLMs ☆58 Updated 2 weeks ago
- Optimize Document Retrieval with Fine-Tuned KnowledgeBases ☆143 Updated 3 months ago
- Open-Source RAG app with LLM Observability (Langfuse), support for 100+ providers (LiteLLM), Dockerized, Full Type-checking, 100% Test co… ☆154 Updated 3 months ago
- Turn topics, links, and files into AI-generated research notebooks: summarize, explore, and ask anything. ☆121 Updated 3 weeks ago
- Contextual Doc Retrieval is a Python-based system leveraging OpenAI GPT-4o and Cohere for re-ranking and query expansion, combined with B… ☆48 Updated 8 months ago
- Generates breakthrough ideas from a single prompt through an 8 stage walkthrough, with optional research proposal paper. ☆56 Updated 3 months ago
- Retrieval-augmented generation (RAG) for remote & local LLM use ☆45 Updated last month
- An MCP server connecting to managed indexes on LlamaCloud ☆78 Updated this week
- Fast local speech-to-text for any app using faster-whisper ☆74 Updated 2 months ago
- MarinaBox is a toolkit for creating and managing secure, isolated environments for AI agents ☆132 Updated 4 months ago
- An OpenSource Deep Research library with reasoning ☆133 Updated 2 weeks ago
- CoexistAI is a modular, developer-friendly research assistant framework. It enables you to build, search, summarize, and automate resear… ☆75 Updated this week
- Model Context Protocol server implementation for Reddit ☆114 Updated last week
- PDF intelligence platform combining IBM Docling for document processing, LlamaIndex for data structuring, and Streamlit for a powerful UI… ☆44 Updated 6 months ago
- A Model Context Protocol (MCP) server that automates generating LinkedIn post drafts from YouTube videos. This server provides high-quali… ☆14 Updated 2 months ago
- ☆101 Updated 7 months ago
- smart-llm-loader is a lightweight yet powerful Python package that transforms any document into LLM-ready chunks. Spend less time on prep… ☆65 Updated 4 months ago