NanoNets / docext
An on-premises, OCR-free unstructured data extraction and benchmarking toolkit. (https://idp-leaderboard.org/)
☆508Updated last week
Alternatives and similar repositories for docext
Users who are interested in docext are comparing it to the libraries listed below.
- python package to parse pdfs with different parsers☆159Updated 5 months ago
- SmolDocling OCR App built using SmolDocling 256M Model and Streamlit.☆145Updated 2 months ago
- https://no-ocr.com/about☆131Updated 4 months ago
- A Docker-powered service for PDF document layout analysis. This service provides a powerful and flexible PDF analysis service. The servic…☆588Updated 3 weeks ago
- OpenAI DeepResearch alternative, An AI-driven research system that performs comprehensive, iterative research on any topic using multiple…☆609Updated last month
- The open-source RAG platform☆180Updated this week
- 🤖 A visualization Model Context Protocol server for generating visual charts using @antvis.☆668Updated this week
- [ACL 2025 Demo] Repository for the demo and paper: ReasonGraph: Visualisation of Reasoning Paths☆483Updated last week
- A simple agent framework that's capable of browser use + mcp + auto instrument + plan + deep research + more☆229Updated this week
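- 📝 Extracts content from document images and reproduces each image one-to-one as Word or Txt output for further use or processing. Planned support: PDF/image input with output in the corresponding JSON, Txt, Word, and Markdown formats.☆194Updated 7 months ago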
- The world's first Full-Stack Open-Source General AI Agent☆158Updated this week
- Convert Everything to PDF☆143Updated 3 weeks ago
- Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)☆645Updated 2 weeks ago
- Generate Web Pages and Components with text prompts, with Local Models. (or Cloud Models, if you want) - now supports Thinking Models!☆155Updated 3 weeks ago
- Parse PDFs into markdown using Vision LLMs☆376Updated 3 months ago
- Speech to Text but with all the bells and whistles and most importantly AI! AI will clean up your filler words, edit and will refine what…☆315Updated 3 months ago
- Speakr is a personal, self-hosted web application designed for transcribing audio recordings☆570Updated this week
- ☆116Updated last month
- A unified hub server that organizes multiple MCP servers into distinct streamable HTTP (SSE) endpoints☆387Updated this week
- ☆1,538Updated 2 months ago
- ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents☆479Updated 2 months ago
- Library for model distillation☆142Updated 3 months ago
- Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet. Powered by Vercel…☆121Updated 3 months ago
- A General-Purpose AI Agent ✨☆348Updated this week
- Secretary is an AI-powered tool that analyzes social media content from specified accounts and delivers results via WeChat. It supports c…☆319Updated 2 weeks ago
- ContextGem: Effortless LLM extraction from documents☆1,050Updated this week
- ☆458Updated 2 months ago
- Unsloth Fine-tuning Notebooks for Google Colab, Kaggle, Hugging Face and more.☆347Updated this week
- Automate desktop apps like a browser. AI-native GUI automation for Windows. Fast, reliable, agent-ready.☆522Updated this week
- recursive rag with r1 reasoning☆307Updated last week