docling-project / docling-sdg
A set of tools to create synthetically-generated data from documents
☆13Updated 3 weeks ago
Alternatives and similar repositories for docling-sdg
Users who are interested in docling-sdg compare it to the libraries listed below
- Create fast graph language models from converted PDF documents for knowledge extraction and Q&A.☆53Updated 4 months ago
- A python library to define and validate data types in Docling.☆137Updated last week
- Build document-native LLM applications☆53Updated 8 months ago
- Simple package to extract text with coordinates from programmatic PDFs☆128Updated this week
- Repository for ACL paper: "Statements: Universal Information Extraction from Tables with Large Language Models for ESG KPIs"☆13Updated 11 months ago
- ☆122Updated this week
- Examples using the Deep Search functionalities☆80Updated 4 months ago
- ☆48Updated this week
- Auto Thinking Mode switch for Qwen3 in Open webui☆62Updated 3 weeks ago
- Try out HallOumi, a state-of-the-art claim verification model in a simple UI!☆34Updated 2 months ago
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte…☆70Updated 7 months ago
- Own your AI, search the web with it🌐😎☆87Updated 4 months ago
- Making docling agentic through MCP☆91Updated last week
- An HTTP service intended as a backend for an LLM that can run arbitrary pieces of Python code.☆60Updated last month
- Mycomind Daemon: A mycelium-inspired, advanced Mixture-of-Memory-RAG-Agents (MoMRA) cognitive assistant that combines multiple AI models …☆32Updated 10 months ago
- ☆39Updated last month
- Very minimal (and stateless) agent framework☆44Updated 4 months ago
- Query Expansion for Better Query Embedding using LLMs☆51Updated 3 months ago
- ☆41Updated 2 months ago
- Train Large Language Models on MLX.☆81Updated this week
- AI tool that annotates research papers and shows related articles and videos for better understanding☆42Updated 2 weeks ago
- I have explained how to create a superior RAG pipeline for complex PDFs using LlamaParse. We can extract text and tables from PDFs and QA on…☆45Updated last year
- Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provi…☆37Updated 2 months ago
- An agent that accelerates scientific research by autonomously analyzing provided datasets, great for generating hypotheses, and validating…☆46Updated last month
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch☆95Updated 5 months ago
- ☆122Updated 3 months ago
- ☆57Updated 3 months ago
- Jina DeepSearch UI☆111Updated this week
- 👷‍♂️ Minion is Agent's Brain. Minion is designed to execute any type of query, offering a variety of features that demonstrate its flex…☆19Updated this week
- A RAG system designed to process documents with multimodal content. It can generate factual, context-aware answers to user queries, based…☆21Updated 5 months ago