A set of tools to create synthetically-generated data from documents
☆43Aug 15, 2025Updated 7 months ago
Alternatives and similar repositories for docling-sdg
Users that are interested in docling-sdg are comparing it to the libraries listed below.
- Docling workshops☆41Mar 4, 2026Updated 3 weeks ago
- A Java API for Docling☆90Updated this week
- Build document-native LLM applications☆57Sep 11, 2024Updated last year
- ☆194Mar 9, 2026Updated 2 weeks ago
- Docling core data types and transformations☆236Updated this week
- Simple package to extract text with coordinates from programmatic PDFs☆256Mar 9, 2026Updated 2 weeks ago
- MCP server for retrieval augmented thinking and problem solving☆22Aug 13, 2025Updated 7 months ago
- Examples using the Deep Search functionalities☆85Jan 29, 2025Updated last year
- Wrangler Compatible Cloudflare Deployment API☆19Aug 25, 2025Updated 7 months ago
- Repository for ACL paper: "Statements: Universal Information Extraction from Tables with Large Language Models for ESG KPIs"☆17Jul 1, 2024Updated last year
- An external provider for Llama Stack allowing for the use of RamaLama for inference.☆21Dec 22, 2025Updated 3 months ago
- Transform Claude Code transcript JSONL files into readable terminal and HTML formats.☆70Feb 10, 2026Updated last month
- Framework for deploying configurable AI agents with real-time streaming and tool execution.☆40Sep 18, 2025Updated 6 months ago
- RWKV is a RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best …☆10Nov 3, 2023Updated 2 years ago
- [TACL, EMNLP 2025 Oral] Code, datasets, and checkpoints for the paper "CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Thr…☆34Dec 5, 2025Updated 3 months ago
- Interact with the Deep Search platform for new knowledge explorations and discoveries☆222Jan 24, 2025Updated last year
- ☆23Mar 21, 2025Updated last year
- An MCP server that provides image recognition 👀 capabilities using Anthropic and OpenAI vision APIs☆35Apr 12, 2025Updated 11 months ago
- TOON as DSPy adapter☆25Feb 1, 2026Updated last month
- Get more people involved in open source, and let independent design skills help more developers☆15May 31, 2024Updated last year
- This project makes running the InstructLab large language model (LLM) fine-tuning process easy and flexible on OpenShift☆27Aug 27, 2025Updated 6 months ago
- Scalable Kubernetes-native implementation of the Open Data Fabric protocol for global collaborative data processing☆22Updated this week
- Evaluation framework for document processing models and services.☆66Updated this week
- An LLM based shell assistant that knows your usual shell commands.☆17Jul 18, 2025Updated 8 months ago
- RDP Credential Provider☆12Oct 29, 2025Updated 4 months ago
- ACE (Adaptive Code Evolution) is an AI-powered system for code analysis and optimization.☆12Nov 4, 2025Updated 4 months ago
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents☆52Feb 27, 2025Updated last year
- Boosting Natural Language Generation from Instructions with Meta-Learning☆11Dec 20, 2022Updated 3 years ago
- A simple POC of FastRTC, a framework to use voice mode in python!☆36Apr 7, 2025Updated 11 months ago
- Benchmark dataset for the evaluation of scientific article representations on the task of citation recommendation across various scientif…☆12Oct 21, 2022Updated 3 years ago
- ☆17Jun 8, 2025Updated 9 months ago
- Lightweight piece tokenization library☆12Apr 15, 2024Updated last year
- ☆31Apr 23, 2025Updated 11 months ago
- Building an open-source alternative to v0 and Lovable.☆63Feb 2, 2026Updated last month
- This is the repository for the Master of Science thesis titled "GAN-based Matrix Factorization for Recommender Systems".☆10Aug 10, 2020Updated 5 years ago
- OTEL ingestion running on Cloudflare Workers☆49Apr 8, 2025Updated 11 months ago
- ☆14Apr 23, 2025Updated 11 months ago
- How to embed a SwiftUI view in Notification Content Extension☆10Jun 23, 2021Updated 4 years ago
- ☆15Jun 2, 2025Updated 9 months ago