☆206 Mar 1, 2026 Updated 3 weeks ago
Alternatives and similar repositories for unstructured-inference
Users who are interested in unstructured-inference are comparing it to the libraries listed below.
- Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines. ☆29 Apr 7, 2023 Updated 2 years ago
- Preprocessing pipeline notebooks and API supporting text extraction from SEC documents ☆149 Jan 1, 2024 Updated 2 years ago
- ☆19 May 23, 2023 Updated 2 years ago
- Pipeline for converting PDFs to raw text with PaddleOCR ☆22 Aug 21, 2023 Updated 2 years ago
- ☆890 Updated this week
- Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean… ☆14,282 Updated this week
- A Python client for the Unstructured Platform API ☆114 Updated this week
- ☆18 Updated this week
- Incredibly descriptive audiovisual summaries for videos ☆41 Aug 2, 2024 Updated last year
- Supercharge huggingface transformers with model parallelism. ☆78 Jul 23, 2025 Updated 7 months ago
- Rust implementation of Surya ☆66 Mar 1, 2025 Updated last year
- ☆22 Apr 14, 2019 Updated 6 years ago
- ☆26 Mar 28, 2025 Updated 11 months ago
- The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF. ☆69 May 9, 2023 Updated 2 years ago
- Apify's reusable GitHub workflows ☆14 Updated this week
- Developer APIs to Accelerate LLM Projects ☆1,744 Oct 18, 2024 Updated last year
- Track and Collaborate on ML & AI Experiments. ☆44 Mar 10, 2025 Updated last year
- Reading order, Layoutreader ☆19 May 8, 2025 Updated 10 months ago
- Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the o… ☆2,878 Jun 24, 2024 Updated last year
- An operations management platform for developers (written in ASP.NET Core Blazor 5) ☆11 Feb 18, 2024 Updated 2 years ago
- A Unified Toolkit for Deep Learning Based Document Image Analysis ☆5,681 Aug 15, 2024 Updated last year
- A Helm chart repo to install persistent BinderHub ☆19 Dec 13, 2022 Updated 3 years ago
- HunyuanVideo: A Systematic Framework For Large Video Generation Model ☆48 Dec 14, 2024 Updated last year
- ☆371 Sep 7, 2025 Updated 6 months ago
- ☆119 Dec 18, 2024 Updated last year
- Test-Time Memory Framework: Control Hallucinations in Foundation Models ☆11 Nov 4, 2025 Updated 4 months ago
- Efficient vector database for hundred millions of embeddings. ☆212 May 17, 2024 Updated last year
- Document Artificial Intelligence ☆202 Sep 28, 2025 Updated 5 months ago
- Dataset for EMNLP'23 Paper "DocTrack: A Visually-Rich Document Dataset Really Aligned with Human Eye Movement for Machine Reading" ☆11 Oct 25, 2023 Updated 2 years ago
- This is an example of creating an AI agent with flowchart ☆12 Jul 22, 2024 Updated last year
- Repository for deepdoctection tutorial notebooks ☆52 Jan 1, 2026 Updated 2 months ago
- ☆40 Sep 26, 2020 Updated 5 years ago
- Stream live plots to a matplotlib figure ☆81 Apr 18, 2025 Updated 11 months ago
- A Python toolkit for analyzing machine learning models and datasets. ☆79 Sep 8, 2023 Updated 2 years ago
- High-Performance Transformers for Table Structure Recognition Need Early Convolutions ☆45 Apr 3, 2024 Updated last year
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor… ☆60 Apr 11, 2024 Updated last year
- Interactive coding assistant for data scientists and machine learning developers, empowered by large language models. ☆99 Oct 8, 2024 Updated last year
- The goal is to pilot Microsoft Cognitive Services to unlock the strategic value of UN unstructured content by building on AI and semantic… ☆16 Jul 6, 2023 Updated 2 years ago
- LLM based agents with proactive interactions, long-term memory, external tool integration, and local deployment capabilities. ☆110 Jul 29, 2025 Updated 7 months ago