butlerlabs / docaiLinks
DocAI helps developers quickly build document, image and text processing pipelines using open source and cloud-based machine learning models for a wide range of applications
☆20Updated 2 years ago
Alternatives and similar repositories for docai
Users that are interested in docai are comparing it to the libraries listed below
Sorting:
- Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pip…☆70Updated this week
- ☆22Updated last year
- OVALChat is a customizable Web app aimed at conducting user studies with chatbots☆28Updated last year
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆22Updated 9 months ago
- Repository for deepdoctection tutorial notebooks☆45Updated last month
- Trained BERT and Word2Vec legal clause classifiers for SPACY using the Atticus Project's Open Source Contract Label Corpus☆14Updated 4 years ago
- Universal text classifier for generative models☆24Updated 11 months ago
- a streaming markdown component for streamlit with LaTeX, Mermaid, Table, code support. A drop-in replacement for st.markdown.☆20Updated 5 months ago
- DocLLM: A layout-aware generative language model for multimodal document understanding☆126Updated last year
- AgentParse is a high-performance parsing library designed to map various structured data formats (such as Pydantic models, JSON, YAML, an…☆13Updated 2 weeks ago
- Query, ask and chat with a document-index via transformer models!☆17Updated 2 years ago
- Scripts for reading, extracting, and organizing data from either HTML or PDF documents and prepare them to be converted into embeddings f…☆13Updated 10 months ago
- Pandas-LLM☆46Updated 2 years ago
- ASTChunk is a Python toolkit for code chunking using Abstract Syntax Trees (ASTs), designed to create structurally sound and meaningful c…☆30Updated 2 weeks ago
- Luann (fka TypeAgent) allows you to create many LLM based agent(Various types of agent,scale up)☆21Updated 2 months ago
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel…☆12Updated last year
- CRUD Word documents with Python☆11Updated 7 months ago
- AI_Powered_Dev_Search_Engine☆12Updated last year
- Lightweight Non-Parametric Embedding Fine-Tuning☆25Updated 9 months ago
- ☆47Updated 9 months ago
- Nougat is a Meta AI's revolutionary OCR model designed to transcribe scientific PDFs into an easy-to-use Markdown format.☆24Updated last year
- Unstract's interface to LLMs, Embeddings and VectorDBs.☆18Updated 11 months ago
- GLiNER model in a FastAPI microservice.☆44Updated 7 months ago
- ☆13Updated 2 months ago
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode…☆13Updated 11 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)☆75Updated 8 months ago
- ☆11Updated 2 years ago
- ☆49Updated last year
- An unofficial Implementation of DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents☆37Updated last year
- ☆13Updated last year