butlerlabs / docai
DocAI helps developers quickly build document, image and text processing pipelines using open source and cloud-based machine learning models for a wide range of applications.
☆19 Updated 2 years ago
Alternatives and similar repositories for docai:
Users who are interested in docai are comparing it to the libraries listed below.
- ☆22 Updated 11 months ago
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model ☆22 Updated 4 months ago
- Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pip… ☆65 Updated this week
- Repository for deepdoctection tutorial notebooks ☆42 Updated 2 months ago
- ☆15 Updated 3 years ago
- An unofficial implementation of DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents ☆36 Updated last year
- Microsoft Phi 2 Streamlit App, deployed on HuggingFace Spaces, is based on the Microsoft Phi 2 small language model (SLM) for text generat… ☆14 Updated 9 months ago
- DocLLM: A layout-aware generative language model for multimodal document understanding ☆119 Updated last year
- Trained BERT and Word2Vec legal clause classifiers for spaCy using the Atticus Project's Open Source Contract Label Corpus ☆14 Updated 4 years ago
- Pandas-LLM ☆36 Updated last year
- Question-answering dataset generator for visual documents in English and Chinese ☆24 Updated last year
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode… ☆11 Updated 6 months ago
- Automated PDF and text processing with spaCy and NLTK; information extraction from text based on grammatical structure; deployed on extra… ☆16 Updated 2 years ago
- This repository contains all my experiments with the LayoutLMv3 model. ☆29 Updated 2 years ago
- Tool to take your ML model from local to production with one line of code. ☆25 Updated last year
- Nougat is Meta AI's OCR model designed to transcribe scientific PDFs into an easy-to-use Markdown format. ☆22 Updated last year
- GLiNER model in a FastAPI microservice. ☆36 Updated 2 months ago
- 💙 Unstructured Data Connectors for Haystack 2.0 ☆16 Updated last year
- ☆11 Updated last year
- Pipeline for converting PDFs to raw text with PaddleOCR ☆21 Updated last year
- Docutron Toolkit: detection and segmentation analysis for legal data extraction over documents. ☆25 Updated last year
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel… ☆13 Updated last year
- Luann allows you to create an LLM agent, which has a complete memory module (long-term memory, short-term memory) and knowledge module (Variou… ☆18 Updated last week
- From Dataset Labeling, Entity Extraction to production Knowledge Graph Deployment: The Power of NLP and LLMs Combined. ☆12 Updated 9 months ago
- Tools for merging pretrained large language models. ☆19 Updated 8 months ago
- ☆12 Updated 9 months ago
- A chatbot made using the Chatterbot library in Python and locally hosted using Streamlit. The datasets used were collected during ConvAI2 comp… ☆15 Updated 3 years ago
- Explore the use of DSPy for extracting features from PDFs 🔎 ☆38 Updated 11 months ago
- Medical Mixture of Experts LLM using Mergekit. ☆20 Updated 11 months ago