butlerlabs / docai
DocAI helps developers quickly build document, image and text processing pipelines using open source and cloud-based machine learning models for a wide range of applications
☆20 Updated 2 years ago
Alternatives and similar repositories for docai:
Users who are interested in docai are comparing it to the repositories listed below
- ☆22 Updated 11 months ago
- Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pip… ☆66 Updated this week
- AI_Powered_Dev_Search_Engine ☆12 Updated 11 months ago
- Repository for deepdoctection tutorial notebooks ☆43 Updated 3 months ago
- Automated PDF and text processing with Spacy and NLTK; information extraction from text based on grammatical structure; deployed on extra… ☆16 Updated 2 years ago
- ☆12 Updated 10 months ago
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel… ☆13 Updated last year
- ☆11 Updated last year
- Visual similarity search engine demo with use of PyTorch Metric Learning and Qdrant ☆12 Updated 2 years ago
- Explore the use of DSPy for extracting features from PDFs 🔎 ☆38 Updated last year
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model ☆23 Updated 5 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆76 Updated 4 months ago
- Microsoft Phi 2 Streamlit App, deployed on HuggingFace Spaces is based on the Microsoft Phi 2 small language model (SLM) for text generat… ☆14 Updated 10 months ago
- Tool to take your ML model from local to production with one line of code. ☆25 Updated last year
- 💙 Unstructured Data Connectors for Haystack 2.0 ☆16 Updated last year
- Pandas-LLM ☆38 Updated last year
- End to End MLOps ☆10 Updated 4 years ago
- ☆14 Updated 8 months ago
- Universal text classifier for generative models ☆22 Updated 7 months ago
- A swarm of LLM agents that will help you test, document, and productionize your code! ☆14 Updated 3 weeks ago
- Nougat is Meta AI's OCR model designed to transcribe scientific PDFs into an easy-to-use Markdown format. ☆22 Updated last year
- DocLLM: A layout-aware generative language model for multimodal document understanding ☆121 Updated last year
- Tools for merging pretrained large language models. ☆19 Updated 8 months ago
- Large-scale query-focused multi-document summarization dataset ☆10 Updated 3 years ago
- Transforming textual descriptions into process models using deep learning ☆13 Updated 5 years ago
- ☆15 Updated 3 years ago
- ☆45 Updated 5 months ago
- Lightweight Non-Parametric Embedding Fine-Tuning ☆23 Updated 5 months ago