butlerlabs / docai
DocAI helps developers quickly build document, image and text processing pipelines using open source and cloud-based machine learning models for a wide range of applications
☆19 Updated 2 years ago
Alternatives and similar repositories for docai:
Users who are interested in docai are comparing it to the libraries listed below.
- ☆21 Updated 10 months ago
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode… ☆9 Updated 5 months ago
- ☆12 Updated 8 months ago
- An unofficial Implementation of DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents ☆36 Updated last year
- Trained BERT and Word2Vec legal clause classifiers for SPACY using the Atticus Project's Open Source Contract Label Corpus ☆14 Updated 4 years ago
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model ☆20 Updated 3 months ago
- ☆11 Updated last year
- Automated PDF and text processing with Spacy and NLTK; information extraction from text based on grammatical structure; deployed on extra… ☆16 Updated 2 years ago
- Repository for deepdoctection tutorial notebooks ☆40 Updated last month
- AI_Powered_Dev_Search_Engine ☆12 Updated 10 months ago
- Chat Complex PDF with Tables Using IBM WatsonX, Langchain and LlamaParser. ☆11 Updated 8 months ago
- The collection of building blocks for building fine-tunable metric learning models ☆32 Updated last week
- Run OCR, extract information from documents and classify them. In addition, annotate documents and build custom NLP and computer vision m… ☆61 Updated this week
- Pandas-LLM ☆35 Updated last year
- Unstract's interface to LLMs, Embeddings and VectorDBs. ☆18 Updated 5 months ago
- ☆11 Updated last year
- Controllable-RAG-Agent using Langgraph ☆12 Updated 5 months ago
- Pipeline for converting PDFs to raw text with PaddleOCR ☆21 Updated last year
- a streaming markdown component for streamlit with LaTeX, Mermaid, Table, code support. A drop-in replacement for st.markdown. ☆13 Updated 3 months ago
- Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pip… ☆63 Updated this week
- Contains Google Colab or Jupyter notebooks, as well as other associated files for my Medium blogposts. ☆34 Updated 7 months ago
- ☆15 Updated 3 years ago
- 🚀 Scale your RAG pipeline using Ragswift: A scalable centralized embeddings management platform ☆37 Updated 11 months ago
- 💙 Unstructured Data Connectors for Haystack 2.0 ☆16 Updated last year
- A swarm of LLM agents that will help you test, document, and productionize your code! ☆13 Updated this week
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel… ☆13 Updated 11 months ago
- A multi-agent business consultant app on streamlit implemented using crewAI ☆15 Updated 6 months ago
- ☆16 Updated 3 years ago
- Scripts for reading, extracting, and organizing data from either HTML or PDF documents and prepare them to be converted into embeddings f… ☆12 Updated 4 months ago
- Online Inference API for NLP Transformer models - summarization, text classification, sentiment analysis and more ☆43 Updated 10 months ago