butlerlabs / docai
DocAI helps developers quickly build document, image, and text processing pipelines using open source and cloud-based machine learning models for a wide range of applications.
☆20 · Updated 3 years ago
Alternatives and similar repositories for docai
Users who are interested in docai are comparing it to the libraries listed below:
- Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pip… ☆80 · Updated last week
- Universal text classifier for generative models ☆24 · Updated last year
- A chatbot made using the Chatterbot library in Python and locally hosted using Streamlit. Dataset used were collected during ConvAI2 comp… ☆16 · Updated 4 years ago
- Trained BERT and Word2Vec legal clause classifiers for SPACY using the Atticus Project's Open Source Contract Label Corpus ☆13 · Updated 5 years ago
- ☆22 · Updated last year
- ☆53 · Updated last year
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆75 · Updated last year
- Repository for deepdoctection tutorial notebooks ☆48 · Updated last week
- Pandas-LLM ☆46 · Updated 2 years ago
- Evaluation framework for document processing models and services. ☆60 · Updated this week
- DocLLM: A layout-aware generative language model for multimodal document understanding ☆133 · Updated 2 years ago
- CRUD Word documents with Python ☆13 · Updated last month
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ☆32 · Updated 3 months ago
- Multi-threaded matrix multiplication and cosine similarity calculations for dense and sparse matrices. Appropriate for calculating the K … ☆86 · Updated last year
- Benchmarks for Business Document Foundation Models ☆10 · Updated last year
- 🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0. ☆51 · Updated 2 weeks ago
- Code for the EMNLP'24 paper "Learning to Extract Structured Entities Using Language Models" ☆48 · Updated 9 months ago
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model ☆23 · Updated last year
- ☆15 · Updated last year
- ☆12 · Updated last week
- Input text or image, get back matching image fashion results, using Jina, DocArray, and CLIP ☆49 · Updated 3 years ago
- An unofficial Implementation of DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents ☆37 · Updated 2 years ago
- LeXFiles and LegalLAMA: Facilitating English Multinational Legal Language Model Development ☆20 · Updated 2 years ago
- Docutron Toolkit: detection and segmentation analysis for legal data extraction over documents. ☆26 · Updated 2 years ago
- OVALChat is a customizable Web app aimed at conducting user studies with chatbots ☆28 · Updated 2 years ago
- Dataset Viber is your chill repo for data collection, annotation and vibe checks. ☆45 · Updated last year
- ☆39 · Updated last year
- Full-fledged Data Exploration Tool for Label Studio ☆48 · Updated last year
- Search PDFs using Jina, DocArray and Jina Hub ☆57 · Updated 3 years ago
- 🔎 A deep-dive into HyDE for Advanced LLM RAG + 💡 Introducing AutoHyDE, a semi-supervised framework to improve the effectiveness, covera… ☆33 · Updated last year