butlerlabs / docai
DocAI helps developers quickly build document, image, and text processing pipelines using open source and cloud-based machine learning models for a wide range of applications.
☆20 Updated 3 years ago
Alternatives and similar repositories for docai
Users that are interested in docai are comparing it to the libraries listed below
- Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pip… ☆80 Updated last week
- Trained BERT and Word2Vec legal clause classifiers for spaCy using the Atticus Project's Open Source Contract Label Corpus ☆13 Updated 5 years ago
- Repository for deepdoctection tutorial notebooks ☆50 Updated 3 weeks ago
- ☆22 Updated last year
- DocLLM: A layout-aware generative language model for multimodal document understanding ☆137 Updated 2 years ago
- ☆15 Updated 4 years ago
- Search PDFs using Jina, DocArray and Jina Hub ☆57 Updated 3 years ago
- CRUD Word documents with Python ☆13 Updated last month
- 🤗Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0. ☆51 Updated last week
- A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO… ☆53 Updated 10 months ago
- High-Performance Transformers for Table Structure Recognition Need Early Convolutions ☆44 Updated last year
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆75 Updated last year
- An unofficial implementation of DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents ☆37 Updated 2 years ago
- OVALChat is a customizable Web app aimed at conducting user studies with chatbots ☆29 Updated 2 years ago
- From Dataset Labeling, Entity Extraction to production Knowledge Graph Deployment: The Power of NLP and LLMs Combined. ☆12 Updated last year
- Evaluation framework for document processing models and services. ☆62 Updated last week
- Input text or image, get back matching image fashion results, using Jina, DocArray, and CLIP ☆49 Updated 3 years ago
- CTE: Contextualized Table Extraction Dataset ☆17 Updated 2 years ago
- ☆96 Updated 5 years ago
- Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks. ☆136 Updated 3 months ago
- ☆11 Updated 2 years ago
- A chatbot made using the Chatterbot library in Python and locally hosted using Streamlit. The datasets used were collected during ConvAI2 comp… ☆16 Updated 4 years ago
- Explore from keyword search to dense retrieval and reranking, which injects the intelligence of LLMs into your search system, making it f… ☆14 Updated 2 years ago
- ☆12 Updated last week
- Multi-threaded matrix multiplication and cosine similarity calculations for dense and sparse matrices. Appropriate for calculating the K … ☆86 Updated last year
- A streaming markdown component for Streamlit with LaTeX, Mermaid, Table, code support. A drop-in replacement for st.markdown. ☆26 Updated 11 months ago
- Universal text classifier for generative models ☆24 Updated last year
- Google Colab Demo of CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents ☆47 Updated 4 years ago
- Unstract's interface to LLMs, Embeddings and VectorDBs. ☆18 Updated last year
- Encountering 14 different Naive RAG failures and using a KG to solve them ☆19 Updated last month