butlerlabs / docai
DocAI helps developers quickly build document, image, and text processing pipelines using open-source and cloud-based machine learning models for a wide range of applications.
☆19 Updated last year
Related projects
Alternatives and complementary repositories for docai
- ☆21 Updated 8 months ago
- Repository for deepdoctection tutorial notebooks ☆39 Updated 4 months ago
- ☆11 Updated 6 months ago
- ☆15 Updated 3 years ago
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel… ☆11 Updated 9 months ago
- Lightweight Non-Parametric Embedding Fine-Tuning ☆17 Updated last month
- An unofficial Implementation of DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents ☆33 Updated last year
- Official repository for RAGViz: Diagnose and Visualize Retrieval-Augmented Generation [EMNLP 2024] ☆46 Updated this week
- Visual similarity search engine demo with use of PyTorch Metric Learning and Qdrant ☆10 Updated last year
- Integrate AI-powered Document Analysis Pipelines ☆62 Updated this week
- Scripts for reading, extracting, and organizing data from either HTML or PDF documents and prepare them to be converted into embeddings f… ☆12 Updated 2 months ago
- Trained BERT and Word2Vec legal clause classifiers for SPACY using the Atticus Project's Open Source Contract Label Corpus ☆11 Updated 3 years ago
- ☆11 Updated last year
- Tool to apply Legal Matter Specification Standard (LMSS) to documents ☆11 Updated 3 months ago
- A Streamlit app for showing a TimelineJS about the history of Natural Language Processing ☆24 Updated last year
- From Dataset Labeling, Entity Extraction to production Knowledge Graph Deployment: The Power of NLP and LLMs Combined. ☆11 Updated 6 months ago
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode… ☆10 Updated 3 months ago
- CTE: Contextualized Table Extraction Dataset ☆17 Updated last year
- A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO… ☆44 Updated 3 months ago
- ☆13 Updated last year
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆68 Updated last month
- Implementation of the DocLLM paper for Llama models. ☆12 Updated 3 weeks ago
- ☆20 Updated 9 months ago
- Chat with Qwen2-VL. Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ☆9 Updated 2 months ago
- Tools for merging pretrained large language models. ☆19 Updated 5 months ago
- An autonomous Mall assistant that can answer user queries using tools. Powered by LLMs. ☆14 Updated last year
- Integrated LLM-based document and data Q&A with knowledge graph visualization ☆19 Updated 11 months ago
- Chat Complex PDF with Tables Using IBM WatsonX, Langchain and LlamaParser. ☆11 Updated 7 months ago
- Run OCR, extract information from documents and classify them. In addition, annotate documents and build custom NLP and computer vision m… ☆62 Updated last week
- Microsoft Phi 2 Streamlit App, deployed on HuggingFace Spaces is based on the Microsoft Phi 2 small language model (SLM) for text generat… ☆14 Updated 6 months ago