impira / docquery
An easy way to extract information from documents
☆1,717 Updated last year
Related projects
Alternatives and complementary repositories for docquery
- A Repo For Document AI ☆2,591 Updated this week
- Classify and extract structured data with LLMs ☆416 Updated last year
- LLM(😽) ☆1,628 Updated 2 months ago
- Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022 ☆5,858 Updated 4 months ago
- A Bulletproof Way to Generate Structured JSON from Language Models ☆4,464 Updated 8 months ago
- The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact… ☆1,402 Updated 5 months ago
- 🦘 Explore multimedia datasets at scale ☆1,042 Updated last month
- A language for constraint-guided and efficient LLM programming. ☆3,699 Updated 5 months ago
- Open-source natural language enrichments at your fingertips. ☆451 Updated 7 months ago
- Structured and typehinted GPT responses in Python ☆738 Updated 3 months ago
- Prompt Engineering | Prompt Versioning | Use GPT or other prompt based models to get structured output. Join our discord for Prompt-Engin… ☆3,271 Updated 8 months ago
- Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the o… ☆2,318 Updated 4 months ago
- This repo provides the server side code for llmsherpa API to connect. It includes parsers for various file formats. ☆1,101 Updated last month
- Zoomable, animated scatterplots in the browser that scale to over a billion points ☆1,044 Updated last week
- Software that makes labeling PDFs easy. ☆391 Updated 6 months ago
- Seamlessly integrate LLMs as Python functions ☆2,054 Updated this week
- Stealth browsers as a service. Connect your scraper or automation to a fleet of cloud-hosted browsers configured for reliability and stea… ☆2,302 Updated last month
- AI code-writing assistant that understands data content ☆2,244 Updated 9 months ago
- 🦙 Integrating LLMs into structured NLP pipelines ☆1,136 Updated 3 months ago
- docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning. ☆3,883 Updated this week
- 🔥 🔥 🔥 Open Source & AI driven Data Onboarding Platform: Free flatfile.com alternative ☆880 Updated last year
- A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for vario… ☆941 Updated last month
- Blazing fast framework for fine-tuning similarity learning models ☆643 Updated last month
- An open-source ML pipeline development platform ☆974 Updated last month
- A school for camelids ☆1,208 Updated last year
- A tiny nearest-neighbor embedding database built with SQLite and Pytorch. (In development!) ☆775 Updated last year
- 🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based ☆298 Updated last year
- ☆2,140 Updated 2 months ago
- A curated list of resources for Document Understanding (DU) topic ☆1,310 Updated last year
- [ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings ☆1,869 Updated 2 months ago