DS4SD / docling-parse
Simple package to extract text with coordinates from programmatic PDFs
☆27Updated this week
Related projects
Alternatives and complementary repositories for docling-parse
- ☆41Updated this week
- Create fast graph language models from converted PDF documents for knowledge extraction and Q&A.☆23Updated 3 weeks ago
- A python library to define and validate data types in Docling.☆34Updated this week
- Build document-native LLM applications☆51Updated 2 months ago
- Running Docling as an API service☆16Updated last month
- GPT-4 Level Conversational QA Trained In a Few Hours☆55Updated 3 months ago
- A quick and optimized solution to manage llama-based gguf quantized models, download gguf files, retrieve message formatting, add more mo…☆12Updated 10 months ago
- Self-host LLMs with vLLM and BentoML☆74Updated last week
- ☆43Updated 4 months ago
- Examples using the Deep Search functionalities☆47Updated 3 months ago
- This project is a collection of fine-tuning scripts to help researchers fine-tune Qwen 2 VL on HuggingFace datasets.☆48Updated 2 months ago
- Experimental Code for StructuredRAG: Structured Outputs in Retrieval-Augmented Generation☆94Updated this week
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte…☆53Updated 3 weeks ago
- Data preparation code for Amber 7B LLM☆83Updated 6 months ago
- Evaluation of bm42 sparse indexing algorithm☆62Updated 4 months ago
- MLX-Embeddings is the best package for running Vision and Language Embedding models locally on your Mac using MLX.☆77Updated last month
- Vector Database with support for late interaction and token level embeddings.☆54Updated last month
- This public GitHub repository contains code for a fully self-hosted, on-premise transcription solution.☆40Updated 3 weeks ago
- Deploy a light and full OpenAI API for production with vLLM to support /v1/embeddings with all embedding models.☆37Updated 4 months ago
- ☆17Updated 7 months ago
- Implementation of nougat that focuses on processing pdf locally.☆73Updated 6 months ago
- DocLLM: A layout-aware generative language model for multimodal document understanding☆113Updated 10 months ago
- Data preparation code for CrystalCoder 7B LLM☆42Updated 6 months ago
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper.☆132Updated this week
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡☆62Updated 2 weeks ago
- Enhancing Translation with RAG-Powered Large Language Models☆65Updated last month
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆48Updated 4 months ago
- I have explained how to create a superior RAG pipeline for complex PDFs using LlamaParse. We can extract text and tables from PDFs and QA on…☆39Updated 8 months ago
- YOLO models trained on DocLayNet - power your Document Intelligence with Layout Analysis☆69Updated last month
- C++ inference wrappers for running blazing fast embedding services on your favourite serverless like AWS Lambda. By Prithivi Da, PRs welc…☆20Updated 8 months ago