docling-project / docling-eval
Evaluation framework for document processing models and services.
☆34Updated this week
Alternatives and similar repositories for docling-eval
Users who are interested in docling-eval are comparing it to the libraries listed below.
- ☆10Updated last year
- Truly flash implementation of DeBERTa disentangled attention mechanism.☆63Updated 2 weeks ago
- Small python package to measure OCR quality and other related metrics.☆25Updated last year
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi…☆33Updated last year
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…☆49Updated last year
- Python library to use Pleias-RAG models☆62Updated 4 months ago
- Code for SaGe subword tokenizer (EACL 2023)☆26Updated 9 months ago
- Starbucks: Improved Training for 2D Matryoshka Embeddings☆21Updated 2 months ago
- ☆49Updated 7 months ago
- Improving Text Embedding of Language Models Using Contrastive Fine-tuning☆64Updated last year
- ☆41Updated 5 months ago
- ☆40Updated last week
- ☆13Updated 9 months ago
- Code for the EMNLP'24 paper "Learning to Extract Structured Entities Using Language Models"☆45Updated 5 months ago
- ☆14Updated 11 months ago
- Pre-train Static Word Embeddings☆85Updated 2 weeks ago
- Seamless interface for using PyTorch distributed with Jupyter notebooks☆50Updated last week
- My NER Experiments with ModernBERT and Ettin☆22Updated 2 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)☆75Updated 11 months ago
- ☆19Updated last month
- Versatile framework designed to streamline the integration of your models, as well as those sourced from Hugging Face, into complex progr…☆33Updated last month
- PyTorch implementation for MRL☆19Updated last year
- Enhanced version of WikiExtractor: a Wikipedia dump extractor☆19Updated this week
- ☆68Updated last month
- Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and te…☆42Updated last year
- PyLate efficient inference engine☆64Updated last week
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training"☆24Updated last week
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆33Updated this week
- Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding.☆42Updated 6 months ago
- ☆22Updated 7 months ago