Pleias / OCRoscope
Small Python package to measure OCR quality and other related metrics.
☆27 · Feb 19, 2024 · Updated last year
Alternatives and similar repositories for OCRoscope
Users who are interested in OCRoscope are comparing it to the libraries listed below.
- ☆10 · Oct 2, 2024 · Updated last year
- This repository is part of an NLP course for humanities and cultural studies. This course uses historical newspapers as a source and appl… ☆19 · Jun 5, 2025 · Updated 8 months ago
- Exploring some issues related to churn ☆17 · Mar 19, 2024 · Updated last year
- The official repository for Toxic Commons and Celadon. Toxicity Classification for public domain data. ☆22 · Nov 10, 2024 · Updated last year
- Data and code: "Answering legal questions from laymen in German civil law system", Büttner & Habernal, EACL'24 ☆13 · Mar 2, 2024 · Updated last year
- Official codebase for ICML 2025 paper "Core Knowledge Deficits in Multi-Modal Language Models" ☆21 · Oct 1, 2025 · Updated 4 months ago
- T-Projection is a method to perform high-quality Annotation Projection of Sequence Labeling datasets. ☆13 · Nov 21, 2023 · Updated 2 years ago
- PathPiece tokenizer ☆13 · Nov 10, 2024 · Updated last year
- Experiment on metadata extraction using large language models such as GPT-3 ☆12 · Feb 1, 2023 · Updated 3 years ago
- ☆162 · Dec 2, 2024 · Updated last year
- Named Entity Recognition ☆18 · Apr 9, 2025 · Updated 10 months ago
- Noise-robust de-duplication at scale ☆19 · Apr 9, 2023 · Updated 2 years ago
- Layout Analysis Dataset with Segmonto (LADaS) ☆23 · Jul 12, 2025 · Updated 7 months ago
- SeqScore: Scoring for named entity recognition and other sequence labeling tasks ☆23 · Dec 16, 2025 · Updated last month
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/… ☆28 · Apr 17, 2024 · Updated last year
- Python library to use Pleias-RAG models ☆68 · May 1, 2025 · Updated 9 months ago
- The official repo for "Unified Domain Adaptive Semantic Segmentation" (IEEE TPAMI 2025) ☆33 · Aug 14, 2025 · Updated 5 months ago
- Multilingual Open Text ☆25 · May 8, 2025 · Updated 9 months ago
- Nearly Inference Free Embeddings: make your RAG queries 500x faster ☆70 · Updated this week
- Truly flash implementation of DeBERTa disentangled attention mechanism. ☆76 · Jan 26, 2026 · Updated 2 weeks ago
- Implementation of Nested Named Entity Recognition using Flair ☆24 · Oct 29, 2021 · Updated 4 years ago
- METS/ALTO OCR enhancing tool by the National Library of Luxembourg (BnL) ☆56 · May 30, 2023 · Updated 2 years ago
- Code and data for the paper "Temporal Attention for Language Models", Findings of NAACL 2022 ☆26 · Apr 14, 2023 · Updated 2 years ago
- LTG-Bert ☆34 · Jan 8, 2024 · Updated 2 years ago
- SeeGULL is a broad-coverage stereotype dataset in English containing stereotypes about identity groups spanning 178 countries across 8 di… ☆38 · Sep 25, 2023 · Updated 2 years ago
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2… ☆70 · Jan 27, 2023 · Updated 3 years ago
- TSheets API Documentation ☆12 · Jan 18, 2024 · Updated 2 years ago
- High Security Surveillance Camera using OpenCV, Python & Arduino ☆12 · Jun 20, 2020 · Updated 5 years ago
- Code for the ACL 2022 paper "Contextual Representation Learning beyond Masked Language Modeling" ☆33 · Oct 23, 2022 · Updated 3 years ago
- [NeurIPS 2025] Let LRMs Break Free from Overthinking via Self-Braking Tuning. https://arxiv.org/abs/2505.14604 ☆55 · Nov 4, 2025 · Updated 3 months ago
- ☆75 · Jul 2, 2021 · Updated 4 years ago
- Embedding Recycling for Language models ☆38 · Jul 11, 2023 · Updated 2 years ago
- An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline. ☆86 · Apr 21, 2021 · Updated 4 years ago
- Sonos plugin for homebridge: https://github.com/nfarina/homebridge ☆14 · Apr 23, 2017 · Updated 8 years ago
- Old book pages (with groundtruth), formerly used for OCR studies. There are several versions of the set (concerning resolution and binari… ☆15 · Aug 25, 2017 · Updated 8 years ago
- Automated Quality Control for Dialogflow CX Agents ☆14 · May 3, 2024 · Updated last year
- ☆11 · Nov 10, 2020 · Updated 5 years ago
- Named entity recognition for the legal domain ☆43 · Jun 1, 2021 · Updated 4 years ago
- ☆15 · Aug 8, 2024 · Updated last year