A model(ing framework) for sample efficient OCR
☆64Apr 7, 2023Updated 2 years ago
Alternatives and similar repositories for effocr
Users that are interested in effocr are comparing it to the libraries listed below
- Noise-robust de-duplication at scale☆19Apr 9, 2023Updated 2 years ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/…☆28Apr 17, 2024Updated last year
- ☆10Oct 2, 2024Updated last year
- Official repository accompanying the ICDAR 2023 paper☆13Oct 3, 2023Updated 2 years ago
- HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation [ACL 2023]☆14Jul 11, 2023Updated 2 years ago
- Data and code: "Answering legal questions from laymen in German civil law system", Büttner & Habernal, EACL'24☆14Mar 2, 2024Updated 2 years ago
- 🔍 Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment☆11Apr 6, 2025Updated 11 months ago
- Uncover Old Chinese textual parallels based on sound☆15Feb 23, 2026Updated last week
- T-Projection is a method to perform high-quality Annotation Projection of Sequence Labeling datasets.☆13Nov 21, 2023Updated 2 years ago
- Cross-lingual learning in scene text recognition (ICASSP2024)☆18Sep 29, 2024Updated last year
- ☆15Mar 8, 2024Updated last year
- Resources related to EMNLP 2021 paper "FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations"☆13Dec 14, 2021Updated 4 years ago
- time-series row column classification☆14Jan 7, 2022Updated 4 years ago
- Minimal code to train ELMo models in recent versions of TensorFlow☆14Apr 30, 2023Updated 2 years ago
- Named Entity Recognition☆19Feb 13, 2026Updated 3 weeks ago
- Repository for contributions for Data Generation for Post-OCR correction of Cyrillic handwriting paper☆21Nov 27, 2023Updated 2 years ago
- Source code for the paper "Post-OCR Document Correction with Large Ensembles of Character Sequence-to-Sequence Models"☆39Dec 2, 2023Updated 2 years ago
- Layout Analysis Dataset with Segmonto (LADaS)☆24Jul 12, 2025Updated 7 months ago
- TensorFlow port of Single Headed Attention RNN☆16Feb 1, 2020Updated 6 years ago
- Code for the AAAI 2023 paper “Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models”☆18Dec 6, 2022Updated 3 years ago
- ☆40Jun 15, 2024Updated last year
- A context-based spellchecker for correcting OCR output.☆21Feb 3, 2023Updated 3 years ago
- Deep Learning Paper Implementations in PyTorch☆18Mar 26, 2025Updated 11 months ago
- My NER Experiments with ModernBERT and Ettin☆26Jul 17, 2025Updated 7 months ago
- Multilingual Open Text☆25May 8, 2025Updated 9 months ago
- Supercharge Your PyTorch Image Models: Bag of Tricks to 8x Faster Inference with ONNX Runtime & Optimizations☆24Oct 4, 2024Updated last year
- A collection of various LLM sampling methods implemented in pure PyTorch☆27Dec 9, 2024Updated last year
- Official PyTorch implementation of PyramidTabNet: Transformer-based Table Recognition in Image-based Documents☆28Oct 5, 2024Updated last year
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"☆59Jan 12, 2023Updated 3 years ago
- Uses Swin-Unet (Swin Transformer Unet) to recognize table structure in document images☆28Feb 23, 2024Updated 2 years ago
- [CVPR'24] Handwritten Mathematical Expressions Generation (HMEG)☆30Jun 3, 2024Updated last year
- A collection of notebooks for Natural Language Processing☆25Jan 13, 2025Updated last year
- [NeurIPS 2025] MergeBench: A Benchmark for Merging Domain-Specialized LLMs☆43Feb 11, 2026Updated 3 weeks ago
- German Parliamentary Corpus (GerParCor)☆30Jan 14, 2026Updated last month
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models"☆28Oct 3, 2021Updated 4 years ago
- 🚀🤗 A collection of templates for Hugging Face Spaces☆35Oct 9, 2023Updated 2 years ago
- LTG-Bert☆34Jan 8, 2024Updated 2 years ago
- Template repository for research papers.☆116Nov 2, 2022Updated 3 years ago
- Document Layout Analysis☆398Updated this week