dell-research-harvard / effocr
A model(ing framework) for sample-efficient OCR
☆64 Apr 7, 2023 Updated 2 years ago
Alternatives and similar repositories for effocr
Users who are interested in effocr compare it to the repositories listed below.
- Noise-robust de-duplication at scale ☆19 Apr 9, 2023 Updated 2 years ago
- ☆10 Oct 2, 2024 Updated last year
- Data and code: "Answering legal questions from laymen in German civil law system", Büttner & Habernal, EACL'24 ☆13 Mar 2, 2024 Updated last year
- Official repository accompanying the ICDAR 2023 paper ☆13 Oct 3, 2023 Updated 2 years ago
- ☆10 Oct 15, 2019 Updated 6 years ago
- 🔍 Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment ☆11 Apr 6, 2025 Updated 10 months ago
- Uncover old Chinese textual parallels based on sound ☆15 Feb 3, 2026 Updated last week
- Cross-lingual learning in scene text recognition (ICASSP2024) ☆18 Sep 29, 2024 Updated last year
- T-Projection is a method to perform high-quality Annotation Projection of Sequence Labeling datasets. ☆13 Nov 21, 2023 Updated 2 years ago
- ☆15 Mar 8, 2024 Updated last year
- Repository for contributions for Data Generation for Post-OCR correction of Cyrillic handwriting paper ☆20 Nov 27, 2023 Updated 2 years ago
- time-series row column classification ☆14 Jan 7, 2022 Updated 4 years ago
- ☆20 Jul 22, 2021 Updated 4 years ago
- Named Entity Recognition ☆18 Apr 9, 2025 Updated 10 months ago
- Source code for the paper "Post-OCR Document Correction with Large Ensembles of Character Sequence-to-Sequence Models" ☆39 Dec 2, 2023 Updated 2 years ago
- RoDLA: Benchmarking the Robustness of Document Layout Analysis Models ☆38 Mar 26, 2025 Updated 10 months ago
- An extension of the Transformers library to include a T5ForSequenceClassification class. ☆40 Apr 17, 2023 Updated 2 years ago
- Layout Analysis Dataset with Segmonto (LADaS) ☆23 Jul 12, 2025 Updated 7 months ago
- Code for AAAI 2023 Paper: “Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models” ☆18 Dec 6, 2022 Updated 3 years ago
- ☆40 Jun 15, 2024 Updated last year
- Chinese character variant converter. 中文异体字转换器。 ☆21 Oct 17, 2025 Updated 3 months ago
- Temporarily remove unused tokens during training to save RAM and improve speed. ☆23 Jun 15, 2025 Updated 7 months ago
- A context-based spellchecker for correcting OCR output. ☆21 Feb 3, 2023 Updated 3 years ago
- My NER Experiments with ModernBERT and Ettin ☆26 Jul 17, 2025 Updated 6 months ago
- A collection of various LLM sampling methods implemented in pure Pytorch ☆26 Dec 9, 2024 Updated last year
- Official PyTorch implementation of PyramidTabNet: Transformer-based Table Recognition in Image-based Documents ☆28 Oct 5, 2024 Updated last year
- Supercharge Your PyTorch Image Models: Bag of Tricks to 8x Faster Inference with ONNX Runtime & Optimizations ☆24 Oct 4, 2024 Updated last year
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?" ☆59 Jan 12, 2023 Updated 3 years ago
- Swin-Unet (Swin Transformer Unet) is used to identify table structure in document images ☆28 Feb 23, 2024 Updated last year
- Slides and Jupyter notebooks for a course on text analysis and machine learning for social science ☆26 Aug 18, 2021 Updated 4 years ago
- [NeurIPS 2025] MergeBench: A Benchmark for Merging Domain-Specialized LLMs ☆41 Jan 27, 2026 Updated 2 weeks ago
- German Parliamentary Corpus (GerParCor) ☆27 Jan 14, 2026 Updated 3 weeks ago
- LTG-Bert ☆34 Jan 8, 2024 Updated 2 years ago
- 🚀🤗 A collection of templates for Hugging Face Spaces ☆35 Oct 9, 2023 Updated 2 years ago
- Document Layout Analysis ☆395 Updated this week
- ☆12 Nov 3, 2024 Updated last year
- SeeGULL is a broad-coverage stereotype dataset in English containing stereotypes about identity groups spanning 178 countries across 8 di… ☆38 Sep 25, 2023 Updated 2 years ago
- [NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation) ☆34 Aug 6, 2023 Updated 2 years ago
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by… ☆34 Aug 24, 2024 Updated last year