dell-research-harvard/effocr

A model(ing framework) for sample efficient OCR

☆65

Alternatives and similar repositories for effocr

Users who are interested in effocr are comparing it to the libraries listed below.

dell-research-harvard / NEWS-COPY
Noise-robust de-duplication at scale
☆19 · Apr 9, 2023 · Updated 3 years ago
Layout-Parser / annotation-service
☆20 · Jul 22, 2021 · Updated 5 years ago
MARXdown / MARXdown.github.io
Initial build of digital edition of Capital Volume 1 using Ed. and hypothes.is
☆13 · Jan 20, 2023 · Updated 3 years ago
ihdia / seamformer
Official repository accompanying the ICDAR 2023 paper
☆14 · Oct 3, 2023 · Updated 2 years ago
ingmarboeschen / JATSdecoder
A text extraction and manipulation toolset for NISO-JATS coded XML files
☆22 · Apr 10, 2026 · Updated 3 months ago
htrc / HTRC-WorksetToolkit
Python SDK for Data API and Solr API access
☆12 · May 3, 2024 · Updated 2 years ago
ottowg / gsap-ner
☆10 · Oct 2, 2024 · Updated last year
Pleias / OCRoscope
Small Python package to measure OCR quality and other related metrics.
☆26 · Feb 19, 2024 · Updated 2 years ago
gowitheflow-1998 / Pixel-Linguist
☆15 · Mar 8, 2024 · Updated 2 years ago
Doreenruirui / ACL2018_Multi_Input_OCR
☆13 · Jun 25, 2019 · Updated 7 years ago
skhiggins / StataTools
My user-written commands/packages for data analysis in Stata
☆21 · Oct 12, 2020 · Updated 5 years ago
Living-with-machines / nnanno
nnanno is a collection of tools that sample, annotate and apply computer vision to the Newspaper Navigator dataset
☆17 · Oct 16, 2024 · Updated last year
sergiocorreia / quipucamayoc
Dev repo for article
☆33 · Mar 14, 2023 · Updated 3 years ago
L597383845 / row-col-table-recognition
Time-series row column classification
☆14 · Jan 7, 2022 · Updated 4 years ago
elliottash / text_ml_course_2018
Slides and Jupyter notebooks for course on text analysis and machine learning for social science
☆26 · Aug 18, 2021 · Updated 4 years ago
rdahis / paper_template
Template repository for research papers.
☆119 · Nov 2, 2022 · Updated 3 years ago
TurkuNLP / bert-eval
☆10 · Oct 15, 2019 · Updated 6 years ago
NathanGodey / headless-lm
Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/…
☆29 · Apr 17, 2024 · Updated 2 years ago
yufanchen96 / RoDLA
RoDLA: Benchmarking the Robustness of Document Layout Analysis Models
☆39 · Mar 26, 2025 · Updated last year
ku21fan / CLL-STR
Cross-lingual learning in scene text recognition (ICASSP2024)
☆19 · Sep 29, 2024 · Updated last year
cisnlp / MEXA
[ACL 2025] 🔍 Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment
☆11 · Apr 6, 2025 · Updated last year
johnning2333 / M2Doc
☆43 · Jun 15, 2024 · Updated 2 years ago
DavisWeaver / crosstalkr
R package for the identification of functionally important subnetworks
☆14 · May 6, 2024 · Updated 2 years ago
catseye / Guten-gutter
Strips boilerplate from Project Gutenberg text files
☆17 · Jul 28, 2021 · Updated 5 years ago
iesl / s-diora
☆12 · Jan 29, 2021 · Updated 5 years ago
dbrainio / CyrillicHandwritingPOC
Repository for contributions to the paper "Data Generation for Post-OCR correction of Cyrillic handwriting"
☆23 · Nov 27, 2023 · Updated 2 years ago
ltgoslo / simple_elmo_training
Minimal code to train ELMo models in recent versions of TensorFlow
☆14 · Jun 16, 2026 · Updated last month
facebookresearch / coocmap
Code for paper "Accessing higher dimensions for unsupervised word translation"
☆23 · Jun 26, 2023 · Updated 3 years ago
UCSC-REAL / FLAT
[ICLR 2025] FLAT: LLM Unlearning via Loss Adjustment with Only Forget Data
☆14 · Feb 26, 2025 · Updated last year
Pinnacle-Technology-Inc / Morelia
Morelia is a free, open-source Python API for Pinnacle Technology devices.
☆11 · Jul 22, 2026 · Updated last week
glenrobson / iiif2annos
OCR IIIF images in a manifest and generate annotations
☆28 · May 1, 2026 · Updated 2 months ago
trusthlt / eacl24-german-legal-questions
Data and code: "Answering legal questions from laymen in German civil law system", Büttner & Habernal, EACL'24
☆17 · Mar 2, 2024 · Updated 2 years ago
coderigo / stata-git
Installs and manages Stata programs tracked as git repositories.
☆11 · Sep 13, 2017 · Updated 8 years ago
JMSLab / eventstudyr
☆28 · Apr 1, 2026 · Updated 3 months ago
cverluise / patentcity
Innovation across ages
☆74 · Mar 5, 2023 · Updated 3 years ago
nytimes / haiti-debt
Historical data on Haiti’s debt payments to France collected by The New York Times.
☆23 · May 20, 2022 · Updated 4 years ago
mattia-decao / hiero-transformer
☆15 · Nov 3, 2024 · Updated last year
phaiptt125 / newspaper_project
A supplementary material to "The Evolution of Work in the United States"
☆12 · Jun 23, 2021 · Updated 5 years ago
muhd-umer / pyramidtabnet
Official PyTorch implementation of PyramidTabNet: Transformer-based Table Recognition in Image-based Documents
☆28 · Jun 8, 2026 · Updated last month