A model(ing framework) for sample efficient OCR
☆64Apr 7, 2023Updated 3 years ago
Alternatives and similar repositories for effocr
Users that are interested in effocr are comparing it to the libraries listed below.
- Noise-robust de-duplication at scale☆19Apr 9, 2023Updated 3 years ago
- ☆20Jul 22, 2021Updated 4 years ago
- Uncover old Chinese textual parallels based on sound☆16Apr 27, 2026Updated last week
- Official repository accompanying the ICDAR 2023 paper☆13Oct 3, 2023Updated 2 years ago
- The official GitHub for the American Stories dataset as in {link}☆133Mar 7, 2024Updated 2 years ago
- Initial build of digital edition of Capital Volume 1 using Ed. and hypothes.is☆13Jan 20, 2023Updated 3 years ago
- Cross-lingual learning in scene text recognition (ICASSP2024)☆18Sep 29, 2024Updated last year
- A context-based spellchecker for correcting OCR output.☆21Feb 3, 2023Updated 3 years ago
- Small Python package to measure OCR quality and other related metrics.☆27Feb 19, 2024Updated 2 years ago
- Template for research repository using SCons.☆14Updated this week
- Website for Harvard's Gov 50 in Fall 2023☆13Dec 5, 2023Updated 2 years ago
- ☆15Mar 8, 2024Updated 2 years ago
- This is an R wrapper for the APIs on the Government of India's open data platform, data.gov.in.☆18Sep 22, 2024Updated last year
- dev repo for article☆33Mar 14, 2023Updated 3 years ago
- nnanno is a collection of tools that sample, annotate and apply computer vision to the Newspaper Navigator dataset☆17Oct 16, 2024Updated last year
- time-series row column classification☆14Jan 7, 2022Updated 4 years ago
- Chinese character variant converter. 中文异体字转换器。☆22Oct 17, 2025Updated 6 months ago
- Detect and align similar passages☆121Apr 27, 2026Updated last week
- Slides and Jupyter notebooks for course on text analysis and machine learning for social science☆26Aug 18, 2021Updated 4 years ago
- ☆10Oct 15, 2019Updated 6 years ago
- Template repository for research papers.☆117Nov 2, 2022Updated 3 years ago
- Source code for the paper "Post-OCR Document Correction with Large Ensembles of Character Sequence-to-Sequence Models"☆39Dec 2, 2023Updated 2 years ago
- Discover internal APIs from any website. Captures XHR/fetch calls, extracts auth headers, outputs structured endpoint catalogs. Like open…☆33Feb 5, 2026Updated 3 months ago
- RoDLA: Benchmarking the Robustness of Document Layout Analysis Models☆39Mar 26, 2025Updated last year
- ☆14Nov 3, 2024Updated last year
- ☆42Jun 15, 2024Updated last year
- ☆24Jul 25, 2024Updated last year
- Korean politics data for research and development.☆12Jun 21, 2016Updated 9 years ago
- Strips boilerplate from Project Gutenberg text files☆17Jul 28, 2021Updated 4 years ago
- Repository for contributions for Data Generation for Post-OCR correction of Cyrillic handwriting paper☆21Nov 27, 2023Updated 2 years ago
- Minimal code to train ELMo models in recent versions of TensorFlow☆14Apr 30, 2023Updated 3 years ago
- Resources related to EMNLP 2021 paper "FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations"☆13Dec 14, 2021Updated 4 years ago
- Code for paper "Accessing higher dimensions for unsupervised word translation"☆22Jun 26, 2023Updated 2 years ago
- Portal for the course "Economic Slack" at UCSC [ECON 221]☆38Dec 13, 2025Updated 4 months ago
- [ICLR 2025] FLAT: LLM Unlearning via Loss Adjustment with Only Forget Data☆14Feb 26, 2025Updated last year
- This file maps a given list of company names to their proper website and also maps a given list of websites to the company name.☆15Nov 16, 2018Updated 7 years ago
- This repository is part of an NLP course for humanities and cultural studies. This course uses historical newspapers as a source and appl…☆20Jun 5, 2025Updated 11 months ago
- ☆28Apr 1, 2026Updated last month
- Various packages in use at Nori☆13Apr 29, 2024Updated 2 years ago