Shef-AIRE/llms_post-ocr_correction

Leveraging LLMs for Post-OCR Correction of Historical Newspapers

☆18

Alternatives and similar repositories for llms_post-ocr_correction

Users who are interested in llms_post-ocr_correction are comparing it to the libraries listed below.

transferwise / wise-topic
LLM-only topic extraction and classification
☆11 · Jun 3, 2026 · Updated last month
pmonta / CD-decoder
Decodes Compact Disc data from microscope images of a CD's surface
☆12 · Jan 14, 2023 · Updated 3 years ago
davanstrien / flyswot
Command Line Interface for running 🤗 Transformers Image Classification locally
☆19 · Jul 15, 2026 · Updated last week
kingsdigitallab / kdl-vqa
Python tool for batch visual question answering (BVQA).
☆14 · Sep 18, 2025 · Updated 10 months ago
kmike / dialog2017
☆10 · Jul 21, 2017 · Updated 9 years ago
badrex / rdf2text
Generating text from RDF data with sequence to sequence models
☆11 · Jul 25, 2018 · Updated 7 years ago
mixedbread-ai / wiki_demo_app
☆14 · Jun 25, 2024 · Updated 2 years ago
mixedbread-ai / binary-embeddings
Showcases how mxbai-embed-large-v1 can be used to produce binary embeddings. Binary embeddings enable 32x storage savings and 40x faster r…
☆19 · Mar 23, 2024 · Updated 2 years ago
NTRLab / MediaSpeech
☆22 · Jul 22, 2022 · Updated 4 years ago
Collection-Space-Navigator / CSN
Interactive Visualization Interface for Multidimensional Datasets
☆68 · Nov 11, 2025 · Updated 8 months ago
ltgoslo / simple_elmo_training
Minimal code to train ELMo models in recent versions of TensorFlow
☆14 · Jun 16, 2026 · Updated last month
Living-with-machines / alto2txt
Convert ALTO XML to plain text + minimal metadata
☆17 · Oct 17, 2024 · Updated last year
SapienzaNLP / clubert
Distribution of word meanings in Wikipedia for English, Italian, French, German and Spanish.
☆10 · Jan 4, 2021 · Updated 5 years ago
timarkh / uniparser-grammar-udm
Morphological analysis for Udmurt.
☆12 · May 23, 2026 · Updated 2 months ago
antonisa / embeddings
Data and scripts for the proper evaluation of cross-lingual embeddings in multiple languages
☆15 · Apr 11, 2020 · Updated 6 years ago
anklowait / python_for_CL
Python course materials for students of the "Computational Linguistics" continuing professional education program at HSE University (2020-2021)
☆12 · Feb 21, 2022 · Updated 4 years ago
vmkhlv / hse_compling_and_it
Materials for the course "Computational Linguistics and Information Technologies" for fourth-year undergraduates in the "Fundamental and Appl…
☆10 · Mar 25, 2021 · Updated 5 years ago
clab / cnn-v1
Legacy version of CNN neural net toolkit (now called dynet)
☆19 · Oct 8, 2016 · Updated 9 years ago
cisocrgroup / OCR-Workshop
Presentations, tutorials and data for the OCR workshop at LMU
☆16 · Jun 2, 2017 · Updated 9 years ago
LUMII-AILab / FullStack
Full Stack of Latvian Language Resources for Natural Language Understanding (NLU) and Generation (NLG)
☆16 · Oct 20, 2022 · Updated 3 years ago
OSU-slatelab / seq_tagger
Sequence Tagging with Cross-Lingual Transfer Learning
☆16 · Jul 30, 2017 · Updated 8 years ago
softcite / software-mentions
Softcite software mention recognizer, finding mentions and citations to software from within the academic literature
☆85 · Jun 6, 2026 · Updated last month
xavierfav / feature-comparison-clustering
Comparing Audio Features for Unsupervised Sound Classification
☆10 · Jun 22, 2022 · Updated 4 years ago
karlstratos / minitagger
☆21 · Apr 4, 2015 · Updated 11 years ago
veraPDF / veraPDF-parser
veraPDF PDF parser
☆36 · Updated this week
cproctor / qualitative-coding
Qualitative coding for computer scientists
☆27 · Jun 25, 2026 · Updated 3 weeks ago
UniversalDependencies / UD_Cantonese-HK
Spoken Cantonese from Hong Kong.
☆30 · May 6, 2026 · Updated 2 months ago
manymuch / Natural-Noise-Generator
☆10 · Aug 3, 2019 · Updated 6 years ago
rhasspy / wav2mel
Transform audio files into mel spectrograms for text-to-speech model training
☆12 · Aug 25, 2021 · Updated 4 years ago
ai-forever / combined_solution_aij2019
AI Journey 2019: Combined Solution
☆15 · Dec 8, 2022 · Updated 3 years ago
GPUPhobia / vocal-mask
☆12 · May 1, 2019 · Updated 7 years ago
david-gimeno / tailored-avsr
Official source code for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers"
☆15 · Feb 24, 2025 · Updated last year
czcorpus / InterText_editor
Editor for aligned parallel texts (personal desktop application).
☆20 · Jan 15, 2026 · Updated 6 months ago
muhdhuz / audio2spec
Scripts to convert audio files to spectrograms and back
☆12 · Nov 23, 2017 · Updated 8 years ago
zassou65535 / WaveGAN
Audio generator using WaveGAN
☆13 · Feb 9, 2024 · Updated 2 years ago
sarulab-speech / ml-audiocaps
Multi-lingual AudioCaps
☆14 · Nov 20, 2023 · Updated 2 years ago
Sreyan88 / LipGER
Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition
☆19 · Jul 16, 2024 · Updated 2 years ago
unreal79 / pic2wav
Encode an image to sound (WAV file) and view it as a spectrogram. Optimized Python 3 version.
☆11 · Jan 25, 2023 · Updated 3 years ago
d3n7 / riffusionPrepper
Prepare spectrograms from audio for training a Riffusion model
☆16 · Mar 6, 2023 · Updated 3 years ago