A lexical normalizer for historical spelling variants using a transformer architecture.
☆10Mar 12, 2025Updated last year
Alternatives and similar repositories for transnormer
Users that are interested in transnormer are comparing it to the libraries listed below.
- SFST/SMOR/DWDS-based German Morphology☆21Updated this week
- Multi Tier Annotation Search☆12May 13, 2024Updated last year
- FairCopy is a word processor for the humanities scholar.☆13Jan 26, 2026Updated last month
- This repository contains the implementation of DANIEL. (A fast Document Attention Network for Information Extraction and Labeling of handw…☆21Jan 12, 2026Updated 2 months ago
- Transkriptionen von Fibeln (19. Jahrhundert)☆11Oct 31, 2025Updated 4 months ago
- 🇩🇪 Preprocess German texts to do some serious natural-language processing.☆12Dec 9, 2022Updated 3 years ago
- Web Content Extraction Benchmark☆22Dec 16, 2025Updated 3 months ago
- Check your modified Ground Truth files with visual support!☆10Jan 31, 2024Updated 2 years ago
- An extensive Python library for dealing with FoLiA (Format for Linguistic Annotation) documents, a rich XML-based format for linguistic a…☆18Nov 18, 2024Updated last year
- NLP-helper for OCR-ed pages in PAGE XML format☆10Dec 6, 2024Updated last year
- DM is an environment for the study and annotation of images and texts. It is a suite of tools, enabling scholars to gather and organize t…☆19Dec 10, 2018Updated 7 years ago
- This repository provides German documentation relating to the text recognition and transcription platform eScriptorium. The documentation…☆14Dec 6, 2025Updated 3 months ago
- Reichsanzeiger-NLP: NER/NEL corpus for the German historical newspaper "Deutscher Reichsanzeiger und Preußischer Staatsanzeiger" (1819–19…☆16Oct 18, 2024Updated last year
- Training data from "Hauptphase I" of project "Digitalisierung historischer deutscher Zeitungen"☆12Dec 17, 2021Updated 4 years ago
- SapiMouse - a new dataset for Mouse Dynamics☆21Dec 28, 2022Updated 3 years ago
- CERberus -- guardian against character errors☆29Feb 15, 2024Updated 2 years ago
- A documentation for FAIR GPT, a virtual RDM consultant☆15Oct 10, 2024Updated last year
- QualiAnon is a tool to support the anonymization of text data. It is developed by the Qualiservice research data center for the anonymiza…☆35Feb 16, 2026Updated last month
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.☆29Jan 19, 2025Updated last year
- shoco is a compressor for small text strings. [Not maintained].☆10Sep 4, 2019Updated 6 years ago
- Mechanistic understanding and validation of large AI models with SemanticLens☆51Dec 4, 2025Updated 3 months ago
- Harfbuzz bindings for Lua☆12Dec 9, 2025Updated 3 months ago
- Lua implementation of the Unicode Bidirectional Algorithm☆10Jul 27, 2017Updated 8 years ago
- Basic HTR concepts/modules to boost performance☆39Nov 30, 2024Updated last year
- A webapp for labour-time calculation.☆49Mar 16, 2026Updated last week
- Bits and pieces for the carpentries workshops☆17Feb 23, 2026Updated last month
- A curated list of awesome RDM resources for researchers and organisations☆30Mar 2, 2026Updated 3 weeks ago
- Code repository for the paper "Mission: Impossible Language Models."☆56Sep 25, 2025Updated 5 months ago
- Precise hotword listener on Tract and Rust☆12Aug 6, 2022Updated 3 years ago
- Static Huffman coding☆10Apr 3, 2017Updated 8 years ago
- Experimental text shaping in LuaTeX using Harfbuzz library☆10Jul 17, 2018Updated 7 years ago
- Prefetch sources from github for nix build tool☆84Jan 28, 2026Updated last month
- Translation of query languages to serialized KoralQuery protocol☆14Mar 9, 2026Updated 2 weeks ago
- A tiny graph database engine written in C☆10May 9, 2014Updated 11 years ago
- Reddit search tool using the pushshift.io API☆14Sep 17, 2024Updated last year
- Benchmark scripts for comparing different tokenizers and sentence segmenters of German☆12Feb 27, 2023Updated 3 years ago
- Distributed KV store using go-ds-crdt and libp2p☆12Nov 28, 2021Updated 4 years ago