coastalcph / histnorm
Compiled tools, datasets, and other resources for historical text normalization.
☆20 · Jun 18, 2019 · Updated 6 years ago
Alternatives and similar repositories for histnorm
Users who are interested in histnorm are comparing it to the libraries listed below.
- Digital humanities around graph technologies · ☆10 · Updated this week
- The website of the Oscar Project · ☆11 · Mar 27, 2025 · Updated 10 months ago
- A fully-fledged PyTorch package for Morphological Analysis, tailored to morphologically rich and historical languages. · ☆24 · Oct 27, 2023 · Updated 2 years ago
- Data for the HIPE 2022 shared task. · ☆21 · Nov 29, 2023 · Updated 2 years ago
- CERberus -- guardian against character errors · ☆29 · Feb 15, 2024 · Updated last year
- Finite-state script normalization and processing utilities · ☆46 · Jan 14, 2026 · Updated last month
- ☆32 · Sep 27, 2021 · Updated 4 years ago
- Libraries, Archives and Museums (LAM) · ☆88 · Oct 4, 2022 · Updated 3 years ago
- A parallel evaluation data set of SAP software documentation with document structure annotation · ☆14 · Jul 30, 2025 · Updated 6 months ago
- ☆10 · Feb 2, 2021 · Updated 5 years ago
- [ACL'20] Highway Transformer: A Gated Transformer. · ☆33 · Dec 5, 2021 · Updated 4 years ago
- PowerShell scripts for processing content into CONTENTdm load packages, batch editing, and batch re-OCR. · ☆11 · Jun 2, 2023 · Updated 2 years ago
- Linguistic Reconstruction with LingPy · ☆15 · Aug 5, 2024 · Updated last year
- Simple CORPORA list crawler · ☆10 · Dec 2, 2016 · Updated 9 years ago
- Identifying Nuances in Fake News vs. Satire: Using Semantic and Linguistic Cues (NLP4IF, EMNLP-IJCNLP 2019) · ☆11 · Dec 21, 2020 · Updated 5 years ago
- ☆10 · Oct 2, 2024 · Updated last year
- Linear Attention for Efficient Bidirectional Sequence Modeling · ☆15 · May 13, 2025 · Updated 9 months ago
- Creating crowdsourcing based experiments made easy · ☆10 · May 25, 2020 · Updated 5 years ago
- Modified version of fairseq, including new implementations for criterions using reinforcement learning methods. · ☆11 · Aug 14, 2019 · Updated 6 years ago
- MATLAB code for Stein Point Markov Chain Monte Carlo. · ☆13 · Jul 3, 2019 · Updated 6 years ago
- decontamination · ☆24 · Dec 3, 2025 · Updated 2 months ago
- HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation [ACL 2023] · ☆14 · Jul 11, 2023 · Updated 2 years ago
- ☆10 · Sep 13, 2022 · Updated 3 years ago
- KnowMAN: Weakly Supervised Multinomial Adversarial Networks · ☆12 · Nov 9, 2021 · Updated 4 years ago
- mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models · ☆11 · Jan 19, 2024 · Updated 2 years ago
- Latin texts annotated for named entities and NER tagger used for the Herodotos Project (Ohio State University / Ghent University) · ☆11 · Sep 26, 2022 · Updated 3 years ago
- Wikipedia Citations in Wikidata · ☆10 · May 6, 2021 · Updated 4 years ago
- Extension for pie to include taggers with their models and pre/postprocessors · ☆11 · May 30, 2024 · Updated last year
- 0-Shot Tokenizer Transplant · ☆14 · May 16, 2025 · Updated 8 months ago
- Data and code: "Answering legal questions from laymen in German civil law system", Büttner & Habernal, EACL'24 · ☆13 · Mar 2, 2024 · Updated last year
- Label shift estimation for transfer difficulty with Familiarity. · ☆10 · Feb 4, 2025 · Updated last year
- This repository provides the source code used to automatically generate the book summarization datasets described in the paper titled "Ec… · ☆11 · Apr 14, 2025 · Updated 10 months ago
- Poetry Corpora Annotated on Aesthetic Emotions · ☆12 · Aug 2, 2022 · Updated 3 years ago
- Collection of description of concepts, procedures, and simple XSLT files for text processing, e.g. simplify InDesign documents (.idml) to… · ☆12 · Jan 9, 2020 · Updated 6 years ago
- Codebase accompanying the paper 'Widening the Representation Bottleneck in Neural Machine Translation with Lexical Shortcuts', (Emelin, D… · ☆11 · Feb 14, 2023 · Updated 3 years ago
- Word embeddings from PPMI-weighted and Dirichlet-smoothed co-occurrence matrices · ☆10 · Aug 3, 2020 · Updated 5 years ago
- ☆45 · Sep 26, 2021 · Updated 4 years ago
- Python tools for performing various operations on ALTO XML files · ☆48 · Feb 27, 2025 · Updated 11 months ago
- ☆52 · Jun 6, 2023 · Updated 2 years ago