Correction of spaces with character-based neural language models.
☆13Aug 23, 2022Updated 3 years ago
Alternatives and similar repositories for tokenization-repair
Users that are interested in tokenization-repair are comparing it to the libraries listed below
- Repository for Findings of EMNLP 2020 "Context-aware Stand-alone Neural Spelling Correction"☆18Dec 21, 2020Updated 5 years ago
- OCR post processing and spelling correction.☆11Nov 12, 2018Updated 7 years ago
- Fast whitespace correction with Transformers☆17Aug 22, 2025Updated 6 months ago
- BERT-based GEC tagging for Japanese☆19Aug 4, 2023Updated 2 years ago
- Code and data for: Low Resource Grammatical Error Correction Using Wikipedia Edits (WNUT 2018)☆17Jul 16, 2024Updated last year
- Python 3 library for processing historical English☆68Aug 10, 2024Updated last year
- Make the bot work so you don't have to learn a language.☆10Jun 23, 2021Updated 4 years ago
- Writing Observer and Learning Observer: A system for monitoring learning process data, with an initial focus on writing process data from…☆12Feb 28, 2026Updated last week
- Generate clean Baidu heatmaps by combining them with screenshots☆17Jun 24, 2023Updated 2 years ago
- Convolutional Neural Network (CNN) was trained on 48x48 pixel grayscale images to predict 5 different emotions from images. Ten different…☆11Sep 21, 2022Updated 3 years ago
- German sentiment analysis☆13Mar 8, 2017Updated 9 years ago
- A telegram bot to track apartment offers based on some criteria☆10Aug 29, 2022Updated 3 years ago
- ☆10Jul 6, 2023Updated 2 years ago
- ☆10Mar 5, 2024Updated 2 years ago
- Given a text, wrap it into phrases and send them to Yandex's search engine. If it yields a "did you mean:", substitute the original phras…☆11Dec 13, 2018Updated 7 years ago
- Automatic Detection of Potentially Idiomatic Expressions☆12Feb 19, 2021Updated 5 years ago
- BlackArch configuration for the bash shell.☆13Jan 11, 2021Updated 5 years ago
- Presets, styles & icons for our various JOSM tools.☆14Jun 14, 2023Updated 2 years ago
- ☆12Sep 18, 2025Updated 5 months ago
- Simple tool to fetch the changelog of packages from the rpm repositories☆10Aug 30, 2024Updated last year
- ☆11Mar 31, 2023Updated 2 years ago
- Suffixes of German town and village names☆10May 4, 2020Updated 5 years ago
- Python test doubles library☆12Oct 11, 2024Updated last year
- Dotfiles and dotfile accessories☆18May 7, 2021Updated 4 years ago
- Generate letters (plain text or PDF) from templates.☆14Jan 8, 2023Updated 3 years ago
- ☆13Apr 24, 2023Updated 2 years ago
- ☆12Jun 29, 2025Updated 8 months ago
- Dewey Data Inc. Python API☆14Jul 2, 2025Updated 8 months ago
- Getting human perception scores from street-level imagery☆22Jul 17, 2024Updated last year
- Stronger Baselines for Grammatical Error Correction Using a Pretrained Encoder-Decoder Model.☆37Apr 6, 2023Updated 2 years ago
- Replication materials for "Identifying the Development and Application of Artificial Intelligence in Scientific Text"☆13Feb 18, 2020Updated 6 years ago
- Lossless normalization of uppercase characters☆11Jul 3, 2023Updated 2 years ago
- Concise Reasoning via Reinforcement Learning☆13Apr 16, 2025Updated 10 months ago
- A search engine implementation using OpenAI's clip model☆10Jun 20, 2021Updated 4 years ago
- ☆11Sep 8, 2017Updated 8 years ago
- ☆11Nov 14, 2021Updated 4 years ago
- A French LitBank corpus☆10Jan 22, 2026Updated last month
- Massive-STEPS: Massive Semantic Trajectories for Understanding POI Check-ins -- Dataset and Benchmarks☆17Feb 2, 2026Updated last month
- data & analyze data from Citi Bike's GBFS real-time data feed☆11Mar 26, 2024Updated last year