ad-freiburg/tokenization-repair

Correction of spaces with character-based neural language models.

☆13

Alternatives and similar repositories for tokenization-repair

Users who are interested in tokenization-repair are comparing it to the repositories listed below.

jacklxc / StandAloneSpellingCorrection
Repository for Findings of EMNLP 2020 "Context-aware Stand-alone Neural Spelling Correction"
☆18 · Dec 21, 2020 · Updated 5 years ago
ad-freiburg / whitespace-correction
Fast whitespace correction with Transformers
☆18 · Aug 22, 2025 · Updated 11 months ago
mpsilfve / ocrpp
OCR post processing and spelling correction.
☆11 · Nov 12, 2018 · Updated 7 years ago
KIT-IAI / Transformer-Networks-for-Electrical-Load-Time-Series-Forecasting
☆15 · Sep 19, 2023 · Updated 2 years ago
laituan245 / EL-Dockers
☆25 · Jul 15, 2023 · Updated 3 years ago
adrianeboyd / boyd-wnut2018
Code and data for: Low Resource Grammatical Error Correction Using Wikipedia Edits (WNUT 2018)
☆17 · Jul 16, 2024 · Updated 2 years ago
lancopku / Augmented_Data_for_FST
The augmented data of the paper "Parallel Data Augmentation for Formality Style Transfer" (ACL 2020).
☆12 · May 14, 2020 · Updated 6 years ago
Abhay64 / Bollinger-Bands
Bollinger Bands shows the levels of different highs and lows that a security price has reached in a particular duration.
☆10 · Apr 18, 2018 · Updated 8 years ago
jonnyli1125 / gector-ja
BERT-based GEC tagging for Japanese
☆19 · Aug 4, 2023 · Updated 2 years ago
trungfinity / visc
Vietnamese spelling correction (ViSC) tool
☆12 · Dec 11, 2016 · Updated 9 years ago
mdrakiburrahman / azure-databricks-malware-prediction
End-to-end Machine Learning Pipeline demo using Delta Lake, MLflow and AzureML in Azure Databricks
☆18 · Nov 9, 2019 · Updated 6 years ago
cnap / smt-for-gec
☆12 · Sep 8, 2017 · Updated 8 years ago
tomo-wb / Lang8-NAIST-extractor
☆30 · May 8, 2020 · Updated 6 years ago
logeekal / 30-days-of-Vanilla-JS
This repository has 30 mini project ideas (approx. 2 hours each) that I will be coding every day.
☆17 · Nov 6, 2019 · Updated 6 years ago
Katsumata420 / generic-pretrained-GEC
Stronger Baselines for Grammatical Error Correction Using a Pretrained Encoder-Decoder Model.
☆37 · Apr 6, 2023 · Updated 3 years ago
vovanphuc / SimeCSE_Vietnamese
SimeCSE_Vietnamese: Simple Contrastive Learning of Sentence Embeddings with Vietnamese
☆20 · May 28, 2021 · Updated 5 years ago
UniversalDependencies / UD_Vietnamese-VTB
☆38 · May 6, 2026 · Updated 2 months ago
georgetown-cset / ai-relevant-papers
Replication materials for "Identifying the Development and Application of Artificial Intelligence in Scientific Text"
☆14 · Feb 18, 2020 · Updated 6 years ago
mhmzdev / BFS-Romania-Map-Problem
BFS Implementation of Romania Map Problem in Python
☆12 · Nov 9, 2020 · Updated 5 years ago
mikahama / natas
Python 3 library for processing historical English
☆68 · Aug 10, 2024 · Updated last year
KIT-IAI / pyWATTS
pyWATTS: Python Workflow Automation Tool for Time-Series
☆40 · Jun 22, 2024 · Updated 2 years ago
shamilcm / m2scorer
Scorer for grammatical error correction systems.
☆14 · Feb 24, 2016 · Updated 10 years ago
lt3 / nfr
Neural Fuzzy Repair (NFR) is a data augmentation pipeline, which integrates fuzzy matches (i.e. similar translations) into neural machine…
☆12 · Aug 14, 2024 · Updated last year
hweidner / galera-docker
A Dockerfile for MariaDB Galera cluster
☆20 · Oct 17, 2021 · Updated 4 years ago
dice-group / KG-NMT
Knowledge Graph-augmented NMT
☆11 · Sep 20, 2021 · Updated 4 years ago
wabyking / word2fun
☆11 · May 9, 2022 · Updated 4 years ago
hslh / pie-detection
Automatic Detection of Potentially Idiomatic Expressions
☆12 · Feb 19, 2021 · Updated 5 years ago
adesgautam / clip-search
A search engine implementation using OpenAI's CLIP model
☆10 · Jun 20, 2021 · Updated 5 years ago
RUCAIBox / MPOP
☆13 · Jun 16, 2021 · Updated 5 years ago
amunategui / Read-and-Process-Files-Larger-Than-RAM
Using the function read.table() to break file into chunks to loop and process them. This allows processing files of any size beyond what …
☆10 · Aug 19, 2014 · Updated 11 years ago
chiragjn / short-text-similarity
Short Text Similarity as described in https://dl.acm.org/citation.cfm?id=2806475
☆17 · Feb 7, 2019 · Updated 7 years ago
syncdoth / Chain-of-Hindsight-PyTorch
Unofficial implementation of Chain of Hindsight (https://arxiv.org/abs/2302.02676) using PyTorch and Hugging Face Trainers.
☆11 · Apr 5, 2023 · Updated 3 years ago
Aolin-MIR / soft-masked-bert-for-spelling-error-correction
A third-party implementation of paper《Spelling Error Correction with Soft-Masked BERT》using tensorflow==1.12.0
☆22 · Nov 27, 2020 · Updated 5 years ago
FrankGrimm / node-germansentiment
German sentiment analysis
☆13 · Mar 8, 2017 · Updated 9 years ago
chemicaltree / tetra
☆10 · Sep 14, 2022 · Updated 3 years ago
maziarraissi / Introduction-to-Machine-Learning-in-R
Introduction to Machine Learning in R
☆26 · May 7, 2021 · Updated 5 years ago
rgcottrell / pytorch-human-performance-gec
A PyTorch implementation of "Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study"
☆50 · Dec 17, 2018 · Updated 7 years ago
shakeel608 / OpenNMT-py-with-BERT
OpenNMT PyTorch with BERT Embeddings
☆24 · Sep 23, 2019 · Updated 6 years ago
Eajack / NLP-ML_CS-Cpp_Review
A collection of links to NLP/ML interview resources (mainly gathered from GitHub)
☆11 · Mar 3, 2020 · Updated 6 years ago