ad-freiburg / whitespace-correction
Fast whitespace correction with Transformers
☆14Updated 6 months ago
Related projects
Alternatives and complementary repositories for whitespace-correction
- Vietnamese Punctuation Prediction using Pretrained Language Models☆13Updated 2 years ago
- A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation (INTERSPEECH 2022)☆20Updated 4 months ago
- Parallelized automatic corpus collection for ASR. Forked from https://github.com/EgorLakomkin/KTSpeechCrawler ☆23Updated 3 years ago
- This repository contains the implementation of the paper: "Span Classification with Structured Information for Disfluency Detection in Sp…☆12Updated last year
- Library for pruning experts per language pair in NLLB-200☆27Updated last year
- A tiny BERT for low-resource monolingual models☆29Updated last month
- Transformation of spoken text into written text☆28Updated 6 months ago
- This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to…☆42Updated 3 years ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling.☆27Updated 9 months ago
- Repository containing the open source code of works published at the FBK MT unit.☆42Updated 4 months ago
- End-to-End Vietnamese Speech Recognition using wav2vec 2.0☆93Updated 3 years ago
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback☆91Updated last year
- Correction of spaces with character-based neural language models.☆13Updated 2 years ago
- zero-vocab or low-vocab embeddings☆17Updated 2 years ago
- Repository for Findings of EMNLP 2020 "Context-aware Stand-alone Neural Spelling Correction"☆18Updated 3 years ago
- Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages☆13Updated 2 years ago
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models.☆48Updated 2 months ago
- Vi_G2P or ViG2P: a G2P package for Vietnamese, based on vPhon and phonology knowledge, to convert raw text (Graphoneme) to IPA☆71Updated 5 months ago
- Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model.☆12Updated last year
- Whisper finetuned on VinBigdata-VLSP2020-100h + KenLM☆33Updated last year
- one script for xls-r/xlsr/whisper fine-tuning☆39Updated last year
- Solution for Zalo AI Challenge 2022 - Lyrics Alignment☆67Updated last year
- asr2k☆48Updated 5 months ago
- Implementation of "SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages" paper, accepted to E…☆19Updated 2 years ago
- Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 models for speech recogni…☆24Updated 3 years ago
- Collection of scripts from mHuBERT-147.☆22Updated this week
- An easy-to-use library to linguistically compare one sentence and its words to another, in the same language or a different one. For inst…☆21Updated 2 years ago
- Showcasing various NLP Downstream tasks Training with pre-trained Language models using Pytorch Lightning☆12Updated 2 years ago