alvenirai / punctfix
☆23Updated last year
Alternatives and similar repositories for punctfix
Users who are interested in punctfix are comparing it to the libraries listed below
- ☆358Updated last year
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode☆111Updated 3 years ago
- 📝An easy-to-use package to restore punctuation of the text.☆119Updated 2 years ago
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation.☆111Updated 5 months ago
- ☆48Updated 2 years ago
- A merged version of multiple open-source German speech datasets.☆33Updated last year
- Triton backend for https://github.com/OpenNMT/CTranslate2☆36Updated 2 years ago
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages.☆32Updated 7 months ago
- ☆39Updated 3 years ago
- ☆311Updated last year
- DaCy: The State of the Art Danish NLP pipeline using SpaCy☆98Updated 10 months ago
- This is a neural spelling checker☆67Updated 2 years ago
- ☆44Updated 2 years ago
- A collection of scripts to preprocess ASR datasets and finetune language-specific Wav2Vec2 XLSR models☆31Updated 4 years ago
- Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model.☆12Updated 2 years ago
- A PyTorch Lightning Callback for pushing models to the Hugging Face Hub 🤗⚡️☆35Updated 3 years ago
- Augmenty is an augmentation library based on spaCy for augmenting texts.☆156Updated last year
- Universal Romanizer that can convert any unicode script to roman (latin) script☆228Updated last year
- A python package for deep multilingual punctuation prediction.☆136Updated last year
- Various speech datasets made available to the public☆130Updated 10 months ago
- Execute arbitrary SQL queries on 🤗 Datasets☆32Updated last year
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages.☆81Updated 2 years ago
- scripts for working with open.bible data☆25Updated 3 years ago
- Speakerbox: Fine-tune Audio Transformers for speaker identification.☆59Updated 11 months ago
- Text utilities, including beam search decoding, tokenizing, and more, built for use in Flashlight.☆74Updated last week
- Linguistic processing for Common Voice☆58Updated last year
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy.☆80Updated 2 years ago
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p!☆179Updated this week
- An open-source Python package for Danish speech recognition☆34Updated 2 years ago
- C++ inference engine for running GLiNER (Generalist and Lightweight Named Entity Recognition) models☆40Updated 10 months ago