alvenirai / punctfix
☆23Updated last year
Alternatives and similar repositories for punctfix
Users who are interested in punctfix are comparing it to the libraries listed below.
- ☆359Updated last year
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation.☆108Updated 3 months ago
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode☆111Updated 3 years ago
- Execute arbitrary SQL queries on 🤗 Datasets☆32Updated last year
- 📝An easy-to-use package to restore punctuation of the text.☆118Updated 2 years ago
- Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model.☆12Updated 2 years ago
- This is a neural spelling checker☆67Updated 2 years ago
- A model that predicts the punctuation of English, Italian, French and German texts.☆79Updated 2 years ago
- Universal Romanizer that can convert any unicode script to roman (latin) script☆225Updated last year
- Triton backend for https://github.com/OpenNMT/CTranslate2☆35Updated 2 years ago
- ☆39Updated 3 years ago
- ☆47Updated 2 years ago
- A PyTorch Lightning Callback for pushing models to the Hugging Face Hub 🤗⚡️☆35Updated 3 years ago
- Augmenty is an augmentation library based on spaCy for augmenting texts.☆156Updated last year
- Confection: the sweetest config system for Python☆190Updated 5 months ago
- A python package for finding words that sound like other words. Useful for entity resolution and poetry, among other things.☆14Updated 2 years ago
- A merged version of multiple open-source German speech datasets.☆33Updated last year
- DaCy: The State of the Art Danish NLP pipeline using SpaCy☆98Updated 9 months ago
- Suite for phonetic word embeddings, especially their evaluation and baseline models.☆32Updated 6 months ago
- A tiny BERT for low-resource monolingual models☆31Updated 11 months ago
- Speakerbox: Fine-tune Audio Transformers for speaker identification.☆59Updated 9 months ago
- Text utilities, including beam search decoding, tokenizing, and more, built for use in Flashlight.☆73Updated 2 weeks ago
- ☆308Updated last year
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models.☆51Updated 2 months ago
- A python package for deep multilingual punctuation prediction.☆131Updated last year
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2…☆68Updated 2 years ago
- Bicleaner fork that uses neural networks☆40Updated 3 months ago
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p!☆176Updated this week
- This will hold the data pipeline to convert raw audio data to speech which will act as input dataset for speech-to-text pipeline☆32Updated 2 years ago
- Scripts for working with open.bible data☆25Updated 3 years ago