alvenirai / punctfix
☆22 Updated last year
Alternatives and similar repositories for punctfix
Users who are interested in punctfix are comparing it to the libraries listed below.
- ☆359 Updated last year
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode ☆111 Updated 3 years ago
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation. ☆108 Updated 3 months ago
- 📝 An easy-to-use package to restore punctuation of the text. ☆119 Updated 2 years ago
- A merged version of multiple open-source German speech datasets. ☆33 Updated last year
- Execute arbitrary SQL queries on 🤗 Datasets ☆32 Updated last year
- A model that predicts the punctuation of English, Italian, French and German texts. ☆80 Updated 2 years ago
- ☆307 Updated last year
- A python package for deep multilingual punctuation prediction. ☆130 Updated last year
- DaCy: The State of the Art Danish NLP pipeline using SpaCy ☆97 Updated 8 months ago
- Triton backend for https://github.com/OpenNMT/CTranslate2 ☆36 Updated 2 years ago
- This is a neural spelling checker ☆67 Updated 2 years ago
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023 ☆149 Updated 2 months ago
- A collection of scripts to preprocess ASR datasets and finetune language-specific Wav2Vec2 XLSR models ☆31 Updated 4 years ago
- Universal Romanizer that can convert any unicode script to roman (latin) script ☆219 Updated last year
- A PyTorch Lightning Callback for pushing models to the Hugging Face Hub 🤗⚡️ ☆36 Updated 3 years ago
- Segment an audio file and obtain utterance alignments. (Python package) ☆340 Updated last year
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models. ☆51 Updated last month
- Various speech datasets made available to the public ☆128 Updated 8 months ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆156 Updated last year
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2… ☆68 Updated 2 years ago
- Generalist and Lightweight Model for Text Classification ☆156 Updated 2 months ago
- Library for fast text representation and classification. ☆31 Updated last year
- ☆109 Updated 8 months ago
- A PyPI package for fast word/character error rate (WER/CER) calculation ☆72 Updated 2 years ago
- Simply, faster, sentence-transformers ☆143 Updated last year
- Asent is a python library for performing efficient and transparent sentiment analysis using spaCy. ☆118 Updated last year
- Suite for phonetic word embeddings, especially their evaluation and baseline models. ☆31 Updated 6 months ago
- ☆38 Updated 3 years ago
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ☆173 Updated last week