alvenirai / punctfix
☆22 Updated last year
Alternatives and similar repositories for punctfix:
Users who are interested in punctfix are comparing it to the libraries listed below.
- A PyTorch Lightning Callback for pushing models to the Hugging Face Hub ☆36 Updated 2 years ago
- DaCy: The State of the Art Danish NLP pipeline using SpaCy ☆95 Updated last month
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode ☆111 Updated 2 years ago
- Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model. ☆12 Updated 2 years ago
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages. ☆81 Updated last year
- ☆38 Updated 3 years ago
- A spaCy custom component that extracts and normalizes temporal expressions ☆54 Updated 2 years ago
- The Gridspace-Stanford Harper Valley speech dataset. Created in support of CS224S. ☆43 Updated 3 years ago
- A merged version of multiple open-source German speech datasets. ☆31 Updated 9 months ago
- spaCy-wrap is a wrapper library for spaCy for including fine-tuned transformers from Huggingface in your spaCy pipeline allowing you to i… ☆46 Updated 10 months ago
- A tiny BERT for low-resource monolingual models ☆31 Updated 4 months ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 Updated last year
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆151 Updated 8 months ago
- Various speech datasets made available to the public ☆113 Updated 2 months ago
- Speakerbox: Fine-tune Audio Transformers for speaker identification. ☆54 Updated 2 months ago
- Scripts for working with open.bible data ☆24 Updated 3 years ago
- Triton backend for https://github.com/OpenNMT/CTranslate2 ☆34 Updated last year
- A Python library aimed at dissecting and augmenting NER training data. ☆58 Updated last year
- Asent is a python library for performing efficient and transparent sentiment analysis using spaCy. ☆117 Updated 10 months ago
- A model that predicts the punctuation of English, Italian, French and German texts. ☆78 Updated last year
- A collection of scripts to preprocess ASR datasets and finetune language-specific Wav2Vec2 XLSR models ☆31 Updated 3 years ago
- Linguistic processing for Common Voice ☆53 Updated last year
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding ☆74 Updated 3 years ago
- MaSS - Multilingual corpus of Sentence-aligned Spoken utterances ☆49 Updated 5 months ago
- Whisper fine-tuning event script to use multiple hf datasets ☆32 Updated 2 years ago
- This repository contains a demonstrative implementation for pooling-based models, e.g., DeepPyramidion complementing our paper "Sparsifyi… ☆14 Updated 2 years ago
- An easy-to-use package to restore punctuation of the text. ☆112 Updated last year
- Use Hugging Face text and token classification pipelines directly in spaCy ☆63 Updated 11 months ago
- ☆43 Updated 2 years ago
- Generalist and Lightweight Model for Text Classification ☆79 Updated this week