alvenirai / punctfix
☆22 · Updated last year
Alternatives and similar repositories for punctfix
Users who are interested in punctfix are comparing it to the libraries listed below:
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode ☆111 · Updated 2 years ago
- A collection of scripts to preprocess ASR datasets and finetune language-specific Wav2Vec2 XLSR models ☆31 · Updated 4 years ago
- Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax. ☆36 · Updated 2 years ago
- Transcribing audio files using Hugging Face's implementation of Wav2Vec2 + "chain-linking" NLP tasks to combine speech-to-text with downs… ☆32 · Updated 4 years ago
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation. ☆107 · Updated last month
- ☆359 · Updated last year
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆156 · Updated last year
- A spaCy custom component that extracts and normalizes temporal expressions ☆54 · Updated 2 years ago
- Execute arbitrary SQL queries on 🤗 Datasets ☆32 · Updated last year
- Asent is a python library for performing efficient and transparent sentiment analysis using spaCy. ☆118 · Updated last year
- 📝 An easy-to-use package to restore punctuation of the text. ☆116 · Updated 2 years ago
- A Python library aimed at dissecting and augmenting NER training data. ☆58 · Updated 2 years ago
- A PyTorch Lightning Callback for pushing models to the Hugging Face Hub 🤗⚡️ ☆36 · Updated 3 years ago
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. ☆79 · Updated last year
- Suite for phonetic word embeddings, especially their evaluation and baseline models. ☆30 · Updated 4 months ago
- DaCy: The State of the Art Danish NLP pipeline using SpaCy ☆96 · Updated 6 months ago
- ☆47 · Updated 2 years ago
- ☆38 · Updated 3 years ago
- Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model. ☆12 · Updated 2 years ago
- A tiny BERT for low-resource monolingual models ☆31 · Updated 9 months ago
- ☆76 · Updated 3 years ago
- ☆56 · Updated 2 years ago
- Audio feature extraction and baseline search implementation for the Spotify Podcast Dataset. ☆12 · Updated 3 years ago
- A merged version of multiple open-source German speech datasets. ☆31 · Updated last year
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 · Updated last year
- Using short models to classify long texts ☆21 · Updated 2 years ago
- This will hold the data pipeline to convert raw audio data to speech which will act as input dataset for speech-to-text pipeline ☆32 · Updated 2 years ago
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages. ☆82 · Updated 2 years ago
- Gamma Agreement in Python ☆44 · Updated last year
- Scripts to create speech corpora from open.bible ☆13 · Updated 3 years ago