edupoux / MVA_2022_SL
☆7 Updated 2 years ago
Alternatives and similar repositories for MVA_2022_SL
Users who are interested in MVA_2022_SL are comparing it to the libraries listed below.
- Repository contains code to fine-tune WhisperASR model ☆23 Updated 2 years ago
- ☆359 Updated last year
- A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB te… ☆280 Updated 5 months ago
- German Alpaca Dataset (Cleaned + Translated) ☆25 Updated 2 years ago
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023 ☆138 Updated 3 weeks ago
- Fast, Modern, and Low Precision PyTorch Optimizers ☆94 Updated this week
- ☆104 Updated last month
- Library for pruning experts per language pair in NLLB-200 ☆33 Updated last year
- MAFAND-MT ☆56 Updated 11 months ago
- A french sequence to sequence pretrained model ☆61 Updated 2 years ago
- ☆296 Updated last year
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o… ☆137 Updated last month
- LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation ☆110 Updated last month
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. ☆82 Updated 9 months ago
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets. ☆158 Updated last year
- [EMNLP'23] Official Code for "FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models" ☆32 Updated 3 weeks ago
- ☆124 Updated 8 months ago
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode ☆111 Updated 2 years ago
- ☆99 Updated 6 months ago
- A repository containing the code for translating popular LLM benchmarks to German. ☆25 Updated last year
- HF's ML for Audio study group ☆192 Updated 2 years ago
- Triton backend for https://github.com/OpenNMT/CTranslate2 ☆35 Updated last year
- A simple semi-supervised approach for creating huggingface data script loaders and upload to the hub. ☆11 Updated last year
- Experiments with generating opensource language model assistants ☆97 Updated 2 years ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆101 Updated last year
- Universal Romanizer that can convert any unicode script to roman (latin) script ☆210 Updated 11 months ago
- Various speech datasets made available to the public ☆122 Updated 6 months ago
- A PyTorch Lightning Callback for pushing models to the Hugging Face Hub 🤗⚡️ ☆36 Updated 3 years ago
- Finetune VITS and MMS using HuggingFace's tools ☆156 Updated last year
- Pipeline for pulling and processing online language model pretraining data from the web ☆178 Updated last year