alvenirai / punctfix
☆20 Updated 7 months ago
Related projects:
- A spaCy custom component that extracts and normalizes temporal expressions ☆53 Updated last year
- AfroLID, a powerful neural toolkit for African language identification which covers 517 African languages. ☆27 Updated last year
- spaCy-wrap is a wrapper library for spaCy for including fine-tuned transformers from Huggingface in your spaCy pipeline allowing you to i… ☆46 Updated 5 months ago
- A collection of scripts to preprocess ASR datasets and finetune language-specific Wav2Vec2 XLSR models ☆31 Updated 3 years ago
- An extension package of 🤗 Datasets that provides support for executing arbitrary SQL queries on HF datasets ☆31 Updated 7 months ago
- This repository contains a demonstrative implementation for pooling-based models, e.g., DeepPyramidion complementing our paper "Sparsifyi… ☆14 Updated 2 years ago
- A Python library aimed at dissecting and augmenting NER training data. ☆56 Updated last year
- German small and large versions of GPT2. ☆19 Updated 2 years ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆149 Updated 3 months ago
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode ☆109 Updated 2 years ago
- GlotLID: Language Identification with Support for More Than 2000 Labels -- EMNLP 2023 ☆85 Updated 2 months ago
- A merged version of multiple open-source German speech datasets. ☆30 Updated 4 months ago
- DaCy: The State of the Art Danish NLP pipeline using spaCy ☆91 Updated 7 months ago
- ☆41 Updated last year
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2… ☆65 Updated last year
- Parse and convert numbers written in French, English or Spanish into their digit representation. ☆100 Updated last month
- A Streamlit component for annotating text by selecting it. ☆39 Updated 3 months ago
- 💫 spaCy wrapper for ConceptNet 💫 ☆88 Updated last year
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. ☆73 Updated last week
- A tiny BERT for low-resource monolingual models ☆28 Updated 4 months ago
- Efficiently find the best-suited language model (LM) for your NLP task ☆12 Updated this week
- spaCy match and replace, maintaining conjugation ☆34 Updated last year
- A Python package for deep multilingual punctuation prediction. ☆87 Updated 3 weeks ago
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP. ☆85 Updated 2 months ago
- Gamma Agreement in Python ☆43 Updated 6 months ago
- This is a neural spell checker ☆59 Updated last year
- NTREX -- News Test References for MT Evaluation ☆73 Updated 3 months ago
- Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax. ☆34 Updated last year
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 Updated 6 months ago
- Tool to fix bitexts and tag near-duplicates for removal ☆29 Updated last month