gsarti / it5
Materials for "IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation" 🇮🇹
★30 · Updated 4 months ago
Related projects
Alternatives and complementary repositories for it5
- ★35 · Updated 2 years ago
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP. ★85 · Updated last month
- 🛠️ Tools for Transformers compression using PyTorch Lightning ⚡ ★79 · Updated this week
- A Python package to run inference with HuggingFace language and vision-language checkpoints, wrapping many convenient features. ★25 · Updated last month
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data; it should also work with any Hugging Face text dataset. ★92 · Updated last year
- 🤗 Disaggregators: Curated data labelers for in-depth analysis. ★65 · Updated last year
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP ★58 · Updated 2 years ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ★151 · Updated 5 months ago
- Explainable Zero-Shot Topic Extraction ★61 · Updated 2 months ago
- German small and large versions of GPT2. ★20 · Updated 2 years ago
- Bi-encoder entity linking architecture ★42 · Updated 2 months ago
- Google's BigBird (Jax/Flax & PyTorch) @ 🤗 Transformers ★47 · Updated last year
- A collection of scripts to preprocess ASR datasets and finetune language-specific Wav2Vec2 XLSR models ★31 · Updated 3 years ago
- A Python library aimed at dissecting and augmenting NER training data. ★56 · Updated last year
- A French sequence-to-sequence pretrained model ★57 · Updated 2 years ago
- ★22 · Updated 2 years ago
- A library to synthesize text datasets using Large Language Models (LLMs) ★151 · Updated last year
- Code associated with the paper "Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists" ★46 · Updated 2 years ago
- A spaCy custom component that extracts and normalizes temporal expressions ★52 · Updated last year
- Code and data from the paper "BERT Got a Date: Introducing Transformers to Temporal Tagging" ★65 · Updated 2 years ago
- Annotated corpus + evaluation metrics for text anonymisation ★50 · Updated 9 months ago
- A monolingual and cross-lingual meta-embedding generation and evaluation framework ★80 · Updated 2 years ago
- Knowledge pills on Neural Search ★25 · Updated last year
- ★15 · Updated 3 years ago
- [EMNLP-Findings 2020] Adapting BERT for Word Sense Disambiguation with Gloss Selection Objective and Example Sentences ★62 · Updated 5 months ago
- Semantically Structured Sentence Embeddings ★67 · Updated 3 weeks ago
- Accelerated NLP pipelines for fast inference on CPU. Built with Transformers and ONNX runtime. ★126 · Updated 3 years ago
- [EMNLP'23] Official Code for "FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models" ★28 · Updated 3 weeks ago
- A Python package for benchmarking interpretability techniques on Transformers. ★211 · Updated last month
- ★37 · Updated 10 months ago