gsarti / it5
Materials for "IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation" 🇮🇹
★30 · Updated 5 months ago
Related projects
Alternatives and complementary repositories for it5
- Code and data from the paper "BERT Got a Date: Introducing Transformers to Temporal Tagging" ★65 · Updated 2 years ago
- A library to synthesize text datasets using Large Language Models (LLM) ★151 · Updated last year
- ★35 · Updated 2 years ago
- A French sequence-to-sequence pretrained model ★57 · Updated 2 years ago
- A Python package to run inference with HuggingFace language and vision-language checkpoints, wrapping many convenient features. ★25 · Updated 2 months ago
- A monolingual and cross-lingual meta-embedding generation and evaluation framework ★80 · Updated 2 years ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ★151 · Updated 5 months ago
- This repository contains a demonstrative implementation for pooling-based models, e.g., DeepPyramidion complementing our paper "Sparsifyi… ★14 · Updated 2 years ago
- A Python library aimed at dissecting and augmenting NER training data. ★56 · Updated last year
- Efficiently find the best-suited language model (LM) for your NLP task ★91 · Updated this week
- Explainable Zero-Shot Topic Extraction ★61 · Updated 3 months ago
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP ★58 · Updated 2 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data; it should also work with any Hugging Face text dataset. ★92 · Updated last year
- A spaCy custom component that extracts and normalizes temporal expressions ★52 · Updated last year
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2… ★66 · Updated last year
- Code associated with the paper "Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists" ★46 · Updated 2 years ago
- ★11 · Updated 3 years ago
- A Python package for benchmarking interpretability techniques on Transformers. ★212 · Updated last month
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP. ★85 · Updated last month
- German small and large versions of GPT2. ★20 · Updated 2 years ago
- Load What You Need: Smaller Multilingual Transformers for PyTorch and TensorFlow 2.0. ★101 · Updated 2 years ago
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. ★75 · Updated 2 months ago
- ★21 · Updated 4 months ago
- Multi-task modelling extensions for huggingface transformers ★18 · Updated last year
- An extension package of 🤗 Datasets that provides support for executing arbitrary SQL queries on HF datasets ★31 · Updated 9 months ago
- Python-based implementation of the Translate-Align-Retrieve method to automatically translate the SQuAD Dataset to Spanish. ★59 · Updated last year
- Minimum Bayes Risk Decoding for Hugging Face Transformers ★56 · Updated 5 months ago
- Alternate Implementation for Zero Shot Text Classification: Instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models… ★37 · Updated 2 years ago
- Scripts to convert datasets from various sources to Hugging Face Datasets. ★57 · Updated 2 years ago
- TimeLMs: Diachronic Language Models from Twitter ★102 · Updated 8 months ago