gsarti / it5
Materials for "IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation" 🇮🇹
☆30 Updated 11 months ago
Alternatives and similar repositories for it5
Users who are interested in it5 are comparing it to the libraries listed below
- A Python library aimed at dissecting and augmenting NER training data. ☆58 Updated 2 years ago
- Explainable Zero-Shot Topic Extraction ☆62 Updated 8 months ago
- A library to synthesize text datasets using Large Language Models (LLM) ☆152 Updated 2 years ago
- Code and data from the paper BERT Got a Date: Introducing Transformers to Temporal Tagging ☆66 Updated 3 years ago
- A monolingual and cross-lingual meta-embedding generation and evaluation framework ☆80 Updated 3 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any Hugging Face text dataset. ☆93 Updated 2 years ago
- A spaCy custom component that extracts and normalizes temporal expressions ☆54 Updated 2 years ago
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP ☆58 Updated 2 years ago
- German small and large versions of GPT2. ☆20 Updated 3 years ago
- Semantically Structured Sentence Embeddings ☆66 Updated 7 months ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆154 Updated 11 months ago
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP. ☆87 Updated last month
- ☆43 Updated 2 years ago
- Alternate Implementation for Zero Shot Text Classification: Instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models… ☆36 Updated 3 years ago
- MAFAND-MT ☆55 Updated 10 months ago
- This repository contains materials for the SIGIR 2022 tutorial on opinion summarization. ☆34 Updated 2 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 Updated last year
- TimeLMs: Diachronic Language Models from Twitter ☆107 Updated last year
- Examples for aligning, padding and batching sequence labeling data (NER) for use with pre-trained transformer models ☆65 Updated 2 years ago
- Load What You Need: Smaller Multilingual Transformers for Pytorch and TensorFlow 2.0. ☆102 Updated 2 years ago
- Python-based implementation of the Translate-Align-Retrieve method to automatically translate the SQuAD Dataset to Spanish. ☆59 Updated 2 years ago
- A collection of scripts to preprocess ASR datasets and finetune language-specific Wav2Vec2 XLSR models ☆31 Updated 4 years ago
- A python package for benchmarking interpretability techniques on Transformers. ☆211 Updated 7 months ago
- Code associated with the paper "Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists" ☆48 Updated 2 years ago
- ☆35 Updated 3 years ago
- Accelerated NLP pipelines for fast inference on CPU. Built with Transformers and ONNX runtime. ☆126 Updated 4 years ago
- Shared code for training sentence embeddings with Flax / JAX ☆27 Updated 3 years ago
- 🛠️ Tools for Transformers compression using PyTorch Lightning ⚡ ☆83 Updated 6 months ago
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio… ☆44 Updated last year
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆80 Updated 10 months ago