stefan-it / italian-bertelectra
🇮🇹 Italian BERT and ELECTRA models (incl. evaluation)
☆18 Updated 3 years ago
Alternatives and similar repositories for italian-bertelectra
Users who are interested in italian-bertelectra compare it to the repositories listed below:
- Materials for "IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation" 🇮🇹 ☆30 Updated last year
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP ☆57 Updated 3 years ago
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. ☆85 Updated last year
- Python-based implementation of the Translate-Align-Retrieve method to automatically translate the SQuAD Dataset to Spanish. ☆59 Updated 2 years ago
- A python package to run inference with HuggingFace language and vision-language checkpoints wrapping many convenient features. ☆28 Updated last year
- Repository for XLM-T, a framework for evaluating multilingual language models on Twitter data ☆158 Updated 2 years ago
- A tiny BERT for low-resource monolingual models ☆31 Updated last month
- A large scale dataset for Question Answering in Italian ☆27 Updated 7 years ago
- UmBERTo: an Italian Language Model trained with Whole Word Masking. ☆110 Updated 2 years ago
- TimeLMs: Diachronic Language Models from Twitter ☆111 Updated last year
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode ☆111 Updated 3 years ago
- Camoscio: An Italian instruction-tuned language model based on LLaMA ☆127 Updated last year
- LTG-Bert ☆34 Updated last year
- Execute arbitrary SQL queries on 🤗 Datasets ☆32 Updated last year
- A library to synthesize text datasets using Large Language Models (LLM) ☆151 Updated 2 years ago
- A collection of scripts to preprocess ASR datasets and finetune language-specific Wav2Vec2 XLSR models ☆31 Updated 4 years ago
- German small and large versions of GPT2. ☆20 Updated 3 years ago
- Minimum Bayes Risk Decoding for Hugging Face Transformers ☆60 Updated last year
- A survey of corpora for Germanic low-resource languages and dialects ☆26 Updated 11 months ago
- Main repository for "CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters" ☆200 Updated 2 years ago
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆41 Updated 3 years ago
- zero shot NER fine tuning ☆13 Updated 8 months ago
- DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models ☆154 Updated 2 years ago
- This is a neural spelling checker ☆68 Updated 2 years ago
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages. ☆34 Updated 8 months ago
- ☆115 Updated last month
- Some notebooks for NLP ☆207 Updated 2 years ago
- A monolingual and cross-lingual meta-embedding generation and evaluation framework ☆79 Updated 3 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset. ☆96 Updated 2 years ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021) ☆48 Updated 4 years ago