teelinsan / camoscio
Camoscio: An Italian instruction-tuned language model based on LLaMA
☆126Updated 11 months ago
Related projects
Alternatives and complementary repositories for camoscio
- Get ready to meet Fauno - the Italian language model crafted by the RSTLess Research Group from the Sapienza University of Rome.☆79Updated last year
- ☆37Updated 10 months ago
- The home of Stambecco 🦌: Italian Instruction-following LLaMA Model☆20Updated last year
- Materials for "IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation" 🇮🇹☆30Updated 4 months ago
- 🦄 Unitxt: a python library for getting data fired up and set for training and evaluation☆159Updated this week
- The 🌟ANITA project🌟 *(Advanced Natural-based interaction for the ITAlian language)* wants to provide Italian NLP researchers with an im…☆13Updated last month
- Repo for the Belebele dataset, a massively multilingual reading comprehension dataset.☆314Updated 2 months ago
- 🤗 Disaggregators: Curated data labelers for in-depth analysis.☆65Updated last year
- 🇮🇹 Italian BERT and ELECTRA models (incl. evaluation)☆17Updated 2 years ago
- A python package for benchmarking interpretability techniques on Transformers.☆211Updated last month
- A Word Level Transformer layer based on PyTorch and 🤗 Transformers.☆34Updated 9 months ago
- A python package to run inference with HuggingFace language and vision-language checkpoints wrapping many convenient features.☆25Updated last month
- ☆11Updated last year
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes☆184Updated last month
- UmBERTo: an Italian Language Model trained with Whole Word Masking.☆103Updated last year
- German Alpaca Dataset (Cleaned + Translated)☆23Updated last year
- A collection of datasets for language model pretraining, including scripts for downloading, preprocessing, and sampling.☆52Updated 3 months ago
- Let's build better datasets, together!☆202Updated 3 months ago
- Late Interaction Models Training & Retrieval☆158Updated last week
- Knowledge pills on Neural Search☆25Updated last year
- Fact checking baseline combining dense retrieval and textual entailment☆28Updated 10 months ago
- ☆115Updated last week
- A collection of Italian benchmarks for LLM evaluation☆20Updated last week
- MAFAND-MT☆53Updated 4 months ago
- Pipeline for pulling and processing online language model pretraining data from the web☆174Updated last year
- RaKUn 2.0 - A fast keyword detection algorithm☆64Updated 2 months ago
- A Python package for analyzing and transforming neural latent spaces.☆41Updated 2 months ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset.☆92Updated last year
- Simple Annotated implementation of GPT-NeoX in PyTorch☆111Updated 2 years ago
- Sentiment analysis and emotion classification for Italian using BERT (fine-tuning). Published at the WASSA workshop (EACL2021).☆24Updated 4 months ago