CoderPat / croissant-llm-training
Repository containing the code for training the CroissantLLM
☆21Updated last year
Alternatives and similar repositories for croissant-llm-training
Users who are interested in croissant-llm-training are comparing it to the libraries listed below.
- Preconfiguration page for the OpenLLM-France community☆46Updated last year
- A collection of datasets for language model pretraining, including scripts for downloading, preprocessing, and sampling.☆59Updated 9 months ago
- The robust European language model benchmark.☆101Updated this week
- A framework for few-shot evaluation of autoregressive language models.☆13Updated last year
- A repository of instructions in French to fine-tune LLMs☆17Updated last year
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️☆172Updated last week
- A library for working with prompt templates locally or on the Hugging Face Hub.☆45Updated 2 months ago
- ☆90Updated 5 months ago
- Repository for the EM German Model☆110Updated last year
- A repository containing the code for translating popular LLM benchmarks to German.☆25Updated last year
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023☆131Updated 5 months ago
- Blindly query two conversational language models on tasks expressed in French and compare the results.☆20Updated last week
- A list of awesome open source projects in the machine learning field whose developers are mainly based in Germany☆43Updated 8 months ago
- German Alpaca Dataset (Cleaned + Translated)☆24Updated 2 years ago
- Backend resources for Albert. Albert is a conversational agent that uses official French data sources to answer administrative agents qu…☆121Updated last month
- 🌱 EcoLogits tracks the energy consumption and environmental footprint of using generative AI models through APIs.☆166Updated this week
- ☆67Updated last year
- ☆113Updated 5 months ago
- Robust and fast topic models with sentence-transformers.☆48Updated last week
- Let's build better datasets, together!☆259Updated 4 months ago
- Efficiently find the best-suited language model (LM) for your NLP task☆121Updated 2 weeks ago
- Evaluate language models using multiple choice items☆13Updated this week
- FENICE (Factuality Evaluation of Summarization based on Natural Language Inference and Claim Extraction) is a factuality-oriented metric …☆24Updated 5 months ago
- ☆45Updated 3 months ago
- A Python library aimed at dissecting and augmenting NER training data.☆58Updated 2 years ago
- Late Interaction Models Training & Retrieval☆328Updated this week
- A French sequence-to-sequence pretrained model☆60Updated 2 years ago
- Optimus is a flexible and scalable framework built to train language models efficiently across diverse hardware configurations, including…☆54Updated last month
- A BERT-based application for reusable text classification at scale☆38Updated last year
- Easily embed, cluster and semantically label text datasets☆534Updated last year