CoderPat / croissant-llm-training
Repository containing the code for training the CroissantLLM
☆21Updated last year
Alternatives and similar repositories for croissant-llm-training
Users who are interested in croissant-llm-training are comparing it to the libraries listed below.
- Blindly query two conversational language models on tasks expressed in French and compare the results.☆37Updated this week
- The robust European language model benchmark.☆106Updated this week
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆59Updated 10 months ago
- Preconfiguration page of the OpenLLM-France community☆46Updated last year
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023☆138Updated 3 weeks ago
- Let's build better datasets, together!☆260Updated 6 months ago
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes☆207Updated last month
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️☆177Updated last month
- Backend resources for Albert. Albert is a conversational agent that uses official French data sources to answer administrative agents qu…☆121Updated 2 months ago
- A repository of instructions in French to fine-tune LLMs☆17Updated 2 years ago
- German Alpaca Dataset (Cleaned + Translated)☆25Updated 2 years ago
- A framework for few-shot evaluation of autoregressive language models.☆13Updated last year
- A French sequence-to-sequence pretrained model☆61Updated 2 years ago
- Manage scalable open LLM inference endpoints in Slurm clusters☆261Updated 11 months ago
- A list of awesome open source projects in the machine learning field, whose developers are mainly based in Germany☆42Updated 9 months ago
- A repository containing the code for translating popular LLM benchmarks to German.☆25Updated last year
- Dataset collection and preprocessing framework for NLP extreme multitask learning☆184Updated 5 months ago
- Official implementation of "GPT or BERT: why not both?"☆53Updated 2 weeks ago
- A library for working with prompt templates locally or on the Hugging Face Hub.☆46Updated 3 months ago
- An introduction to LLM Sampling☆78Updated 6 months ago
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data☆68Updated last year
- ☆67Updated last year
- Fine-tune ModernBERT on a large Dataset with Custom Tokenizer Training☆65Updated 4 months ago
- High-level library for batched embeddings generation, blazingly fast web-based RAG and quantized indexes processing ⚡☆66Updated 7 months ago
- Enhancing Translation with RAG-Powered Large Language Models☆80Updated 3 months ago
- ☆126Updated last week
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o…☆137Updated last month
- Efficiently find the best-suited language model (LM) for your NLP task☆124Updated 3 weeks ago
- Optimus is a flexible and scalable framework built to train language models efficiently across diverse hardware configurations, including…☆56Updated 2 months ago
- Generalist and Lightweight Model for Text Classification☆134Updated last week