CoderPat / croissant-llm-training
Repository containing the code for training the CroissantLLM
☆21Updated last year
Alternatives and similar repositories for croissant-llm-training
Users who are interested in croissant-llm-training are comparing it to the libraries listed below
- A framework for few-shot evaluation of autoregressive language models.☆13Updated last year
- The robust European language model benchmark.☆136Updated this week
- Blindly query two conversational language models on tasks expressed in French and compare the results.☆52Updated this week
- Repository for the EM German Model☆112Updated 2 years ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆62Updated last year
- A library for working with prompt templates locally or on the Hugging Face Hub.☆50Updated 8 months ago
- French instruction-following and chat models☆506Updated 11 months ago
- Let's build better datasets, together!☆264Updated 11 months ago
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023☆171Updated 5 months ago
- 🤗 Benchmark Large Language Models Reliably On Your Data☆411Updated this week
- ☆690Updated 6 months ago
- ☆138Updated 3 months ago
- A repository containing the code for translating popular LLM benchmarks to German.☆30Updated 2 years ago
- Toolkit for attaching, training, saving and loading of new heads for transformer models☆291Updated 8 months ago
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️☆191Updated 6 months ago
- Preconfiguration page of the OpenLLM-France community☆49Updated last year
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data☆68Updated last year
- Automatically evaluate your LLMs in Google Colab☆670Updated last year
- A list of awesome open source projects in the machine learning field, whose developers are mainly based in Germany☆48Updated last year
- Enhancing Translation with RAG-Powered Large Language Models☆84Updated 2 months ago
- Efficiently find the best-suited language model (LM) for your NLP task☆132Updated 3 months ago
- Dataset collection and preprocessing framework for NLP extreme multitask learning☆188Updated 4 months ago
- Late Interaction Models Training & Retrieval☆652Updated last week
- Manage scalable open LLM inference endpoints in Slurm clusters☆277Updated last year
- code for training & evaluating Contextual Document Embedding models☆200Updated 6 months ago
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes☆213Updated 2 months ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker☆123Updated 3 weeks ago
- German Alpaca Dataset (Cleaned + Translated)☆26Updated 2 years ago
- Low-Rank adapter extraction for fine-tuned transformers models☆179Updated last year
- Easily embed, cluster and semantically label text datasets☆584Updated last year