CoderPat / croissant-llm-training
Repository containing the code for training CroissantLLM
☆21Updated last year
Alternatives and similar repositories for croissant-llm-training
Users who are interested in croissant-llm-training are comparing it to the libraries listed below.
- A list of awesome open source projects in the machine learning field, whose developers are mainly based in Germany☆46Updated last year
- The robust European language model benchmark.☆123Updated this week
- A framework for few-shot evaluation of autoregressive language models.☆13Updated last year
- Blindly query two conversational language models on tasks expressed in French and compare the results.☆42Updated this week
- French instruction-following and chat models☆507Updated 9 months ago
- Let's build better datasets, together!☆263Updated 9 months ago
- Repository for the EM German Model☆112Updated last year
- Toolkit for attaching, training, saving and loading of new heads for transformer models☆287Updated 6 months ago
- SpanMarker for Named Entity Recognition☆451Updated 8 months ago
- Late Interaction Models Training & Retrieval☆584Updated this week
- 🦖 X—LLM: Cutting Edge & Easy LLM Finetuning☆407Updated last year
- ☆135Updated last month
- Easily embed, cluster and semantically label text datasets☆573Updated last year
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes☆212Updated this week
- A collection of datasets for language model pretraining, including scripts for downloading, preprocessing, and sampling.☆61Updated last year
- Efficiently find the best-suited language model (LM) for your NLP task☆127Updated last month
- 🤗 Benchmark Large Language Models Reliably On Your Data☆392Updated 2 weeks ago
- German Alpaca Dataset (Cleaned + Translated)☆26Updated 2 years ago
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023☆160Updated 3 months ago
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data☆68Updated last year
- ☆39Updated last year
- Code for training & evaluating Contextual Document Embedding models☆197Updated 4 months ago
- ☆367Updated last year
- 🌱 EcoLogits tracks the energy consumption and environmental footprint of using generative AI models through APIs.☆222Updated this week
- Preconfiguration page of the OpenLLM-France community☆47Updated last year
- Manage scalable open LLM inference endpoints in Slurm clusters☆273Updated last year
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️☆186Updated 3 months ago
- Optimus is a flexible and scalable framework built to train language models efficiently across diverse hardware configurations, including…☆67Updated 2 months ago
- The website for Danish Foundation Models, a project for training foundational Danish language models.☆74Updated 3 weeks ago
- A collection of LogitsProcessors to customize and enhance LLM behavior for specific tasks.☆356Updated 2 months ago