CoderPat / croissant-llm-training
Repository containing the code for training the CroissantLLM
☆21Updated last year
Alternatives and similar repositories for croissant-llm-training
Users who are interested in croissant-llm-training are comparing it to the libraries listed below
- A framework for few-shot evaluation of autoregressive language models.☆13Updated last year
- The robust European language model benchmark.☆133Updated this week
- Repository for the EM German Model☆112Updated last year
- Blindly query two conversational language models on tasks expressed in French and compare the results.☆45Updated this week
- Preconfiguration page of the OpenLLM-France community☆48Updated last year
- A library for working with prompt templates locally or on the Hugging Face Hub.☆50Updated 7 months ago
- Let's build better datasets, together!☆263Updated 10 months ago
- French instruction-following and chat models☆506Updated 10 months ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆62Updated last year
- A repository containing the code for translating popular LLM benchmarks to German.☆30Updated 2 years ago
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023☆163Updated 4 months ago
- ☆201Updated last week
- 🌱 EcoLogits tracks the energy consumption and environmental footprint of using generative AI models through APIs.☆225Updated last week
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data☆68Updated last year
- Code for collecting, processing, and preparing datasets for the Common Pile☆235Updated last month
- Easily embed, cluster and semantically label text datasets☆581Updated last year
- Late Interaction Models Training & Retrieval☆632Updated this week
- ☆136Updated 2 months ago
- 🤗 Benchmark Large Language Models Reliably On Your Data☆408Updated last month
- Dataset collection and preprocessing framework for NLP extreme multitask learning☆188Updated 3 months ago
- A Scandinavian Benchmark for sentence embeddings☆41Updated 5 months ago
- Convenience scripts to finetune (chat-)LLaMa3 and other models for any language☆316Updated last year
- A library for easily merging multiple LLM experts and efficiently training the merged LLM.☆494Updated last year
- Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models☆249Updated last year
- FENICE (Factuality Evaluation of Summarization based on Natural Language Inference and Claim Extraction) is a factuality-oriented metric …☆28Updated 11 months ago
- Guideline following Large Language Model for Information Extraction☆404Updated last year
- A compact LLM pretrained in 9 days by using high quality data☆332Updated 6 months ago
- A list of awesome open source projects in the machine learning field whose developers are mainly based in Germany☆48Updated last year
- 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data …☆211Updated this week
- ☆687Updated 6 months ago