CoderPat / croissant-llm-training
Repository containing the code for training CroissantLLM
☆21Updated last year
Alternatives and similar repositories for croissant-llm-training
Users who are interested in croissant-llm-training compare it to the libraries listed below.
- The robust European language model benchmark.☆145Updated this week
- Repository for the EM German Model☆112Updated 2 years ago
- Preconfiguration page of the OpenLLM-France community☆49Updated last year
- A framework for few-shot evaluation of autoregressive language models.☆13Updated last year
- Late Interaction Models Training & Retrieval☆670Updated this week
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023☆176Updated last month
- Let's build better datasets, together!☆265Updated last year
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️☆194Updated 6 months ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆64Updated last year
- A Scandinavian Benchmark for sentence embeddings☆44Updated 2 weeks ago
- ☆39Updated 2 years ago
- Blindly query two conversational language models on tasks expressed in French and compare the results.☆56Updated this week
- Efficiently find the best-suited language model (LM) for your NLP task☆132Updated 4 months ago
- A library for easily merging multiple LLM experts and efficiently training the merged LLM.☆500Updated last year
- Toolkit for attaching, training, saving and loading of new heads for transformer models☆293Updated 9 months ago
- A library for working with prompt templates locally or on the Hugging Face Hub.☆51Updated 9 months ago
- Easily embed, cluster and semantically label text datasets☆584Updated last year
- Code for the MTEB Arena☆24Updated 5 months ago
- French instruction-following and chat models☆506Updated last year
- FENICE (Factuality Evaluation of Summarization based on Natural Language Inference and Claim Extraction) is a factuality-oriented metric …☆28Updated last year
- 🤗 Benchmark Large Language Models Reliably On Your Data☆419Updated this week
- code for training & evaluating Contextual Document Embedding models☆201Updated 7 months ago
- Official inference library for pre-processing of Mistral models☆830Updated last week
- My personal site☆79Updated last year
- ☆138Updated 4 months ago
- Optimus is a flexible and scalable framework built to train language models efficiently across diverse hardware configurations, including…☆67Updated 2 weeks ago
- Manage scalable open LLM inference endpoints in Slurm clusters☆278Updated last year
- Crowd-sourced lists of urls to help Common Crawl crawl under-resourced languages. See https://github.com/commoncrawl/web-languages-code/ …☆66Updated 2 weeks ago
- Interpretability for sequence generation models 🐛 🔍☆450Updated 2 weeks ago
- Notebooks for training universal 0-shot classifiers on many different tasks☆137Updated 11 months ago