stanford-crfm / mistralLinks

Mistral: A strong, northwesterly wind: Framework for transparent and accessible large-scale language model training, built with Hugging Face 🤗 Transformers.

☆577

Alternatives and similar repositories for mistral

Users that are interested in mistral are comparing it to the libraries listed below

Sorting:

google / seqio
Task-based datasets, preprocessing, and evaluation for sequence models.
☆589Updated 2 weeks ago
craffel / llm-seminar
Seminar on Large Language Models (COMP790-101 at UNC Chapel Hill, Fall 2022)
☆313Updated 3 years ago
bigscience-workshop / t-zero
Reproduce results and replicate training fo T0 (Multitask Prompted Training Enables Zero-Shot Task Generalization)
☆462Updated 3 years ago
lucidrains / RETRO-pytorch
Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch
☆876Updated 2 years ago
huggingface / large_language_model_training_playbook
An open collection of implementation tips, tricks and resources for training large language models
☆488Updated 2 years ago
allenai / tango
Organize your experiments into discrete steps that can be cached and reused throughout the lifetime of your research project.
☆565Updated last year
inverse-scaling / prize
A prize for finding tasks that cause large language models to show inverse scaling
☆619Updated 2 years ago
microsoft / adaptive-testing
Find and fix bugs in natural language machine learning models using adaptive testing.
☆188Updated last year
CarperAI / cheese
Used for adaptive human in the loop evaluation of language and embedding models.
☆308Updated 2 years ago
IntelLabs / academic-budget-bert
Repository containing code for "How to Train BERT with an Academic Budget" paper
☆316Updated 2 years ago
huggingface / olm-datasets
Pipeline for pulling and processing online language model pretraining data from the web
☆178Updated 2 years ago
google-research / cascades
Python library which enables complex compositions of language models such as scratchpads, chain of thought, tool use, selection-inference…
☆215Updated 5 months ago
HazyResearch / ama_prompting
Ask Me Anything language model prompting
☆546Updated 2 years ago
r-three / t-few
Code for T-Few from "Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning"
☆456Updated 2 years ago
r-three / git-theta
git extension for {collaborative, communal, continual} model development
☆216Updated last year
google / flaxformer
☆363Updated last year
bigscience-workshop / lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
☆105Updated 2 years ago
zeno-ml / zeno-build
Build, evaluate, understand, and fix LLM-based apps
☆492Updated last year
inseq-team / inseq
Interpretability for sequence generation models 🐛 🔍
☆447Updated last month
GEM-benchmark / NL-Augmenter
NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations
☆786Updated last year
mlfoundations / open_lm
A repository for research on medium sized language models.
☆520Updated 5 months ago
bigscience-workshop / bigscience
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
☆1,006Updated last year
huggingface / datablations
Scaling Data-Constrained Language Models
☆342Updated 5 months ago
llm-efficiency-challenge / neurips_llm_efficiency_challenge
NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day
☆258Updated 2 years ago
allenai / natural-instructions
Expanding natural instructions
☆1,024Updated last year
mega002 / lm-debugger
The official code of LM-Debugger, an interactive tool for inspection and intervention in transformer-based language models.
☆180Updated 3 years ago
facebookresearch / anli
Adversarial Natural Language Inference Benchmark
☆396Updated 3 years ago
google-deepmind / pg19
☆249Updated 5 years ago
google-research / prompt-tuning
Original Implementation of Prompt Tuning from Lester, et al, 2021
☆698Updated 8 months ago
facebookresearch / Mephisto
A suite of tools for managing crowdsourcing tasks from the inception through to data packaging for research use.
☆312Updated 11 months ago