stanford-crfm / mistral
Mistral: A strong, northwesterly wind: Framework for transparent and accessible large-scale language model training, built with Hugging Face 🤗 Transformers.
★577 · Nov 10, 2023 · Updated 2 years ago
Alternatives and similar repositories for mistral
Users who are interested in mistral are comparing it to the libraries listed below.
- ★22 · Aug 31, 2021 · Updated 4 years ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing · ★49 · Jan 27, 2022 · Updated 4 years ago
- Parallelformers: An Efficient Model Parallelization Toolkit for Deployment · ★791 · Apr 24, 2023 · Updated 2 years ago
- Exploring Few-Shot Adaptation of Language Models with Tables · ★24 · Aug 22, 2022 · Updated 3 years ago
- jiant is an nlp toolkit · ★1,675 · Jul 6, 2023 · Updated 2 years ago
- ReConsider is a re-ranking model that re-ranks the top-K (passage, answer-span) predictions of an Open-Domain QA Model like DPR (Karpukhi… · ★49 · Apr 26, 2021 · Updated 4 years ago
- Official Pytorch Implementation of Length-Adaptive Transformer (ACL 2021) · ★102 · Nov 2, 2020 · Updated 5 years ago
- Code for T-Few from "Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning" · ★457 · Sep 6, 2023 · Updated 2 years ago
- Fast, general, and tested differentiable structured prediction in PyTorch · ★1,123 · Apr 20, 2022 · Updated 3 years ago
- An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries · ★7,382 · Feb 3, 2026 · Updated 2 weeks ago
- Beyond Accuracy: Behavioral Testing of NLP models with CheckList · ★2,048 · Jan 9, 2024 · Updated 2 years ago
- Library for Knowledge Intensive Language Tasks · ★963 · Mar 31, 2022 · Updated 3 years ago
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP · ★58 · Jul 28, 2022 · Updated 3 years ago
- ★2,947 · Jan 15, 2026 · Updated last month
- A python library for highly configurable transformers - easing model architecture search and experimentation. · ★49 · Nov 30, 2021 · Updated 4 years ago
- NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations · ★786 · May 19, 2024 · Updated last year
- OSLO: Open Source framework for Large-scale model Optimization · ★309 · Aug 25, 2022 · Updated 3 years ago
- This repository contains the code for "Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference" · ★1,628 · Jun 12, 2023 · Updated 2 years ago
- ★75 · Jul 2, 2021 · Updated 4 years ago
- FastFormers - highly efficient transformer models for NLU · ★709 · Mar 21, 2025 · Updated 10 months ago
- Task-based datasets, preprocessing, and evaluation for sequence models. · ★594 · Feb 3, 2026 · Updated last week
- Reproduce results and replicate training for T0 (Multitask Prompted Training Enables Zero-Shot Task Generalization) · ★465 · Nov 5, 2022 · Updated 3 years ago
- Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models · ★3,203 · Jul 19, 2024 · Updated last year
- [ACL 2021] Learning Dense Representations of Phrases at Scale; EMNLP'2021: Phrase Retrieval Learns Passage Retrieval, Too https://arxiv.o… · ★606 · Jun 15, 2022 · Updated 3 years ago
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode · ★111 · Aug 31, 2022 · Updated 3 years ago
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights · ★19 · Oct 9, 2022 · Updated 3 years ago
- ★12 · Jan 2, 2022 · Updated 4 years ago
- ★221 · Jun 8, 2020 · Updated 5 years ago
- Binary Passage Retriever (BPR) - an efficient passage retriever for open-domain question answering · ★175 · Jun 6, 2021 · Updated 4 years ago
- Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀 · ★1,689 · Oct 23, 2024 · Updated last year
- ★92 · Sep 29, 2021 · Updated 4 years ago
- ★533 · Feb 13, 2024 · Updated 2 years ago
- This is the official repository for NAACL 2021, "XOR QA: Cross-lingual Open-Retrieval Question Answering". · ★80 · Jun 3, 2021 · Updated 4 years ago
- Toolkit for creating, sharing and using natural language prompts. · ★2,997 · Oct 23, 2023 · Updated 2 years ago
- XTREME is a benchmark for the evaluation of the cross-lingual generalization ability of pre-trained multilingual models that covers 40 ty… · ★650 · Jan 4, 2023 · Updated 3 years ago
- An efficient implementation of the popular sequence models for text generation, summarization, and translation tasks. https://arxiv.org/p… · ★433 · Aug 17, 2022 · Updated 3 years ago
- Silly twitter torch implementations. · ★46 · Oct 14, 2022 · Updated 3 years ago
- Variable-order CRFs with structure learning · ★17 · Aug 1, 2024 · Updated last year
- ★99 · Jul 25, 2023 · Updated 2 years ago