tcapelle / mixtral
Mixtral finetuning
☆19Updated last year
Alternatives and similar repositories for mixtral
Users who are interested in mixtral are comparing it to the libraries listed below.
- ☆87Updated last year
- Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and te…☆42Updated last year
- Simple GRPO scripts and configurations.☆59Updated 6 months ago
- QLoRA for Masked Language Modeling☆22Updated last year
- QLoRA with Enhanced Multi GPU Support☆37Updated 2 years ago
- ☆49Updated 6 months ago
- Code for NeurIPS LLM Efficiency Challenge☆59Updated last year
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models☆70Updated last year
- Chat Markup Language conversation library☆55Updated last year
- Truly flash implementation of DeBERTa disentangled attention mechanism.☆63Updated 2 months ago
- ☆22Updated last year
- ☆23Updated 2 years ago
- NLP with Rust for Python 🦀🐍☆64Updated 2 months ago
- ☆79Updated last year
- Testing paligemma2 finetuning on reasoning dataset☆18Updated 7 months ago
- A sample pattern for running CI tests on Modal☆18Updated 3 months ago
- ☆64Updated last month
- An introduction to LLM Sampling☆79Updated 7 months ago
- A pipeline for using API calls to agnostically convert unstructured data into structured training data☆30Updated 10 months ago
- ☆47Updated last year
- A library for squeakily cleaning and filtering language datasets.☆47Updated 2 years ago
- Comprehensive analysis of the differences in performance between QLoRA, LoRA, and full finetunes.☆82Updated last year
- Supervised instruction finetuning for LLM with HF trainer and Deepspeed☆35Updated 2 years ago
- ☆48Updated last year
- Using multiple LLMs for ensemble Forecasting☆16Updated last year
- ☆53Updated 9 months ago
- ☆69Updated 11 months ago
- Score LLM pretraining data with classifiers☆55Updated last year
- PyLate efficient inference engine☆62Updated 3 weeks ago
- Supercharge huggingface transformers with model parallelism.☆77Updated 2 weeks ago