dvmazur / mixtral-offloading
Run Mixtral-8x7B models in Colab or consumer desktops
☆2,312 · Updated last year
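As a rough illustration of what the tagline describes (running Mixtral-8x7B on Colab or consumer hardware), below is a minimal, hypothetical sketch using Hugging Face transformers with bitsandbytes 4-bit quantization and automatic CPU offload. It is not mixtral-offloading's own API; the model ID, quantization settings, and prompt are assumptions used only to show the general technique.

```python
# Hypothetical sketch: run Mixtral-8x7B on limited GPU memory by combining
# 4-bit quantization (bitsandbytes) with automatic GPU/CPU offload.
# NOTE: this is NOT the mixtral-offloading API, just a generic illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # assumed model checkpoint

# NF4 4-bit quantization shrinks the expert weights; layers that still do not
# fit on the GPU are placed on CPU RAM by the device map.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",        # split layers across GPU and CPU automatically
    offload_folder="offload", # spill remaining weights to disk if RAM is tight
)

prompt = "Explain mixture-of-experts offloading in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The design point this sketch shares with offloading approaches is that only the layers needed for the current step must sit in GPU memory; the rest can live in CPU RAM (or on disk) at the cost of slower generation.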
Alternatives and similar repositories for mixtral-offloading
Users interested in mixtral-offloading are comparing it to the libraries listed below.
- Tools for merging pretrained large language models. ☆5,853 · Updated last week
- Reaching LLaMA2 Performance with 0.1M Dollars ☆983 · Updated 11 months ago
- Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python. ☆5,998 · Updated 2 months ago
- ☆2,973 · Updated 9 months ago
- ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl… ☆2,167 · Updated 8 months ago
- Training LLMs with QLoRA + FSDP ☆1,487 · Updated 7 months ago
- [ICLR 2024] Efficient Streaming Language Models with Attention Sinks ☆6,915 · Updated 11 months ago
- PyTorch native post-training library ☆5,287 · Updated this week
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones ☆1,290 · Updated last year
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ☆8,589 · Updated last year
- Modeling, training, eval, and inference code for OLMo ☆5,702 · Updated last week
- High-speed Large Language Model Serving for Local Deployment ☆8,224 · Updated 4 months ago
- A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI ☆767 · Updated last year
- ☆4,088 · Updated last year
- An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm. ☆4,877 · Updated 2 months ago
- Official PyTorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.p… ☆1,265 · Updated last month
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆4,216 · Updated 3 weeks ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆2,426 · Updated this week
- Robust recipes to align language models with human and AI preferences ☆5,235 · Updated last month
- [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration ☆3,101 · Updated 2 weeks ago
- An Open-source Toolkit for LLM Development ☆2,784 · Updated 5 months ago
- ☆447 · Updated last year
- [ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs. ☆2,376 · Updated this week
- ☆980 · Updated 4 months ago
- A family of open-source Mixture-of-Experts (MoE) Large Language Models ☆1,545 · Updated last year
- ☆2,527 · Updated last year
- [ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding ☆1,258 · Updated 3 months ago
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI ☆1,386 · Updated last year
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆2,773 · Updated this week
- Official inference library for Mistral models ☆10,307 · Updated 3 months ago