dvmazur / mixtral-offloading
Run Mixtral-8x7B models in Colab or consumer desktops
☆2,288 Updated 5 months ago
Related projects:
- Tools for merging pretrained large language models. ☆4,501 Updated this week
- Training LLMs with QLoRA + FSDP ☆1,382 Updated last week
- Go ahead and axolotl questions ☆7,554 Updated this week
- A Native-PyTorch Library for LLM Fine-tuning ☆3,942 Updated this week
- To speed up LLM inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, which achieves up to 20x com… ☆4,435 Updated 3 weeks ago
- SGLang is a fast serving framework for large language models and vision language models. ☆5,121 Updated this week
- ☆2,652 Updated this week
- Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python. ☆5,506 Updated this week
- Reaching LLaMA2 Performance with 0.1M Dollars ☆955 Updated last month
- Robust recipes to align language models with human and AI preferences ☆4,481 Updated 3 weeks ago
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆2,739 Updated last week
- ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl… ☆2,106 Updated 3 weeks ago
- [ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling ☆1,378 Updated 2 months ago
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆3,484 Updated this week
- Easily use and train state-of-the-art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ☆2,817 Updated 2 weeks ago
- Code examples and resources for DBRX, a large language model developed by Databricks ☆2,496 Updated 4 months ago
- A unified evaluation framework for large language models ☆2,375 Updated last week
- Modeling, training, eval, and inference code for OLMo ☆4,399 Updated this week
- tiny vision language model ☆4,893 Updated 3 weeks ago
- [ICLR 2024] Efficient Streaming Language Models with Attention Sinks ☆6,537 Updated 2 months ago
- ☆870 Updated this week
- [ICML'24] Magicoder: Empowering Code Generation with OSS-Instruct ☆1,958 Updated 4 months ago
- Blazingly fast LLM inference. ☆3,406 Updated this week
- A blazing fast inference solution for text embeddings models ☆2,599 Updated this week
- A framework for few-shot evaluation of language models. ☆6,426 Updated this week
- High-quality datasets, tools, and concepts for LLM fine-tuning. ☆1,664 Updated last month
- An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm. ☆4,326 Updated last month
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ☆7,620 Updated 4 months ago
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆9,780 Updated this week
- Examples in the MLX framework ☆5,836 Updated this week