Converting Mixtral-8x7B to Mixtral-[1~7]x7B
☆22 · Mar 4, 2024 · Updated 2 years ago
Alternatives and similar repositories for mixtral_spliter
Users that are interested in mixtral_spliter are comparing it to the libraries listed below.
- 7000+FPS face alignment ☆22 · Jun 30, 2017 · Updated 8 years ago
- An SD upscale script made to work with an inpainting model. Supports tiling. ☆11 · Mar 13, 2023 · Updated 3 years ago
- RBF Drivers for Blender ☆10 · Oct 14, 2022 · Updated 3 years ago
- Minimal implementation of TokenFormer for inference and learning ☆13 · Nov 6, 2024 · Updated last year
- Code for experiments on transformers using Markovian data. ☆22 · Nov 22, 2024 · Updated last year
- Official implementation of the paper "Pretraining Language Models to Ponder in Continuous Space" ☆26 · Jul 21, 2025 · Updated 10 months ago
- ☆21 · Apr 16, 2025 · Updated last year
- Exploring Evolution-aware & free protein language models as protein function predictors ☆63 · Sep 28, 2024 · Updated last year
- image demoireing, moire synthesis ☆16 · Apr 25, 2024 · Updated 2 years ago
- Data for evaluating GPT-4V ☆11 · Oct 26, 2023 · Updated 2 years ago
- [ICML 2025] Parameter-Efficient Fine-Tuning of State Space Models ☆25 · Jun 9, 2025 · Updated 11 months ago
- Self Reproduction Code of Paper "Reducing Transformer Key-Value Cache Size with Cross-Layer Attention (MIT CSAIL) ☆17 · May 24, 2024 · Updated 2 years ago
- Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks ☆32 · Jul 9, 2024 · Updated last year
- Pytorch implementation for paper "Selective Encoding for Abstractive Sentence Summarization" ☆12 · Mar 17, 2019 · Updated 7 years ago
- Official Pytorch implementation of "Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models" [IEEE ICASSP 202… ☆37 · Mar 10, 2026 · Updated 2 months ago
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆16 · Apr 21, 2025 · Updated last year
- Open Source + Multilingual MLLM + Fine-tuning + Distillation + More efficient models and learning + ? ☆18 · Jan 31, 2025 · Updated last year
- [COLM'25] A Controlled Study on Long Context Extension and Generalization in LLMs ☆65 · Mar 9, 2026 · Updated 2 months ago
- ☆14 · Jun 20, 2022 · Updated 3 years ago
- Source code of the proposed method MulT-TTE in the paper "Multi-faceted Route Representation Learning for Travel Time Estimation" ☆15 · Apr 7, 2025 · Updated last year
- Code for the paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers" with GPT-J implementation. ☆15 · Mar 22, 2023 · Updated 3 years ago
- Codes for Evolving Plastic ANNs ☆14 · Dec 18, 2022 · Updated 3 years ago
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆110 · Sep 18, 2025 · Updated 8 months ago
- FL-Tuning ☆12 · Jul 11, 2022 · Updated 3 years ago
- ☆21 · Dec 14, 2024 · Updated last year
- ☆12 · May 6, 2022 · Updated 4 years ago
- Chinese Version of ACL 2020 PC Blogs (ACL 2020程序委员会博文中文版) ☆15 · Apr 15, 2020 · Updated 6 years ago
- natural annotated text-category pairs for text classification ☆10 · Sep 10, 2021 · Updated 4 years ago
- [NeurIPS 2023] Github repository for "Composing Parameter-Efficient Modules with Arithmetic Operations" ☆61 · Nov 26, 2023 · Updated 2 years ago
- Face detection using Multi-scale Block Local Binary Pattern algorithm - optimized with OpenCL/OpenMP - Depreciated - pls use convolutiona… ☆11 · Jul 16, 2017 · Updated 8 years ago
- Code for paper: Unraveling the Shift of Visual Information Flow in MLLMs: From Phased Interaction to Efficient Inference ☆14 · Jun 7, 2025 · Updated 11 months ago
- Computational predictor of protein intrinsic disorder and its functions ☆10 · Dec 4, 2023 · Updated 2 years ago
- ☆11 · Sep 25, 2020 · Updated 5 years ago
- ☆14 · Nov 14, 2023 · Updated 2 years ago
- [EMNLP 2022] Code for our paper “ZeroGen: Efficient Zero-shot Learning via Dataset Generation”. ☆16 · Feb 18, 2022 · Updated 4 years ago
- Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward ☆60 · Nov 27, 2025 · Updated 5 months ago
- [EMNLP'23] Code for Generating Data for Symbolic Language with Large Language Models ☆18 · Oct 21, 2023 · Updated 2 years ago
- ☆15 · Aug 21, 2023 · Updated 2 years ago
- A face detection base on faster-rcnn.pytorch ☆10 · Feb 9, 2018 · Updated 8 years ago