Converting Mixtral-8x7B to Mixtral-[1~7]x7B
☆22 Mar 4, 2024 Updated 2 years ago
Alternatives and similar repositories for mixtral_spliter
Users that are interested in mixtral_spliter are comparing it to the libraries listed below.
- ☆173 Updated this week
- ☆107 May 10, 2026 Updated last month
- Terminal UI for NVIDIA Nsight Systems profiles — timeline viewer, kernel navigator, NVTX hierarchy ☆58 Updated this week
- Expert Specialization MoE Solution based on CUTLASS ☆27 Apr 14, 2026 Updated 2 months ago
- ☆13 Jan 21, 2024 Updated 2 years ago
- Official repository for ICML 2024 paper "MoRe Fine-Tuning with 10x Fewer Parameters" ☆22 Oct 14, 2025 Updated 8 months ago
- Code for experiments on transformers using Markovian data. ☆22 Nov 22, 2024 Updated last year
- Official implementation of the paper "Pretraining Language Models to Ponder in Continuous Space" ☆26 Jul 21, 2025 Updated 10 months ago
- ☆21 Apr 16, 2025 Updated last year
- Exploring Evolution-aware & free protein language models as protein function predictors ☆63 Sep 28, 2024 Updated last year
- Data for evaluating GPT-4V ☆11 Oct 26, 2023 Updated 2 years ago
- [ICML 2025] Parameter-Efficient Fine-Tuning of State Space Models ☆25 Jun 9, 2025 Updated last year
- ☆19 Mar 25, 2025 Updated last year
- Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks ☆32 Jul 9, 2024 Updated last year
- Official Pytorch implementation of "Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models" [IEEE ICASSP 202… ☆37 Mar 10, 2026 Updated 3 months ago
- ☆75 Mar 7, 2024 Updated 2 years ago
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆16 Apr 21, 2025 Updated last year
- OPUS-Rota4: A Gradient-Based Protein Side-Chain Modeling Framework Assisted by Deep Learning-Based Predictors ☆11 Apr 14, 2022 Updated 4 years ago
- Code release to accompany paper "Geometry-Aware Gradient Algorithms for Neural Architecture Search." ☆25 Oct 7, 2020 Updated 5 years ago
- ☆14 Jun 20, 2022 Updated 3 years ago
- Code for the paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers" with GPT-J implementation. ☆15 Mar 22, 2023 Updated 3 years ago
- Codes for Evolving Plastic ANNs ☆14 Dec 18, 2022 Updated 3 years ago
- ☆21 Updated this week
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆110 Sep 18, 2025 Updated 8 months ago
- Source code for IRL-INR (ICML 2023) ☆20 May 27, 2023 Updated 3 years ago
- FL-Tuning ☆12 Jul 11, 2022 Updated 3 years ago
- ☆21 Dec 14, 2024 Updated last year
- ☆12 May 6, 2022 Updated 4 years ago
- Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models ☆15 Nov 4, 2023 Updated 2 years ago
- [NeurIPS 2023] Github repository for "Composing Parameter-Efficient Modules with Arithmetic Operations" ☆61 Nov 26, 2023 Updated 2 years ago
- Code for paper: Unraveling the Shift of Visual Information Flow in MLLMs: From Phased Interaction to Efficient Inference ☆14 Jun 7, 2025 Updated last year
- A simple program scheduler for your code on different devices. ☆12 Mar 8, 2026 Updated 3 months ago
- Computational predictor of protein intrinsic disorder and its functions ☆11 Dec 4, 2023 Updated 2 years ago
- ☆11 Sep 25, 2020 Updated 5 years ago
- ☆14 Nov 14, 2023 Updated 2 years ago
- ☆16 Apr 11, 2022 Updated 4 years ago
- [EMNLP'23] Code for Generating Data for Symbolic Language with Large Language Models ☆18 Oct 21, 2023 Updated 2 years ago
- ☆15 Aug 21, 2023 Updated 2 years ago
- A face detection base on faster-rcnn.pytorch ☆10 Feb 9, 2018 Updated 8 years ago