thu-ml/Adaptive-Sparse-Trainer

Official implementation for "Pruning Large Language Models with Semi-Structural Adaptive Sparse Training" (AAAI 2025)

☆19

Alternatives and similar repositories for Adaptive-Sparse-Trainer

Users who are interested in Adaptive-Sparse-Trainer are comparing it to the libraries listed below.

buyi-Yang / getQzonehistory
☆12 · Nov 13, 2024 · Updated last year
tiremoscode / dw-grupo58
☆20 · Nov 28, 2024 · Updated last year
abduvalimurodullayev1 / boilerplate_Drf
This is the boilerplate for django project. There are so many settings configurations
☆10 · Nov 7, 2025 · Updated 8 months ago
thu-ml / 2by4-pretrain-acc-examples
Code for "Accelerating Transformer Pre-training with 2:4 Sparsity"
☆28 · Dec 8, 2024 · Updated last year
Paramathic / slim
SLiM: One-shot Quantized Sparse Plus Low-rank Approximation of LLMs (ICML 2025)
☆36 · Nov 28, 2025 · Updated 7 months ago
GovardhaneNitin / smart-inventory
A smart inventory management system that includes real-time stock tracking, supplier management, predictive analytics for inventory forec…
☆16 · Apr 22, 2025 · Updated last year
stephenqz / OATS
Github Repo for OATS: Outlier-Aware Pruning through Sparse and Low Rank Decomposition
☆20 · Apr 16, 2025 · Updated last year
thu-ml / ReMoE
[ICLR2025] Codebase for "ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing", built on Megatron-LM.
☆118 · Dec 20, 2024 · Updated last year
yuezhouhu / residual-context-diffusion
[ICML 2026] Residual Context Diffusion (RCD): Repurposing discarded signals as structured priors for high-performance reasoning in dLLMs.
☆58 · Jun 28, 2026 · Updated 3 weeks ago
hgabor / nestjs-keret-2024
NestJS project template, configured with prisma and ejs
☆12 · Dec 1, 2024 · Updated last year
thu-ml / TetraJet-MXFP4Training
Pytorch implementation of "Oscillation-Reduced MXFP4 Training for Vision Transformers" on DeiT Model Pre-training
☆40 · May 4, 2026 · Updated 2 months ago
Adlik / model_zoo
☆11 · Dec 26, 2025 · Updated 6 months ago
fmfi-compbio / admm-pruning
☆30 · Jul 22, 2024 · Updated 2 years ago
tim-lawson / skip-middle
Learning to Skip the Middle Layers of Transformers
☆17 · Aug 7, 2025 · Updated 11 months ago
VITA-Group / WeLore
[ICML 2025] From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories and Applications
☆52 · Oct 30, 2025 · Updated 8 months ago
hyhuang00 / moe_inference
Code Repository for the NeurIPS 2024 Paper "Toward Efficient Inference for Mixture of Experts".
☆19 · Oct 30, 2024 · Updated last year
IST-DASLab / EvoPress
☆43 · Jun 14, 2026 · Updated last month
biomedical-cybernetics / Relative-importance-and-activation-pruning
☆60 · Jun 10, 2024 · Updated 2 years ago
jeffreyyu0602 / quantized-training
☆35 · Dec 22, 2025 · Updated 7 months ago
Su-my / TRAPO
The official repository for Trust-Region Adaptive Policy Optimization (TRAPO) – a novel hybrid framework designed to enhance large langua…
☆16 · Mar 2, 2026 · Updated 4 months ago
TUDa-HWAI / Basis_Sharing
☆23 · Oct 2, 2024 · Updated last year
zhuohaoyu / ORPS
☆15 · Jul 15, 2025 · Updated last year
AndreaGrandieri / ing-sw-2024-codex-naturalis
Project for the final exam of Software Engineering 2023-2024 at Politecnico di Milano
☆10 · Oct 19, 2024 · Updated last year
dragonjsq / -VPN
Free proxy, free VPN, a truly free VPN; shadowsocks, v2rey; official site: www.dragonvpn.cc
☆13 · Sep 4, 2024 · Updated last year
pengbohua / AngularGap
☆13 · Jul 20, 2023 · Updated 3 years ago
thu-ml / vidar
Official repo for vidar and vidarc: video foundation model for robotics.
☆42 · Dec 22, 2025 · Updated 7 months ago
haochengxi / Train_Transformers_with_INT4
☆157 · Jun 22, 2023 · Updated 3 years ago
yuezhouhu / adaspec
A selective knowledge distillation algorithm for efficient speculative decoders
☆39 · Nov 27, 2025 · Updated 7 months ago
jxzhn / supply-chain
Supply chain demo based on the FISCO-BCOS blockchain, with a backend built in node.js
☆10 · Jan 28, 2021 · Updated 5 years ago
EliasEsperanza / UES-API
Mapping API for the Universidad de El Salvador (UES), developed by students of the Facultad Multidisciplinaria Oriental. Proporcio…
☆16 · Oct 3, 2025 · Updated 9 months ago
hustvl / TBCM
Image-Free Timestep Distillation via Continuous-Time Consistency with Trajectory-Sampled Pairs
☆21 · Dec 16, 2025 · Updated 7 months ago
StarDewXXX / O1-Pruner
Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
☆99 · Feb 21, 2025 · Updated last year
xmu-xiaoma666 / Multimodal-Open-O1
Multimodal Open-O1 (MO1) is designed to enhance the accuracy of inference models by utilizing a novel prompt-based approach. This tool wo…
☆28 · Sep 25, 2024 · Updated last year
dtch1997 / steering-bench
Official codebase for "Analyzing the Generalization and Reliability of Steering Vectors"
☆22 · Dec 14, 2024 · Updated last year
Lucky-Lance / SPP
[ICML 2024] SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models
☆22 · May 28, 2024 · Updated 2 years ago
moayedellah / Network-Security
A curated collection of courses, videos, and resources to master network security from the ground up.
☆11 · Jan 6, 2025 · Updated last year
rezashkv / diffusion_pruning
[ICLR 2025] Adaptive prompt tailored pruning of T2I diffusion models.
☆15 · Feb 1, 2025 · Updated last year
duterscmy / CD-MoE
Official PyTorch implementation of CD-MOE
☆12 · Mar 18, 2026 · Updated 4 months ago
ShabanMughal / Robot-Ai
☆22 · Jan 1, 2026 · Updated 6 months ago