VITA-Group/Junk_DNA_Hypothesis

[ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, Souvik Kundu, Zhangyang Wang

☆16

Alternatives and similar repositories for Junk_DNA_Hypothesis

Users who are interested in Junk_DNA_Hypothesis are comparing it to the libraries listed below.

IlanPrice / DCTpS
Code for testing DCT plus Sparse (DCTpS) networks
☆14 · Jun 15, 2021 · Updated 5 years ago
duterscmy / CD-MoE
Official PyTorch implementation of CD-MOE
☆12 · Mar 18, 2026 · Updated 4 months ago
pixeli99 / DSify
Boosting Driving Scene Understanding with Advanced Vision-Language Models
☆33 · May 19, 2023 · Updated 3 years ago
tau-nlp / zero_scrolls
Running inference on the ZeroSCROLLS benchmark
☆22 · Apr 18, 2024 · Updated 2 years ago
VITA-Group / FreeTickets
[ICLR 2022] "Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity" by Shiwei Liu,…
☆27 · Jun 15, 2022 · Updated 4 years ago
smpanaro / apple-silicon-4bit-quant
Supporting code for "LLMs for your iPhone: Whole-Tensor 4 Bit Quantization"
☆11 · Mar 31, 2024 · Updated 2 years ago
VITA-Group / llm-kick
[ICLR 2024] Jaiswal, A., Gan, Z., Du, X., Zhang, B., Wang, Z., & Yang, Y. Compressing LLMs: The Truth Is Rarely Pure and Never Simple.
☆27 · Apr 21, 2025 · Updated last year
TianjinYellow / EdgeDeviceLLMCompetition-Starting-Kit
☆42 · Oct 31, 2024 · Updated last year
VITA-Group / ramanujan-on-pai
[ICLR 2023] "Revisiting Pruning At Initialization Through The Lens of Ramanujan Graph" by Duc Hoang, Shiwei Liu, Radu Marculescu, Atlas W…
☆14 · Aug 4, 2023 · Updated 2 years ago
luuyin / OWL
Official PyTorch Implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity"
☆82 · Jul 7, 2025 · Updated last year
boone891214 / MEST
[NeurIPS 2021] "MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge", Geng Yuan, Xiaolong Ma, Yanzhi Wang et al…
☆18 · Mar 16, 2022 · Updated 4 years ago
VITA-Group / Random_Pruning
[ICLR 2022] The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training by Shiwei Liu, Tianlo…
☆78 · Jan 9, 2023 · Updated 3 years ago
Shiweiliuiiiiiii / In-Time-Over-Parameterization
[ICML 2021] "Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training" by Shiwei Liu, Lu Yin, De…
☆46 · Nov 11, 2023 · Updated 2 years ago
sparkle-reasoning / sparkle
[NeurIPS'25] Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning
☆16 · Dec 12, 2025 · Updated 7 months ago
princeton-pli / LongProc
LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation
☆36 · Feb 26, 2026 · Updated 4 months ago
chenghands-on / Dreamer_assemble
An assembly of various world models, including Dreamer v2 and v3
☆10 · Sep 9, 2023 · Updated 2 years ago
netrapathak / FaceNet_nn1
NN1 network from FaceNet: A Unified Embedding for Face Recognition and Clustering, in Keras.
☆11 · Jun 13, 2017 · Updated 9 years ago
fastconvnets / cvpr2020
Code for the "Fast Sparse ConvNets" CVPR 2020 submission
☆12 · Nov 20, 2019 · Updated 6 years ago
VITA-Group / Random-MoE-as-Dropout
[ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…
☆56 · Feb 28, 2023 · Updated 3 years ago
MadsToftrup / Apollo-dev
☆17 · Dec 9, 2024 · Updated last year
VITA-Group / LoCoCo
[ICML 2024] "LoCoCo: Dropping In Convolutions for Long Context Compression", Ruisi Cai, Yuandong Tian, Zhangyang Wang, Beidi Chen
☆17 · Sep 7, 2024 · Updated last year
YuejiangLIU / csl
Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts
☆15 · Feb 26, 2024 · Updated 2 years ago
sunblaze-ucb / rl-grok-recipe
Code repository for "RL Grokking Recipe: How RL Unlocks and Transfers New Algorithms in LLMs"
☆35 · Oct 12, 2025 · Updated 9 months ago
UNITES-Lab / MoE-Quantization
Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark"
☆31 · Jun 30, 2025 · Updated last year
prabdeb / openai-iot-speech-chatbot
OpenAI GPT model to build your personal assistant in IoT devices. Just like Alexa, Google Assistant, Siri, etc. but with your own skills,…
☆12 · Aug 7, 2023 · Updated 2 years ago
HuangOwen / RoLoRA
[EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
☆41 · Sep 24, 2024 · Updated last year
hoonyyhoon / Synflow_SNIP_GraSP
Comparison of "pruning at initialization prior to training" methods (Synflow/SNIP/GraSP) in PyTorch
☆18 · May 12, 2024 · Updated 2 years ago
fannie1208 / GLIND
[ICML 2024] Learning Divergence Fields for Shift-Robust Graph Representations
☆11 · Aug 15, 2024 · Updated last year
VITA-Group / READ-ME
[NeurIPS 2024] "Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design", Ruisi Cai, Yeonju Ro, Geon-Woo …
☆16 · Dec 16, 2024 · Updated last year
GridGain-Demos / imc-essentials-in-90-minutes
O'Reilly Course, In-Memory Computing Essentials
☆10 · Oct 16, 2020 · Updated 5 years ago
peijunallin / alphalora
☆19 · Nov 10, 2024 · Updated last year
Andron00e / SparseCBM
Official implementation for "Sparse Concept Bottleneck Models: Gumbel Tricks in Contrastive Learning"
☆12 · Jun 20, 2025 · Updated last year
zhangsichengsjtu / AFPQ
AFPQ code implementation
☆23 · Nov 6, 2023 · Updated 2 years ago
zukakosan / tripletNet
Image retrieval with triplet loss
☆17 · Jun 20, 2018 · Updated 8 years ago
VITA-Group / Random-Shuffling-BackdoorDetect
[NeurIPS 2022] "Randomized Channel Shuffling: Minimal-Overhead Backdoor Attack Detection without Clean Datasets" by Ruisi Cai*, Zhenyu Zh…
☆21 · Oct 1, 2022 · Updated 3 years ago
swaggy-TN / EfficientVLM
EfficientVLM: Fast and Accurate Vision-Language Models via Knowledge Distillation and Modal-adaptive Pruning (ACL 2023)
☆33 · Jul 18, 2023 · Updated 3 years ago
IST-DASLab / ACDC
Code for reproducing "AC/DC: Alternating Compressed/DeCompressed Training of Deep Neural Networks" (NeurIPS 2021)
☆23 · Nov 9, 2021 · Updated 4 years ago
zahraatashgahi / QuickSelection
[Machine Learning Journal (ECML-PKDD 2022 journal track)] Quick and Robust Feature Selection: the Strength of Energy-efficient Sparse Tra…
☆18 · Oct 2, 2023 · Updated 2 years ago
dmis-lab / Outlier-Safe-Pre-Training
[ACL 2025] Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models
☆39 · Nov 4, 2025 · Updated 8 months ago