[ICLR 2023] "Learning to Grow Pretrained Models for Efficient Transformer Training" by Peihao Wang, Rameswar Panda, Lucas Torroba Hennigen, Philip Greengard, Leonid Karlinsky, Rogerio Feris, David Cox, Zhangyang Wang, Yoon Kim
☆92Feb 26, 2024Updated 2 years ago
Alternatives and similar repositories for LiGO
Users that are interested in LiGO are comparing it to the libraries listed below.
- Code for the paper "No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations"☆12Oct 31, 2024Updated last year
- This is the proof-of-concept CPU implementation of ASPEN used for the NeurIPS'23 paper ASPEN: Breaking Operator Barriers for Efficient Pa…☆13Apr 4, 2024Updated 2 years ago
- This is the unofficial implementation of LEMON (ICLR'2024).☆13Apr 14, 2024Updated 2 years ago
- Codes of the paper Deformable Butterfly: A Highly Structured and Sparse Linear Transform.☆16Nov 1, 2021Updated 4 years ago
- [NeurIPS 2022] "Randomized Channel Shuffling: Minimal-Overhead Backdoor Attack Detection without Clean Datasets" by Ruisi Cai*, Zhenyu Zh…☆21Oct 1, 2022Updated 3 years ago
- Masked Structural Growth for 2x Faster Language Model Pre-training☆25Apr 28, 2024Updated 2 years ago
- Staged Training for Transformer Language Models☆33Mar 31, 2022Updated 4 years ago
- Code for "Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?" [ICML 2023]☆38Aug 27, 2024Updated last year
- [ICML 2021] "Efficient Lottery Ticket Finding: Less Data is More" by Zhenyu Zhang*, Xuxi Chen*, Tianlong Chen*, Zhangyang Wang☆26Dec 30, 2021Updated 4 years ago
- Code Repository for the NeurIPS 2021 paper: "Self-Supervised Representation Learning on Neural Network Weights for Model Characteristic P…☆22Jul 10, 2024Updated last year
- zTT: Learning-based DVFS with Zero Thermal Throttling for Mobile Devices [MobiSys'21] - Artifact Evaluation☆30May 10, 2021Updated 5 years ago
- ☆15May 28, 2024Updated last year
- 🧮 Algebraic Positional Encodings.☆20Aug 20, 2025Updated 9 months ago
- ☆10Dec 18, 2023Updated 2 years ago
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification☆11Aug 12, 2023Updated 2 years ago
- ☆23Jan 24, 2023Updated 3 years ago
- Code for "Fusion Label Enhancement for Multi-Label Learning" in IJCAI-ECAI 2022.☆10Apr 4, 2023Updated 3 years ago
- Repository for CPU Kernel Generation for LLM Inference☆28Jul 13, 2023Updated 2 years ago
- Code for "Continual Learning of Object Instances", Implemented in PyTorch, https://arxiv.org/abs/2004.10862☆11Jun 12, 2020Updated 5 years ago
- A library for squeakily cleaning and filtering language datasets.☆50Jul 10, 2023Updated 2 years ago
- ☆40Jun 16, 2023Updated 2 years ago
- SCT: An Efficient Self-Supervised Cross-View Training For Sentence Embedding (TACL)☆16Jul 27, 2024Updated last year
- The code of CIKM 2023 (Oral Presentation) : A Multi-Task Semantic Decomposition Framework with Task-specific Pre-training for Few-Shot NE…☆14Jul 19, 2024Updated last year
- This is the official implementation of the ICML 2023 paper - Can Forward Gradient Match Backpropagation ?☆13May 31, 2023Updated 2 years ago
- Codebase for Hyperdecoders https://arxiv.org/abs/2203.08304☆14Oct 11, 2022Updated 3 years ago
- Code for paper "Patch-Level Training for Large Language Models"☆105Nov 10, 2025Updated 6 months ago
- The official implementation of "2024NeurIPS Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation"☆54Dec 30, 2024Updated last year
- Official code for the paper "Attention as a Hypernetwork"☆57Feb 24, 2026Updated 2 months ago
- ☆13May 25, 2022Updated 3 years ago
- Here we will test various linear attention designs.☆62Apr 25, 2024Updated 2 years ago
- [ICML'2024] "LoCoCo: Dropping In Convolutions for Long Context Compression", Ruisi Cai, Yuandong Tian, Zhangyang Wang, Beidi Chen☆17Sep 7, 2024Updated last year
- [CVPR 2025] Official implementation of paper "MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders".☆52Jun 7, 2025Updated 11 months ago
- A pytorch version of hamiltonian monte carlo☆15Jun 26, 2019Updated 6 years ago
- [NeurIPS'21] "AugMax: Adversarial Composition of Random Augmentations for Robust Training" by Haotao Wang, Chaowei Xiao, Jean Kossaifi, Z…☆125Dec 29, 2021Updated 4 years ago
- ☆17Jul 11, 2023Updated 2 years ago
- Pytorch implementation of various token mixers; Attention Mechanisms, MLP, and etc for understanding computer vision papers and other tas…☆17Mar 11, 2026Updated 2 months ago
- Running inference on the ZeroSCROLLS benchmark☆22Apr 18, 2024Updated 2 years ago
- Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision☆11Jul 22, 2024Updated last year
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So…☆16Apr 21, 2025Updated last year