[ICLR 2023] "Learning to Grow Pretrained Models for Efficient Transformer Training" by Peihao Wang, Rameswar Panda, Lucas Torroba Hennigen, Philip Greengard, Leonid Karlinsky, Rogerio Feris, David Cox, Zhangyang Wang, Yoon Kim
☆92 · Feb 26, 2024 · Updated 2 years ago
Alternatives and similar repositories for LiGO
Users who are interested in LiGO are comparing it to the libraries listed below.
- Code for the paper "No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations" ☆11 · Oct 31, 2024 · Updated last year
- This is the unofficial implementation of LEMON (ICLR'2024). ☆13 · Apr 14, 2024 · Updated 2 years ago
- Code for the paper Deformable Butterfly: A Highly Structured and Sparse Linear Transform. ☆16 · Nov 1, 2021 · Updated 4 years ago
- Masked Structural Growth for 2x Faster Language Model Pre-training ☆25 · Apr 28, 2024 · Updated 2 years ago
- Staged Training for Transformer Language Models ☆33 · Mar 31, 2022 · Updated 4 years ago
- Code for "Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?" [ICML 2023] ☆39 · Aug 27, 2024 · Updated last year
- [ICML 2021] "Efficient Lottery Ticket Finding: Less Data is More" by Zhenyu Zhang*, Xuxi Chen*, Tianlong Chen*, Zhangyang Wang ☆26 · Dec 30, 2021 · Updated 4 years ago
- ☆15 · May 28, 2024 · Updated 2 years ago
- 🧮 Algebraic Positional Encodings. ☆21 · Jun 5, 2026 · Updated 3 weeks ago
- ☆10 · Dec 18, 2023 · Updated 2 years ago
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification ☆11 · Aug 12, 2023 · Updated 2 years ago
- Efficient and Online Dataset Growth Algorithm (with cleanness and diversity awareness) to deal with growing web data ☆20 · Aug 6, 2024 · Updated last year
- Multi-modal Multi-label Emotion Recognition with Heterogeneous Hierarchical Message Passing ☆18 · Sep 24, 2022 · Updated 3 years ago
- [ICCV2025] "Di[M]O: Distilling Masked Diffusion Models into One-step Generator", Yuanzhi Zhu, Xi Wang, Stéphane Lathuilière, Vicky Kal… ☆38 · Aug 14, 2025 · Updated 10 months ago
- Repository for CPU Kernel Generation for LLM Inference ☆28 · Jul 13, 2023 · Updated 2 years ago
- Code for "Continual Learning of Object Instances", Implemented in PyTorch, https://arxiv.org/abs/2004.10862 ☆11 · Jun 12, 2020 · Updated 6 years ago
- A library for squeakily cleaning and filtering language datasets. ☆50 · Jul 10, 2023 · Updated 2 years ago
- [NeurIPS'22] Trap and Replace: Defending Backdoor Attacks by Trapping Them into an Easy-to-Replace Subnetwork. Haotao Wang, Junyuan Hong,… ☆14 · Nov 27, 2023 · Updated 2 years ago
- SCT: An Efficient Self-Supervised Cross-View Training For Sentence Embedding (TACL) ☆16 · Jul 27, 2024 · Updated last year
- “SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity” by Peihao Wang, Zhiwen Fan, Dejia Xu, Dilin Wang,… ☆35 · Jan 5, 2024 · Updated 2 years ago
- Codebase for Hyperdecoders https://arxiv.org/abs/2203.08304 ☆14 · Oct 11, 2022 · Updated 3 years ago
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main) ☆117 · Mar 20, 2025 · Updated last year
- Source code repo for paper "TLDR: Token Loss Dynamic Reweighting for Reducing Repetitive Utterance Generation" ☆10 · Aug 11, 2023 · Updated 2 years ago
- Locality-Aware Hyperspectral Classification ☆18 · Jan 10, 2025 · Updated last year
- ☆13 · May 25, 2022 · Updated 4 years ago
- Here we will test various linear attention designs. ☆62 · Apr 25, 2024 · Updated 2 years ago
- ☆56 · Jul 30, 2024 · Updated last year
- [ICML'2024] "LoCoCo: Dropping In Convolutions for Long Context Compression", Ruisi Cai, Yuandong Tian, Zhangyang Wang, Beidi Chen ☆17 · Sep 7, 2024 · Updated last year
- Code and Model for NeurIPS 2024 Spotlight Paper "Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training… ☆44 · Oct 16, 2024 · Updated last year
- [NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baich… ☆1,131 · Oct 7, 2024 · Updated last year
- Official implementation for "RecursiveDet: End-to-End Region-based Recursive Object Detection" (ICCV 2023) ☆18 · Jan 25, 2024 · Updated 2 years ago
- ☆17 · Jul 11, 2023 · Updated 2 years ago
- [NeurIPS'21] "AugMax: Adversarial Composition of Random Augmentations for Robust Training" by Haotao Wang, Chaowei Xiao, Jean Kossaifi, Z… ☆125 · Dec 29, 2021 · Updated 4 years ago
- PyTorch implementation of various token mixers; Attention Mechanisms, MLP, etc. for understanding computer vision papers and other tas… ☆17 · Mar 11, 2026 · Updated 3 months ago
- Running inference on the ZeroSCROLLS benchmark ☆22 · Apr 18, 2024 · Updated 2 years ago
- Code for "Small Models are Valuable Plug-ins for Large Language Models" ☆132 · May 16, 2023 · Updated 3 years ago
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆16 · Apr 21, 2025 · Updated last year
- [CVPR 2024] "Taming Mode Collapse in Score Distillation for Text-to-3D Generation" by Peihao Wang, Dejia Xu, Zhiwen Fan, Dilin Wang, Srey… ☆51 · Feb 2, 2024 · Updated 2 years ago
- SpyGame: An interactive multi-agent framework to evaluate intelligence with large language models :D ☆15 · Nov 9, 2023 · Updated 2 years ago