[ICLR 2023] "Learning to Grow Pretrained Models for Efficient Transformer Training" by Peihao Wang, Rameswar Panda, Lucas Torroba Hennigen, Philip Greengard, Leonid Karlinsky, Rogerio Feris, David Cox, Zhangyang Wang, Yoon Kim
☆92Feb 26, 2024Updated 2 years ago
Alternatives and similar repositories for LiGO
Users that are interested in LiGO are comparing it to the libraries listed below.
- Code for the paper "No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations"☆12Oct 31, 2024Updated last year
- This is the proof-of-concept CPU implementation of ASPEN used for the NeurIPS'23 paper ASPEN: Breaking Operator Barriers for Efficient Pa…☆13Apr 4, 2024Updated 2 years ago
- This is the unofficial implementation of LEMON (ICLR'2024).☆13Apr 14, 2024Updated 2 years ago
- Codes of the paper Deformable Butterfly: A Highly Structured and Sparse Linear Transform.☆16Nov 1, 2021Updated 4 years ago
- [NeurIPS 2022] "Randomized Channel Shuffling: Minimal-Overhead Backdoor Attack Detection without Clean Datasets" by Ruisi Cai*, Zhenyu Zh…☆21Oct 1, 2022Updated 3 years ago
- Masked Structural Growth for 2x Faster Language Model Pre-training☆25Apr 28, 2024Updated 2 years ago
- Staged Training for Transformer Language Models☆33Mar 31, 2022Updated 4 years ago
- Code for "Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?" [ICML 2023]☆38Aug 27, 2024Updated last year
- [ICML 2021] "Efficient Lottery Ticket Finding: Less Data is More" by Zhenyu Zhang*, Xuxi Chen*, Tianlong Chen*, Zhangyang Wang☆26Dec 30, 2021Updated 4 years ago
- zTT: Learning-based DVFS with Zero Thermal Throttling for Mobile Devices [MobiSys'21] - Artifact Evaluation☆30May 10, 2021Updated 4 years ago
- ☆15May 28, 2024Updated last year
- 🧮 Algebraic Positional Encodings.☆20Aug 20, 2025Updated 8 months ago
- ☆12May 23, 2022Updated 3 years ago
- Code for "Fusion Label Enhancement for Multi-Label Learning" in IJCAI-ECAI 2022.☆10Apr 4, 2023Updated 3 years ago
- A library for squeakily cleaning and filtering language datasets.☆50Jul 10, 2023Updated 2 years ago
- [NeurIPS'22] Trap and Replace: Defending Backdoor Attacks by Trapping Them into an Easy-to-Replace Subnetwork. Haotao Wang, Junyuan Hong,…☆14Nov 27, 2023Updated 2 years ago
- A PyTorch implementation of NPID based on CVPR 2018 paper "Unsupervised Feature Learning via Non-Parametric Instance Discrimination"☆14Feb 10, 2020Updated 6 years ago
- ☆40Jun 16, 2023Updated 2 years ago
- SCT: An Efficient Self-Supervised Cross-View Training For Sentence Embedding (TACL)☆16Jul 27, 2024Updated last year
- The code of CIKM 2023 (Oral Presentation) : A Multi-Task Semantic Decomposition Framework with Task-specific Pre-training for Few-Shot NE…☆14Jul 19, 2024Updated last year
- “SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity” by Peihao Wang, Zhiwen Fan, Dejia Xu, Dilin Wang,…☆35Jan 5, 2024Updated 2 years ago
- Tool for converting LLMs from uni-directional to bi-directional by removing causal mask for tasks like classification and sentence embedd…☆65Dec 12, 2024Updated last year
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main)☆116Mar 20, 2025Updated last year
- PyTorch implementation of a 9-layer ResNet for CIFAR-10.☆12May 8, 2024Updated last year
- The official implementation of the paper SimVP: Towards Simple yet Powerful Spatiotemporal Predictive learning.☆11Jan 2, 2024Updated 2 years ago
- Official code for the paper "Attention as a Hypernetwork"☆56Feb 24, 2026Updated 2 months ago
- Source code repo for paper "TLDR: Token Loss Dynamic Reweighting for Reducing Repetitive Utterance Generation"☆10Aug 11, 2023Updated 2 years ago
- PyTorch implementation of federated learning on MNIST☆24Feb 19, 2024Updated 2 years ago
- Locality-Aware Hyperspectral Classification☆18Jan 10, 2025Updated last year
- [ICCV 2023 Oral] Official PyTorch implementation of our paper for semi-supervised continual learning "A soft nearest-neighbor framework f…☆25Dec 17, 2024Updated last year
- [ICML‘2024] "LoCoCo: Dropping In Convolutions for Long Context Compression", Ruisi Cai, Yuandong Tian, Zhangyang Wang, Beidi Chen☆17Sep 7, 2024Updated last year
- A pytorch version of hamiltonian monte carlo☆15Jun 26, 2019Updated 6 years ago
- [NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baich…☆1,123Oct 7, 2024Updated last year
- ☆17Jul 11, 2023Updated 2 years ago
- Running inference on the ZeroSCROLLS benchmark☆22Apr 18, 2024Updated 2 years ago
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So…☆16Apr 21, 2025Updated last year
- Code for "Small Models are Valuable Plug-ins for Large Language Models"☆132May 16, 2023Updated 2 years ago
- RWKV-X is a Linear Complexity Hybrid Language Model based on the RWKV architecture, integrating Sparse Attention to improve the model's l…☆57Mar 31, 2026Updated last month
- Experiments on the impact of depth in transformers and SSMs.☆40Oct 23, 2025Updated 6 months ago