[ICLR 2023] "Learning to Grow Pretrained Models for Efficient Transformer Training" by Peihao Wang, Rameswar Panda, Lucas Torroba Hennigen, Philip Greengard, Leonid Karlinsky, Rogerio Feris, David Cox, Zhangyang Wang, Yoon Kim
☆92Feb 26, 2024Updated 2 years ago
Alternatives and similar repositories for LiGO
Users that are interested in LiGO are comparing it to the libraries listed below.
- This is the unofficial implementation of LEMON (ICLR'2024).☆12Apr 14, 2024Updated last year
- [NeurIPS 2022] "Randomized Channel Shuffling: Minimal-Overhead Backdoor Attack Detection without Clean Datasets" by Ruisi Cai*, Zhenyu Zh…☆21Oct 1, 2022Updated 3 years ago
- [ICML 2021] "Efficient Lottery Ticket Finding: Less Data is More" by Zhenyu Zhang*, Xuxi Chen*, Tianlong Chen*, Zhangyang Wang☆26Dec 30, 2021Updated 4 years ago
- zTT: Learning-based DVFS with Zero Thermal Throttling for Mobile Devices [MobiSys'21] - Artifact Evaluation☆29May 10, 2021Updated 4 years ago
- ☆15May 28, 2024Updated last year
- 🧮 Algebraic Positional Encodings.☆20Aug 20, 2025Updated 7 months ago
- ☆10Dec 18, 2023Updated 2 years ago
- Efficient and Online Dataset Growth Algorithm (with cleanness and diversity awareness) to deal with growing web data☆21Aug 6, 2024Updated last year
- ☆24Jan 24, 2023Updated 3 years ago
- ☆12May 23, 2022Updated 3 years ago
- Repository for CPU Kernel Generation for LLM Inference☆28Jul 13, 2023Updated 2 years ago
- [NeurIPS'22] Trap and Replace: Defending Backdoor Attacks by Trapping Them into an Easy-to-Replace Subnetwork. Haotao Wang, Junyuan Hong,…☆15Nov 27, 2023Updated 2 years ago
- SCT: An Efficient Self-Supervised Cross-View Training For Sentence Embedding (TACL)☆16Jul 27, 2024Updated last year
- The code of CIKM 2023 (Oral Presentation) : A Multi-Task Semantic Decomposition Framework with Task-specific Pre-training for Few-Shot NE…☆14Jul 19, 2024Updated last year
- Codebase for Hyperdecoders https://arxiv.org/abs/2203.08304☆14Oct 11, 2022Updated 3 years ago
- This is the official implementation of the ICML 2023 paper - Can Forward Gradient Match Backpropagation ?☆13May 31, 2023Updated 2 years ago
- Code for paper "Patch-Level Training for Large Language Models"☆97Nov 10, 2025Updated 5 months ago
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main)☆116Mar 20, 2025Updated last year
- Model Zoos published at the NeurIPS 2022 Dataset & Benchmark track: "Model Zoos: A Dataset of Diverse Populations of Neural Network Model…☆59Oct 2, 2025Updated 6 months ago
- Official code for the paper "Attention as a Hypernetwork"☆55Feb 24, 2026Updated last month
- The official implementation of "2024NeurIPS Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation"☆54Dec 30, 2024Updated last year
- Source code repo for paper "TLDR: Token Loss Dynamic Reweighting for Reducing Repetitive Utterance Generation"☆10Aug 11, 2023Updated 2 years ago
- ☆13May 25, 2022Updated 3 years ago
- Here we will test various linear attention designs.☆62Apr 25, 2024Updated last year
- ☆56Jul 30, 2024Updated last year
- [ICCV 2023 Oral] Official PyTorch implementation of our paper for semi-supervised continual learning "A soft nearest-neighbor framework f…☆25Dec 17, 2024Updated last year
- [ICML'2024] "LoCoCo: Dropping In Convolutions for Long Context Compression", Ruisi Cai, Yuandong Tian, Zhangyang Wang, Beidi Chen☆17Sep 7, 2024Updated last year
- A pytorch version of hamiltonian monte carlo☆15Jun 26, 2019Updated 6 years ago
- Running inference on the ZeroSCROLLS benchmark☆22Apr 18, 2024Updated last year
- Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision☆11Jul 22, 2024Updated last year
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So…☆16Apr 21, 2025Updated 11 months ago
- [CVPR 2024] "Taming Mode Collapse in Score Distillation for Text-to-3D Generation" by Peihao Wang, Dejia Xu, Zhiwen Fan, Dilin Wang, Srey…☆51Feb 2, 2024Updated 2 years ago
- Code for "Small Models are Valuable Plug-ins for Large Language Models"☆132May 16, 2023Updated 2 years ago
- RWKV-X is a Linear Complexity Hybrid Language Model based on the RWKV architecture, integrating Sparse Attention to improve the model's l…☆56Mar 31, 2026Updated last week
- Experiments on the impact of depth in transformers and SSMs.☆41Oct 23, 2025Updated 5 months ago
- data collator for UL2 and U-PaLM☆29Aug 20, 2023Updated 2 years ago
- SpyGame: An interactive multi-agent framework to evaluate intelligence with large language models :D☆15Nov 9, 2023Updated 2 years ago
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators☆26Jul 26, 2023Updated 2 years ago
- Code for paper Document-Level Paraphrase Generation with Sentence Rewriting and Reordering by Zhe Lin, Yitao Cai and Xiaojun Wan. This pa…☆26Nov 10, 2021Updated 4 years ago