[ICLR 2023] "Learning to Grow Pretrained Models for Efficient Transformer Training" by Peihao Wang, Rameswar Panda, Lucas Torroba Hennigen, Philip Greengard, Leonid Karlinsky, Rogerio Feris, David Cox, Zhangyang Wang, Yoon Kim
☆92 Feb 26, 2024 Updated 2 years ago
Alternatives and similar repositories for LiGO
Users that are interested in LiGO are comparing it to the libraries listed below
- This is the unofficial implementation of LEMON (ICLR'2024). ☆12 Apr 14, 2024 Updated last year
- Codes of the paper Deformable Butterfly: A Highly Structured and Sparse Linear Transform. ☆16 Nov 1, 2021 Updated 4 years ago
- [NeurIPS 2022] "Randomized Channel Shuffling: Minimal-Overhead Backdoor Attack Detection without Clean Datasets" by Ruisi Cai*, Zhenyu Zh… ☆21 Oct 1, 2022 Updated 3 years ago
- Code for the paper "No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations" ☆12 Oct 31, 2024 Updated last year
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification ☆11 Aug 12, 2023 Updated 2 years ago
- Staged Training for Transformer Language Models ☆33 Mar 31, 2022 Updated 3 years ago
- ☆14 May 25, 2022 Updated 3 years ago
- [NeurIPS'22] Trap and Replace: Defending Backdoor Attacks by Trapping Them into an Easy-to-Replace Subnetwork. Haotao Wang, Junyuan Hong,… ☆15 Nov 27, 2023 Updated 2 years ago
- This is the official implementation of the ICML 2023 paper - Can Forward Gradient Match Backpropagation ? ☆13 May 31, 2023 Updated 2 years ago
- SCT: An Efficient Self-Supervised Cross-View Training For Sentence Embedding (TACL) ☆16 Jul 27, 2024 Updated last year
- DeepSearchRelevanceRanking-KDD2022-Tutorial ☆15 Aug 23, 2022 Updated 3 years ago
- A family of efficient edge language models in 100M~1B sizes. ☆19 Feb 14, 2025 Updated last year
- A library for squeakily cleaning and filtering language datasets. ☆50 Jul 10, 2023 Updated 2 years ago
- Flexible simulator for mixed precision and format simulation of LLMs and vision transformers. ☆52 Jul 10, 2023 Updated 2 years ago
- Efficient and Online Dataset Growth Algorithm (with cleanness and diversity awareness) to deal with growing web data ☆21 Aug 6, 2024 Updated last year
- Example of containerized deployment of Dash app to Google Cloud Run, https://docker-dash-example.com/ ☆22 May 26, 2023 Updated 2 years ago
- Multi-modal Multi-label Emotion Recognition with Heterogeneous Hierarchical Message Passing ☆18 Sep 24, 2022 Updated 3 years ago
- [ICCV 2023] Label-Efficient Online Continual Object Detection in Streaming Video ☆23 Jan 8, 2024 Updated 2 years ago
- [ICCV2025] "Di[M]O: Distilling Masked Diffusion Models into One-step Generator", Yuanzhi Zhu, Xi Wang, Stéphane Lathuilière, Vicky Kal… ☆32 Aug 14, 2025 Updated 6 months ago
- Code Repository for the NeurIPS 2021 paper: "Self-Supervised Representation Learning on Neural Network Weights for Model Characteristic P… ☆22 Jul 10, 2024 Updated last year
- TART: A plug-and-play Transformer module for task-agnostic reasoning ☆202 Jun 22, 2023 Updated 2 years ago
- Hierarchical Sketch Induction for Paraphrase Generation (Hosking et al., ACL 2022) ☆51 Nov 8, 2023 Updated 2 years ago
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators ☆26 Jul 26, 2023 Updated 2 years ago
- Code for paper Document-Level Paraphrase Generation with Sentence Rewriting and Reordering by Zhe Lin, Yitao Cai and Xiaojun Wan. This pa… ☆26 Nov 10, 2021 Updated 4 years ago
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning ☆98 Apr 26, 2023 Updated 2 years ago
- Here we will test various linear attention designs. ☆62 Apr 25, 2024 Updated last year
- Which fellows cited my article? ☆24 Mar 6, 2022 Updated 4 years ago
- Finetune any model on HF in less than 30 seconds ☆56 Jan 31, 2026 Updated last month
- [ACM'MM 2020] "MM-Hand: 3D-Aware Multi-Modal Guided Hand Generative Network for 3D Hand Pose Synthesis" Zhenyu Wu, Duc Hoang, Shih-Yao Li… ☆25 Dec 8, 2022 Updated 3 years ago
- Implementation of ConGen: Unsupervised Control and Generalization Distillation For Sentence Representation (Findings of EMNLP 2022). ☆22 Sep 13, 2023 Updated 2 years ago
- This repository contains the data and code for the paper "Diverse Text Generation via Variational Encoder-Decoder Models with Gaussian Pr… ☆26 Jun 27, 2022 Updated 3 years ago
- Experiments on the impact of depth in transformers and SSMs. ☆41 Oct 23, 2025 Updated 4 months ago
- [Preprint 2022] “Can We Solve 3D Vision Tasks Starting from A 2D Vision Transformer?” by Yi Wang, Zhiwen Fan, Tianlong Chen, Hehe Fan, Zh… ☆63 Jan 18, 2023 Updated 3 years ago
- Improving Text Embedding of Language Models Using Contrastive Fine-tuning ☆64 Aug 2, 2024 Updated last year
- Repository for CPU Kernel Generation for LLM Inference ☆28 Jul 13, 2023 Updated 2 years ago
- Benchmark for federated noisy label learning ☆25 Aug 31, 2024 Updated last year
- [TMLR 2022] High-Modality Multimodal Transformer ☆117 Nov 2, 2024 Updated last year
- ☆71 Oct 16, 2024 Updated last year
- Tool for converting LLMs from uni-directional to bi-directional by removing causal mask for tasks like classification and sentence embedd… ☆64 Dec 12, 2024 Updated last year