LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
☆36Apr 4, 2024Updated last year
Alternatives and similar repositories for lisa
Users that are interested in lisa are comparing it to the libraries listed below
Sorting:
- ☆19Jan 3, 2025Updated last year
- ☆19Jun 21, 2025Updated 8 months ago
- [WACV 2025-Oral Presentation] Test-Time Adaptation in Point Clouds: Leveraging Sampling Variation with Weight Averaging☆12Mar 31, 2025Updated 11 months ago
- Code for paper "Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion"☆14Mar 28, 2024Updated last year
- ☆16Dec 7, 2025Updated 2 months ago
- The official code for Dropping Backward Propagation (DropBP)☆32Oct 29, 2024Updated last year
- This is the official implementation of the ICML 2023 paper - Can Forward Gradient Match Backpropagation ?☆13May 31, 2023Updated 2 years ago
- ☆19Dec 23, 2024Updated last year
- Official code for "Self-supervised learning with rotation-invariant kernels"☆13Jun 5, 2023Updated 2 years ago
- SLiM: One-shot Quantized Sparse Plus Low-rank Approximation of LLMs (ICML 2025)☆32Nov 28, 2025Updated 3 months ago
- An implementation of the penalty-based bilevel gradient descent (PBGD) algorithm and the iterative differentiation (ITD/RHG) methods.☆19Feb 13, 2023Updated 3 years ago
- SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining (NeurIPS 2024)☆39Nov 1, 2024Updated last year
- Second-Order Fine-Tuning without Pain for LLMs: a Hessian Informed Zeroth-Order Optimizer☆23Feb 11, 2025Updated last year
- A framework for evolving and testing question-answering datasets with various models.☆21Feb 28, 2024Updated 2 years ago
- ☆23Nov 26, 2024Updated last year
- [NeurIPS 2024] Low rank memory efficient optimizer without SVD☆33Jul 1, 2025Updated 8 months ago
- ThinK: Thinner Key Cache by Query-Driven Pruning☆27Feb 11, 2025Updated last year
- ☆26Nov 23, 2023Updated 2 years ago
- Code for the paper: Why Transformers Need Adam: A Hessian Perspective☆63Mar 11, 2025Updated 11 months ago
- This repository provides an improved LLamaGen Model, fine-tuned on 500,000 high-quality images, each accompanied by over 300 token prompt…☆30Oct 21, 2024Updated last year
- [ICRA 2024] WLST: Weak Labels Guided Self-training for Weakly-supervised Domain Adaptation on 3D Object Detection☆12Feb 6, 2024Updated 2 years ago
- [ICLR 2025] Official PyTorch Implementation for CPE: Concept Pinpoint Eraser for Text-to-image Diffusion Models via Residual Attention Ga…☆12Apr 7, 2025Updated 10 months ago
- Pytorch implementation of our UniQ method, IEEE Access -- Training Multi-bit Quantized and Binarized Networks with A Learnable Symmetric …☆11Apr 7, 2021Updated 4 years ago
- ☆33Dec 17, 2025Updated 2 months ago
- ☆40Nov 22, 2025Updated 3 months ago
- Tensorflow implementation of ICLR2019 paper "Exemplar Guided Unsupervised Image-to-Image Translation with Semantic Consistency"☆28Jul 4, 2020Updated 5 years ago
- ☆35May 24, 2024Updated last year
- PyTorch code for Uncertainty-guided Source-free Domain Adaptation☆35Jul 23, 2022Updated 3 years ago
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models"☆10Jul 19, 2024Updated last year
- Implementation of the CVPR2025 paper LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty.☆17Sep 10, 2025Updated 5 months ago
- Official Repo For AAAI 2026 Accepted Paper "Rethinking the Spatio-Temporal Alignment of End-to-End 3D Perception"☆29Jan 13, 2026Updated last month
- Rad-cGAN v1.0: Radar-based precipitation nowcasting model with conditional Generative Adversarial Networks for multiple dam domains☆11Jul 22, 2022Updated 3 years ago
- [CoLM'25] The official implementation of the paper <MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression>☆155Jan 14, 2026Updated last month
- [EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization☆38Sep 24, 2024Updated last year
- ☆39Aug 27, 2024Updated last year
- ☆43Jul 22, 2024Updated last year
- [ICLR 2026] Thinking on the Fly: Test-Time Reasoning Enhancement via Latent Thought Policy Optimization☆18Feb 14, 2026Updated 2 weeks ago
- Home of the Martini 3 Sterol Parameters☆13Oct 10, 2023Updated 2 years ago
- Tensorflow Custom Callbacks in Custom Training Loop☆10Apr 8, 2021Updated 4 years ago