amazon-science / adaptive-feature-transfer
Official implementation of Adaptive Feature Transfer (AFT)
☆23 Updated last year
Alternatives and similar repositories for adaptive-feature-transfer
Users who are interested in adaptive-feature-transfer are comparing it to the libraries listed below.
- HGRN2: Gated Linear RNNs with State Expansion ☆55 Updated last year
- Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24] ☆60 Updated 11 months ago
- Official code for the paper "Attention as a Hypernetwork" ☆46 Updated last year
- This is a PyTorch implementation of the paper "ViP: A Differentially Private Foundation Model for Computer Vision" ☆36 Updated 2 years ago
- Official Code for ICLR 2024 Paper: Non-negative Contrastive Learning ☆46 Updated last year
- Unofficial Implementation of Selective Attention Transformer ☆17 Updated last year
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] ☆68 Updated last year
- Autoregressive Image Generation ☆31 Updated 5 months ago
- Official implementation of RMoE (Layerwise Recurrent Router for Mixture-of-Experts) ☆27 Updated last year
- Recycling diverse models ☆46 Updated 2 years ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆31 Updated 6 months ago
- Latest Weight Averaging (NeurIPS HITY 2022) ☆31 Updated 2 years ago
- MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248 ☆57 Updated last year
- More dimensions = More fun ☆26 Updated last year
- [NeurIPS'24] Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization ☆37 Updated last year
- Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation… ☆21 Updated 2 years ago
- An official PyTorch implementation for CLIPPR ☆29 Updated 2 years ago
- Implementation of Agent Attention in Pytorch ☆92 Updated last year
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns … ☆16 Updated 5 months ago
- Official repo of Progressive Data Expansion: data, code and evaluation ☆29 Updated 2 years ago
- Code for NOLA, an implementation of "nola: Compressing LoRA using Linear Combination of Random Basis" ☆57 Updated last year
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging" ☆31 Updated last year
- Code for T-MARS data filtering ☆35 Updated 2 years ago
- [NeurIPS 2024] VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections ☆21 Updated last year
- [ICLR 2024 Oral] Improving Convergence and Generalization Using Parameter Symmetries ☆30 Updated last year
- [ICLR 2025 & COLM 2025] Official PyTorch implementation of the Forgetting Transformer and Adaptive Computation Pruning ☆134 Updated 3 weeks ago
- ☆33 Updated 9 months ago
- This repository is associated with the research paper titled ImageChain: Advancing Sequential Image-to-Text Reasoning in Multimodal Large… ☆14 Updated 5 months ago
- ☆45 Updated last year
- Code for Principal Masked Autoencoders ☆30 Updated 8 months ago