nverma1 / merging-text-transformers
Code for "Merging Text Transformers from Different Initializations"
☆19 · Updated 3 months ago
Related projects
Alternatives and complementary repositories for merging-text-transformers
- Code for T-MARS data filtering ☆35 · Updated last year
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning ☆29 · Updated last year
- Efficient scaling laws and collaborative pretraining. ☆13 · Updated this week
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal… ☆44 · Updated last year
- We introduce EMMET and unify model editing with the popular algorithms ROME and MEMIT. ☆12 · Updated 2 months ago
- Code for "Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective" ☆30 · Updated 6 months ago
- ☆21 · Updated 8 months ago
- ☆22 · Updated 2 weeks ago
- ☆26 · Updated last year
- Repository for Skill Set Optimization ☆12 · Updated 3 months ago
- ☆18 · Updated 5 months ago
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆19 · Updated 2 months ago
- Few-shot Learning with Auxiliary Data