Code for "Merging Text Transformers from Different Initializations"
☆20Feb 2, 2025Updated last year
Alternatives and similar repositories for merging-text-transformers
Users who are interested in merging-text-transformers are comparing it to the libraries listed below.
- ☆17Mar 30, 2024Updated last year
- Official code repository for the WACV 2022 paper "Visualizing Paired Image Similarity in Transformer Networks"☆22Apr 13, 2022Updated 3 years ago
- Code for SaGe subword tokenizer (EACL 2023)☆27Nov 30, 2024Updated last year
- A curated list of Model Merging methods.☆95Dec 3, 2025Updated 3 months ago
- This repository is the implementation of the paper Training Free Pretrained Model Merging (CVPR2024).☆33Mar 5, 2024Updated 2 years ago
- LLM-Merging: Building LLMs Efficiently through Merging☆209Sep 24, 2024Updated last year
- ☆77Apr 29, 2024Updated last year
- A framework for merging models solving different tasks with different initializations into one multi-task model without any additional tr…☆312Jan 18, 2024Updated 2 years ago
- Codes and files for the paper Are Emergent Abilities in Large Language Models just In-Context Learning☆33Jan 9, 2025Updated last year
- A Deepfake detector based on a hybrid EfficientNet CNN and Vision Transformer architecture. The model is explainable by rendering a heatma…☆15Mar 16, 2022Updated 3 years ago
- [ACL'20] Highway Transformer: A Gated Transformer.☆33Dec 5, 2021Updated 4 years ago
- Code release for Dataless Knowledge Fusion by Merging Weights of Language Models (https://openreview.net/forum?id=FCnohuR6AnM)☆92Jul 25, 2023Updated 2 years ago
- Embedding Recycling for Language models☆38Jul 11, 2023Updated 2 years ago
- This is the official GDSC repo with all of the source code presented in the video tutorials☆14Jun 27, 2023Updated 2 years ago
- Official code for PLoP☆17Jun 30, 2025Updated 8 months ago
- ☆10Oct 2, 2024Updated last year
- HyFormer: Hybrid Transformer and CNN For Pixel-level Multispectral Image Classification☆16Feb 15, 2023Updated 3 years ago
- Linear Attention for Efficient Bidirectional Sequence Modeling☆15May 13, 2025Updated 9 months ago
- ☆16Feb 27, 2026Updated last week
- TransientViT: A novel CNN - Vision Transformer hybrid real/bogus transient classifier for the Kilodegree Automatic Transient Survey☆10Nov 7, 2024Updated last year
- Official code for AL-PINNS: Augmented Lagrangian relaxation method for Physics-Informed Neural Networks☆12Jul 29, 2023Updated 2 years ago
- Implementation of a simple linear regression algorithm in MAMBA☆10Feb 12, 2020Updated 6 years ago
- ☆48Jan 21, 2024Updated 2 years ago
- ☆13Apr 3, 2024Updated last year
- 0-Shot Tokenizer Transplant☆14May 16, 2025Updated 9 months ago
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.☆13Mar 30, 2024Updated last year
- Label shift estimation for transfer difficulty with Familiarity.☆10Feb 4, 2025Updated last year
- This repository contains an attempt at using Graph Attention based Reinforcement Learning for graphical state space. The code also provid…☆10Jun 27, 2021Updated 4 years ago
- Digital humanities centered on graph technologies☆10Feb 12, 2026Updated 3 weeks ago
- Poetry Corpora Annotated on Aesthetic Emotions☆12Aug 2, 2022Updated 3 years ago
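- Graduate course notes, covering combinatorics, advanced algorithm design and analysis, optimization theory and applications, and big data analysis and mining.☆15Dec 17, 2023Updated 2 years ago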
- Code for the paper "No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations"☆12Oct 31, 2024Updated last year
- Official implementation of "Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent".☆21May 23, 2025Updated 9 months ago
- Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Mo…☆15Nov 11, 2024Updated last year
- Code for "Inducer-tuning: Connecting Prefix-tuning and Adapter-tuning" (EMNLP 2022) and "Empowering Parameter-Efficient Transfer Learning…☆11Feb 6, 2023Updated 3 years ago
- ☆13Nov 28, 2025Updated 3 months ago
- decontamination☆26Dec 3, 2025Updated 3 months ago
- ☆16Jul 29, 2025Updated 7 months ago
- Word embeddings from PPMI-weighted and dirichlet-smoothed co-occurrence matrices☆10Aug 3, 2020Updated 5 years ago