nverma1 / merging-text-transformers
Code for "Merging Text Transformers from Different Initializations"
☆20Feb 2, 2025Updated last year
Alternatives and similar repositories for merging-text-transformers
Users that are interested in merging-text-transformers are comparing it to the libraries listed below
- ☆17Mar 30, 2024Updated last year
- Official code repository for the WACV 2022 paper "Visualizing Paired Image Similarity in Transformer Networks"☆22Apr 13, 2022Updated 3 years ago
- Official code for the paper "PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized Embedding Models".☆15Dec 8, 2022Updated 3 years ago
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning☆30Jan 25, 2023Updated 3 years ago
- ☆24Jun 7, 2021Updated 4 years ago
- This repository is the implementation of the paper Training Free Pretrained Model Merging (CVPR2024).☆32Mar 5, 2024Updated last year
- LLM-Merging: Building LLMs Efficiently through Merging☆209Sep 24, 2024Updated last year
- ☆77Apr 29, 2024Updated last year
- A framework for merging models solving different tasks with different initializations into one multi-task model without any additional tr…☆309Jan 18, 2024Updated 2 years ago
- ☆32Sep 27, 2021Updated 4 years ago
- A Deepfake detector based on hybrid EfficientNet CNN and Vision Transformer architecture. The model is explainable by rendering a heatma…☆15Mar 16, 2022Updated 3 years ago
- Codes and files for the paper Are Emergent Abilities in Large Language Models just In-Context Learning☆33Jan 9, 2025Updated last year
- [ACL'20] Highway Transformer: A Gated Transformer.☆33Dec 5, 2021Updated 4 years ago
- Embedding Recycling for Language models☆38Jul 11, 2023Updated 2 years ago
- This is the official GDSC repo with all of the source code presented in the video tutorials☆14Jun 27, 2023Updated 2 years ago
- TransientViT: A novel CNN - Vision Transformer hybrid real/bogus transient classifier for the Kilodegree Automatic Transient Survey☆10Nov 7, 2024Updated last year
- This repository is a reimplementation of the paper (BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model: htt…☆11Nov 14, 2019Updated 6 years ago
- Official code for AL-PINNS: Augmented Lagrangian relaxation method for Physics-Informed Neural Networks☆12Jul 29, 2023Updated 2 years ago
- Linear Attention for Efficient Bidirectional Sequence Modeling☆15May 13, 2025Updated 9 months ago
- HyFormer: Hybrid Transformer and CNN For Pixel-level Multispectral Image Classification☆15Feb 15, 2023Updated 2 years ago
- Code for the experiments in the ACL 2020 paper "Estimating predictive uncertainty for rumour verification models"☆11May 15, 2020Updated 5 years ago
- ☆48Jan 21, 2024Updated 2 years ago
- This repository contains an attempt at using Graph Attention based Reinforcement Learning for graphical state space. The code also provid…☆10Jun 27, 2021Updated 4 years ago
- ROS wrapper of Nvidia Contact-graspnet model.☆17Jul 3, 2023Updated 2 years ago
- A Transformer-based Prediction Method for Depth of Anesthesia During Target-controlled Infusion of Propofol and Remifentanil.☆15Feb 17, 2025Updated 11 months ago
- ☆16Feb 5, 2026Updated last week
- Official implementation of "Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent".☆21May 23, 2025Updated 8 months ago
- ☆16Jul 29, 2025Updated 6 months ago
- Code for GFlowNet-DPO (Direct Preference Optimization) EMNLP 2024 Main☆18Nov 27, 2025Updated 2 months ago
- Effective Attention Sheds Light On Interpretability - Findings of ACL2021☆11May 16, 2021Updated 4 years ago
- ☆13Nov 28, 2025Updated 2 months ago
- Label shift estimation for transfer difficulty with Familiarity.☆10Feb 4, 2025Updated last year
- [NeurIPS 2025] Official implementation of the paper "BecomingLit: Relightable Gaussian Avatars with Hybrid Neural Shading"☆26Nov 27, 2025Updated 2 months ago
- Official code for the paper: Invertible Neural Network for Graph Prediction☆10Mar 27, 2023Updated 2 years ago
- Digital humanities centered on graph technologies☆10Updated this week
- Poetry Corpora Annotated on Aesthetic Emotions☆12Aug 2, 2022Updated 3 years ago
- ☆13Apr 3, 2024Updated last year
- 0-Shot Tokenizer Transplant☆14May 16, 2025Updated 8 months ago
- [AAAI-25 Oral] Adaptive Calibration☆14Jul 6, 2025Updated 7 months ago