Code for "Merging Text Transformers from Different Initializations"
☆20 · Feb 2, 2025 · Updated last year
Alternatives and similar repositories for merging-text-transformers
Users that are interested in merging-text-transformers are comparing it to the libraries listed below.
- Benchmark API for Multidomain Language Modeling ☆25 · Aug 26, 2022 · Updated 3 years ago
- A framework for merging models solving different tasks with different initializations into one multi-task model without any additional tr… ☆313 · Jan 18, 2024 · Updated 2 years ago
- LLM-Merging: Building LLMs Efficiently through Merging ☆209 · Sep 24, 2024 · Updated last year
- A curated list of Model Merging methods. ☆95 · Dec 3, 2025 · Updated 5 months ago
- This repository is the implementation of the paper Training Free Pretrained Model Merging (CVPR2024). ☆34 · Mar 5, 2024 · Updated 2 years ago
- ☆77 · Apr 29, 2024 · Updated 2 years ago
- Code release for Dataless Knowledge Fusion by Merging Weights of Language Models (https://openreview.net/forum?id=FCnohuR6AnM) ☆93 · Jul 25, 2023 · Updated 2 years ago
- Official code for PLoP ☆20 · Mar 6, 2026 · Updated 2 months ago
- Codes and files for the paper Are Emergent Abilities in Large Language Models just In-Context Learning ☆33 · Jan 9, 2025 · Updated last year
- Code for the paper "No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations" ☆12 · Oct 31, 2024 · Updated last year
- ☆216 · Feb 3, 2024 · Updated 2 years ago
- Package to align tokens from different tokenizations. ☆16 · Mar 25, 2024 · Updated 2 years ago
- Active Learning in the era of Foundation Models ☆13 · Apr 16, 2025 · Updated last year
- Embedding Recycling for Language models ☆38 · Jul 11, 2023 · Updated 2 years ago
- Official code repository for the WACV 2022 paper "Visualizing Paired Image Similarity in Transformer Networks" ☆22 · Apr 13, 2022 · Updated 4 years ago
- Manage ML configuration with pydantic ☆16 · Mar 18, 2026 · Updated last month
- Code for "Inducer-tuning: Connecting Prefix-tuning and Adapter-tuning" (EMNLP 2022) and "Empowering Parameter-Efficient Transfer Learning… ☆11 · Feb 6, 2023 · Updated 3 years ago
- Official implementation of COLosSAL [MICCAI 2023] ☆15 · Jul 22, 2023 · Updated 2 years ago
- ☆35 · Apr 14, 2025 · Updated last year
- Official Implementation (Pytorch) of the "Representation Shift: Unifying Token Compression with FlashAttention", ICCV 2025 ☆34 · Feb 22, 2026 · Updated 2 months ago
- ☆17 · Apr 11, 2024 · Updated 2 years ago
- Effective Attention Sheds Light On Interpretability - Findings of ACL2021 ☆11 · May 16, 2021 · Updated 4 years ago
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024] ☆53 · Dec 22, 2025 · Updated 4 months ago
- A repository for organizing our submission to the MEDIQA-Chat Tasks @ ACL-ClinicalNLP 2023 ☆22 · Jul 21, 2023 · Updated 2 years ago
- ☆13 · Apr 3, 2024 · Updated 2 years ago
- [CVPR 2025] LoRA Recycle: Unlocking Tuning-Free Few-Shot Adaptability in Visual Foundation Models by Recycling Pre-Tuned LoRAs ☆14 · Jun 20, 2025 · Updated 10 months ago
- [AAAI-25 Oral] Adaptive Calibration ☆15 · Jul 6, 2025 · Updated 10 months ago
- ☆10 · Apr 16, 2024 · Updated 2 years ago
- Code for GFlowNet-DPO (Direct Preference Optimization) EMNLP 2024 Main ☆19 · Feb 22, 2026 · Updated 2 months ago
- The implementation for FREE-Merging: Fourier Transform for Model Merging with Lightweight Experts (ICCV25) ☆15 · Jun 26, 2025 · Updated 10 months ago
- ☆23 · Jan 2, 2023 · Updated 3 years ago
- The official implementation of "Low-power, Continuous Remote Behavioral Localization with Event Cameras" (CVPR 2024) ☆12 · Sep 25, 2024 · Updated last year
- Providing the answer to "How to do patching on all available SAEs on GPT-2?". It is an official repository of the implementation of the p… ☆13 · Jan 26, 2025 · Updated last year
- JPEG encoding and decoding implemented from scratch (python JPEG codec) ☆10 · Jul 29, 2022 · Updated 3 years ago
- CoMM: Collaborative Multi-Agent, Multi-Reasoning-Path Prompting for Complex Problem Solving (NAACL 2024 Findings) ☆16 · Apr 26, 2024 · Updated 2 years ago
- ☆48 · Jan 21, 2024 · Updated 2 years ago
- ☆14 · Apr 27, 2022 · Updated 4 years ago
- An event based dataset loader under one common python API. ☆10 · Mar 22, 2022 · Updated 4 years ago
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆79 · Mar 1, 2025 · Updated last year