mit-han-lab / offsite-tuning
Offsite-Tuning: Transfer Learning without Full Model
☆368 · Updated 11 months ago
Related projects
Alternatives and complementary repositories for offsite-tuning
- Shepherd: A foundational framework enabling federated instruction tuning for large language models ☆204 · Updated last year
- [COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition ☆598 · Updated 4 months ago
- Official PyTorch implementation of QA-LoRA ☆117 · Updated 8 months ago
- Official repository of NEFTune: Noisy Embeddings Improve Instruction Finetuning ☆384 · Updated 6 months ago
- Collection of Tools and Papers related to Adapters / Parameter-Efficient Transfer Learning / Fine-Tuning ☆176 · Updated 6 months ago
- A simple and effective LLM pruning approach ☆672 · Updated 3 months ago
- AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning (ICLR 2023) ☆278 · Updated last year
- DSIR: large-scale data selection framework for language model training ☆231 · Updated 7 months ago
- Unofficial implementation of the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆135 · Updated 5 months ago
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities (arXiv:2408.07666) ☆220 · Updated this week
- Research Trends in LLM-guided Multimodal Learning ☆355 · Updated last year
- Official implementation of the paper "What Matters in Transformers? Not All Attention is Needed" ☆143 · Updated last week
- Must-read Papers on Parameter-Efficient Tuning (Delta Tuning) Methods for Pre-trained Models ☆277 · Updated last year
- [ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning ☆560 · Updated 8 months ago
- Code repo for the paper "LLM-QAT: Data-Free Quantization Aware Training for Large Language Models" ☆254 · Updated 2 months ago
- Scaling Data-Constrained Language Models ☆321 · Updated 2 months ago
- Code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks ☆530 · Updated 8 months ago
- Code accompanying the paper "Massive Activations in Large Language Models" ☆123 · Updated 8 months ago
- Editing Models with Task Arithmetic ☆430 · Updated 10 months ago
- An Extensible Continual Learning Framework Focused on Language Models (LMs) ☆253 · Updated 9 months ago
- OpenICL: an open-source framework to facilitate research, development, and prototyping of in-context learning ☆541 · Updated last year
- Official implementation of TransNormerLLM: A Faster and Better LLM ☆229 · Updated 10 months ago
- Rectified Rotary Position Embeddings ☆341 · Updated 6 months ago
- A curated list of Model Merging methods ☆83 · Updated 2 months ago
- Simple next-token-prediction for RLHF ☆220 · Updated last year
- [EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning ☆214 · Updated last year
- Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models ☆133 · Updated 2 years ago