r-three / realistic_evaluation_of_model_merging_for_compositional_generalizationLinks
☆12Updated 7 months ago
Alternatives and similar repositories for realistic_evaluation_of_model_merging_for_compositional_generalization
Users that are interested in realistic_evaluation_of_model_merging_for_compositional_generalization are comparing it to the libraries listed below
Sorting:
- PyTorch codes for the paper "An Empirical Study of Multimodal Model Merging"☆37Updated last year
- Code for T-MARS data filtering☆35Updated last year
- ☆19Updated 10 months ago
- Latest Weight Averaging (NeurIPS HITY 2022)☆30Updated last year
- Codebase for Context-aware Meta-learned Loss Scaling (CaMeLS). https://arxiv.org/abs/2305.15076.☆25Updated last year
- ☆85Updated last year
- Offcial Repo of Paper "Eliminating Position Bias of Language Models: A Mechanistic Approach""☆14Updated 9 months ago
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So…☆16Updated last month
- Recycling diverse models☆44Updated 2 years ago
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"☆27Updated last year
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns …☆16Updated last year
- ☆29Updated last year
- Exploration of automated dataset selection approaches at large scales.☆41Updated 3 months ago
- ☆25Updated 3 months ago
- ☆14Updated last year
- Data Valuation on In-Context Examples (ACL23)☆23Updated 4 months ago
- ☆45Updated last year
- Code for the paper "Data Feedback Loops: Model-driven Amplification of Dataset Biases"☆16Updated 2 years ago
- ☆49Updated last year
- This is the repository for "Model Merging by Uncertainty-Based Gradient Matching", ICLR 2024.☆27Updated last year
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…☆51Updated 2 years ago
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs☆54Updated last year
- The official implementation for Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free☆40Updated 3 weeks ago
- ☆31Updated last year
- Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning [ICML 2024]☆17Updated last year
- ☆32Updated 4 months ago
- ☆28Updated 3 months ago
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025)☆27Updated 3 months ago
- A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643☆76Updated last year
- Official code repo for paper "Great Memory, Shallow Reasoning: Limits of kNN-LMs"☆23Updated last month