waltonfuture / Diff-eRank
Code for https://arxiv.org/abs/2401.17139 (NeurIPS 2024)
★22 · Updated this week
Related projects
Alternatives and complementary repositories for Diff-eRank
- [NeurIPS-2024] Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ★67 · Updated last month
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation" ★30 · Updated last month
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)" ★34 · Updated 7 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ★32 · Updated 10 months ago
- [ICLR 2024] This is the repository for the paper titled "DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning" ★95 · Updated 7 months ago
- ★34 · Updated 10 months ago
- This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models" ★38 · Updated 3 months ago
- Code for ACL24 "MELoRA: Mini-Ensemble Low-Rank Adapter for Parameter-Efficient Fine-Tuning" ★14 · Updated 5 months ago
- ★15 · Updated 4 months ago
- PyTorch implementation of StableMask (ICML'24) ★12 · Updated 4 months ago
- [ATTRIB @ NeurIPS 2024] When Attention Sink Emerges in Language Models: An Empirical View ★27 · Updated 3 weeks ago
- [NeurIPS2024] Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging ★30 · Updated 3 weeks ago
- This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models" ★43 · Updated last week
- A Survey on the Honesty of Large Language Models ★44 · Updated last month
- ★31 · Updated last year
- Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs ★22 · Updated last month
- Official repository for MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models ★47 · Updated 2 months ago
- ★17 · Updated 4 months ago
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) ★45 · Updated 7 months ago
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers ★68 · Updated 3 weeks ago
- The source code of the EMNLP 2023 main conference paper: Sparse Low-rank Adaptation of Pre-trained Language Models. ★69 · Updated 8 months ago
- ★27 · Updated last year
- ★15 · Updated 3 months ago
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal… ★44 · Updated last year
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$ ★29 · Updated 2 weeks ago
- [AAAI 2024] MELO: Enhancing Model Editing with Neuron-indexed Dynamic LoRA ★21 · Updated 7 months ago
- ★37 · Updated 5 months ago
- An Easy-to-use Hallucination Detection Framework for LLMs. ★48 · Updated 6 months ago
- This is the implementation of LeCo ★27 · Updated 3 months ago