Bumble666 / Hyper_MoE

☆33

Alternatives and similar repositories for Hyper_MoE

Users who are interested in Hyper_MoE are comparing it to the repositories listed below.

GCYZSL / MoLA
☆161 Updated last year
yushuiwx / Mixture-of-LoRA-Experts
☆54 Updated 10 months ago
ChasonShi / MELoRA
code for ACL24 "MELoRA: Mini-Ensemble Low-Rank Adapter for Parameter-Efficient Fine-Tuning"
☆32 Updated 8 months ago
gyhdog99 / MoCLE
MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379)
☆44 Updated 3 months ago
circle-hit / SAPT
Code for ACL 2024 accepted paper titled "SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language …
☆36 Updated 9 months ago
hkust-nlp / PEM_composition
[NeurIPS 2023] Github repository for "Composing Parameter-Efficient Modules with Arithmetic Operations"
☆61 Updated last year
TsinghuaC3I / SoRA
[EMNLP 2023, Main Conference] Sparse Low-rank Adaptation of Pre-trained Language Models
☆83 Updated last year
luka-group / mDPO
[EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.
☆82 Updated 11 months ago
liuqidong07 / MOELoRA-peft
[SIGIR'24] The official implementation code of MOELoRA.
☆184 Updated last year
LightChen233 / M3CoT
☆84 Updated last year
RenShuhuai-Andy / my-tools
my commonly-used tools
☆62 Updated 9 months ago
zzz47zzz / spurious-forgetting
[ICLR 2025] Released code for paper "Spurious Forgetting in Continual Learning of Language Models"
☆54 Updated 5 months ago
xiaomi-research / colar
[NeurIPS 2025] Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains
☆57 Updated 2 months ago
Vance0124 / Token-level-Direct-Preference-Optimization
Reference implementation for Token-level Direct Preference Optimization (TDPO)
☆148 Updated 8 months ago
Dereck0602 / Awesome_Test_Time_LLMs
☆129 Updated 7 months ago
xuyige / SoftCoT
ACL'2025: SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs. and preprint: SoftCoT++: Test-Time Scaling with Soft Chain-of…
☆56 Updated 4 months ago
BeyonderXX / TRACE
TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models
☆79 Updated last year
cmnfriend / O-LoRA
☆186 Updated last year
wutaiqiang / MoSLoRA
☆120 Updated last year
AGI-Edgerunners / LLM-Continual-Learning-Papers
Must-read Papers on Large Language Model (LLM) Continual Learning
☆146 Updated last year
Clin0212 / HydraLoRA
[NeurIPS'24 Oral] HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning
☆227 Updated 10 months ago
hkust-nlp / Activation_Decoding
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024)
☆61 Updated last year
ShadeCloak / ADORA
☆46 Updated 6 months ago
SihengLi99 / LLM-Honesty-Survey
[2025-TMLR] A Survey on the Honesty of Large Language Models
☆61 Updated 10 months ago
hkust-nlp / mstar
[ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning
☆69 Updated 3 months ago
chuanyang-Zheng / DAPE
This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"
☆39 Updated last year
zefang-liu / AdaMoLE
AdaMoLE: Adaptive Mixture of LoRA Experts
☆37 Updated last year
yuezih / less-is-more
Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)
☆54 Updated 11 months ago
ShiZhengyan / DePT
[ICLR 2024] This is the repository for the paper titled "DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning"
☆97 Updated last year
OpenMOSS / Say-I-Dont-Know
[ICML'2024] Can AI Assistants Know What They Don't Know?
☆83 Updated last year