giangdip2410/HyperRouter

Code for the paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork"

☆33

Alternatives and similar repositories for HyperRouter

Users that are interested in HyperRouter are comparing it to the libraries listed below.

google-deepmind / asyncdiloco
☆51 · Jan 18, 2024 · Updated 2 years ago
Lucky-Lance / Expert_Sparsity
[ACL 2024] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models
☆123 · May 24, 2024 · Updated 2 years ago
ysh-1998 / CoWPiRec
The official implementation for Collaborative Word-based Pre-trained Item Representation for Transferable Recommendation.
☆25 · Jan 30, 2024 · Updated 2 years ago
duterscmy / CD-MoE
Official PyTorch implementation of CD-MOE
☆12 · Mar 18, 2026 · Updated 4 months ago
YuanchenBei / MacGNN
The source code of MacGNN, The Web Conference 2024.
☆57 · May 28, 2024 · Updated 2 years ago
lunyiliu / CoachLM
Code and data for CoachLM, an automatic instruction revision approach for LLM instruction tuning.
☆60 · Mar 20, 2024 · Updated 2 years ago
sade-adrien / SteloCoder
☆16 · Dec 21, 2023 · Updated 2 years ago
Hannibal046 / nanoColBERT
Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832).
☆83 · Mar 18, 2024 · Updated 2 years ago
AAAI-DISIM-UnivAQ / DALI
DALI Multi Agent System Framework
☆43 · Mar 24, 2026 · Updated 3 months ago
htqin / GoogleBard-VisUnderstand
How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges
☆30 · Sep 24, 2023 · Updated 2 years ago
Raincleared-Song / ConPET
Source code for a LoRA-based continual relation extraction method.
☆14 · Sep 25, 2023 · Updated 2 years ago
epfml / pam
☆16 · Dec 9, 2023 · Updated 2 years ago
ahmetustun / hyperx
☆21 · Dec 5, 2022 · Updated 3 years ago
TencentARC / ViSFT
☆38 · Jan 20, 2024 · Updated 2 years ago
thomasj02 / AiFilter
Local LLM-based social network filter
☆73 · Jan 31, 2024 · Updated 2 years ago
git-disl / Virus
This is the official code for the paper "Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation"
☆56 · Feb 2, 2025 · Updated last year
XiaoduoAILab / XmodelLM
XmodelLM
☆38 · Nov 19, 2024 · Updated last year
phquang / Continual-Normalization
☆14 · Sep 7, 2022 · Updated 3 years ago
BunsenFeng / FactKB
Code for "FactKB: Generalizable Factuality Evaluation using Language Models Enhanced with Factual Knowledge". EMNLP 2023.
☆20 · Dec 25, 2023 · Updated 2 years ago
Zcchill / Value-Residual-Learning
☆15 · Mar 20, 2025 · Updated last year
rosewang2008 / backtracing
Backtracing: Retrieving the Cause of the Query, EACL 2024 Long Paper, Findings.
☆91 · Jul 21, 2024 · Updated 2 years ago
mt-upc / ZeroSwot
Pushing the Limits of Zero-shot End-to-End Speech Translation
☆25 · Dec 12, 2024 · Updated last year
architsharma97 / dpo-rlaif
☆100 · Jun 27, 2024 · Updated 2 years ago
BinhMisfit / vietnamese-punctuation-prediction
This repository is used to publish our codes for the conference paper "Vietnamese punctuation prediction using deep neural networks" at S…
☆11 · Jul 11, 2020 · Updated 6 years ago
sunsmarterjie / ChatterBox
[AAAI2025] ChatterBox: Multi-round Multimodal Referring and Grounding, Multimodal, Multi-round dialogues
☆61 · May 2, 2025 · Updated last year
zhangzjn / EMOv2
[T-PAMI 2025] EMOv2: Pushing 5M Vision Model Frontier
☆54 · Dec 30, 2024 · Updated last year
ShuyangUni / drl_exposure_ctrl
☆48 · Jul 9, 2026 · Updated 2 weeks ago
Senwang98 / MonoSKD
[ECAI 2023] MonoSKD: General Distillation Framework for Monocular 3D Object Detection via Spearman Correlation Coefficient
☆32 · Dec 8, 2023 · Updated 2 years ago
Evocargo / Lidar-Annotation-is-All-You-Need
2D road segmentation using lidar data during training
☆43 · Dec 21, 2023 · Updated 2 years ago
AIFEG / BenchLMM
[ECCV 2024] BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models
☆86 · Aug 19, 2024 · Updated last year
jxiw / BiGS
Official Repository of Pretraining Without Attention (BiGS), BiGS is the first model to achieve BERT-level transfer learning on the GLUE …
☆119 · Mar 16, 2024 · Updated 2 years ago
OFA-Sys / DiverseEvol
Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning
☆88 · Dec 14, 2023 · Updated 2 years ago
hetailang / SqueezeAttention
☆37 · Oct 10, 2024 · Updated last year
EvanZhuang / MetaTree
Official implementation of MetaTree: Learning a Decision Tree Algorithm with Transformers
☆115 · Sep 13, 2024 · Updated last year
MediaBrain-SJTU / ECGAD
[MICCAI2023 Early Accept] Multi-scale Cross-restoration Framework for Electrocardiogram Anomaly Detection
☆69 · Nov 15, 2024 · Updated last year
zorazrw / filco
[Preprint] Learning to Filter Context for Retrieval-Augmented Generation
☆198 · Apr 6, 2024 · Updated 2 years ago
hpcaitech / GPT-Demo
GPT Demo with hybrid distributed training
☆10 · Dec 1, 2022 · Updated 3 years ago
RobertCsordas / moe_layer
sigma-MoE layer
☆21 · Jan 5, 2024 · Updated 2 years ago
Nikunj-Gupta / Efficient_ResNets
A Residual Network Design with less than 5 million trainable parameters achieving an accuracy of 96.04% on CIFAR-10.
☆27 · Jul 23, 2024 · Updated 2 years ago