RobertCsordas/switchhead

☆16

Alternatives and similar repositories for switchhead

Users who are interested in switchhead are comparing it to the repositories listed below.

RobertCsordas / moe_attention
Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"
☆101 · Sep 30, 2024 · Updated last year
AkideLiu / MiniCache
☆14 · Sep 7, 2024 · Updated last year
RobertCsordas / moe
Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers"
☆39 · Jun 11, 2025 · Updated last year
ysngki / UMoE
☆23 · Oct 22, 2025 · Updated 9 months ago
abdelfattah-lab / shadow_llm
☆11 · Sep 20, 2024 · Updated last year
temp3rr0r / CellularAutomataEpidemicModels
Stochastic Cellular Automata epidemic models in Python with 2D simulations
☆15 · Feb 24, 2020 · Updated 6 years ago
Altaheri / MI-EEG-Datasets
Public EEG-based motor imagery (MI) datasets
☆13 · Feb 1, 2024 · Updated 2 years ago
ysngki / XMoE
☆15 · Oct 19, 2024 · Updated last year
changwoolee / BLAST
[NeurIPS 2024] BLAST: Block Level Adaptive Structured Matrix for Efficient Deep Neural Network Inference
☆18 · Nov 6, 2024 · Updated last year
mathilde-b / SRDA_Miccai
Source Free Domain Adaptation
☆10 · Aug 27, 2021 · Updated 4 years ago
SCUT-IEL / STAnet
This repository contains the python scripts developed as a part of the work presented in the paper "STAnet: A Spatiotemporal Attention Ne…
☆15 · May 10, 2023 · Updated 3 years ago
VITA-Group / Random-MoE-as-Dropout
[ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…
☆56 · Feb 28, 2023 · Updated 3 years ago
Infini-AI-Lab / Sirius
Sirius, an efficient correction mechanism, which significantly boosts Contextual Sparsity models on reasoning tasks while maintaining its…
☆21 · Sep 10, 2024 · Updated last year
traveler-framework / TraveLER
[EMNLP 2024] TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering
☆18 · Oct 31, 2024 · Updated last year
MikaStars39 / StableMask
PyTorch implementation of StableMask (ICML'24)
☆15 · Jun 27, 2024 · Updated 2 years ago
GATECH-EIC / Castling-ViT
[CVPR 2023] Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference
☆31 · Mar 14, 2024 · Updated 2 years ago
MedMaxLab / eegprepro
Evaluating the role of EEG preprocessing for deep learning applications.
☆16 · Mar 10, 2025 · Updated last year
ljbuaa / VisualDecoding
☆17 · May 18, 2023 · Updated 3 years ago
GoJunHyeong / SpatialBias
☆10 · Dec 13, 2022 · Updated 3 years ago
yikangshen / MoA
Mixture of Attention Heads
☆53 · Oct 10, 2022 · Updated 3 years ago
gmentz / seegnificant
Codebase for publication "Neural decoding from stereotactic EEG: accounting for electrode variability across subjects" @ NeurIPS (2024)
☆19 · Jun 11, 2025 · Updated last year
zihuanqiu / MINGLE
The code repository for "MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging"(NeurIPS25) in PyTorc…
☆15 · Jun 2, 2026 · Updated last month
roymiles / VeLoRA
[NeurIPS 2024] VeLoRA : Memory Efficient Training using Rank-1 Sub-Token Projections
☆22 · Oct 15, 2024 · Updated last year
konglk1203 / VariationalStiefelOptimizer
Implementation for the paper 'Momentum Stiefel Optimizer, with Applications to Suitably-Orthogonal Attention, and Optimal Transport' (ICL…
☆20 · Jan 1, 2025 · Updated last year
RobertCsordas / moeut
☆93 · Aug 18, 2024 · Updated last year
DavidFanzz / llm_decoding
☆12 · Apr 25, 2025 · Updated last year
ranggihwang / Pregated_MoE
☆62 · May 4, 2024 · Updated 2 years ago
tom-doerr / awesome-dspy
☆28 · May 15, 2024 · Updated 2 years ago
BenyaminHaghi / FENet
☆27 · Mar 26, 2025 · Updated last year
glassroom / heinsen_attention
Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024)
☆25 · Jun 6, 2024 · Updated 2 years ago
OpenSparseLLMs / CLIP-MoE
CLIP-MoE: Mixture of Experts for CLIP
☆58 · Oct 10, 2024 · Updated last year
shawntan / scattermoe
Triton-based implementation of Sparse Mixture of Experts.
☆281 · Oct 3, 2025 · Updated 9 months ago
xuxiran / ASAD_DenseNet
the implementation of the ASAD_DenseNet
☆31 · Mar 24, 2025 · Updated last year
saic-fi / SSLight
[ICLR'23] Effective Self-supervised Pre-training on Low-compute networks without Distillation
☆18 · Oct 9, 2024 · Updated last year
ECoLab-POSTECH / NIPQ
☆18 · Jul 1, 2023 · Updated 3 years ago
phucty / wtabhtml
Tool to parse wiki tables from the HTML dump of Wikipedia
☆11 · Jun 12, 2022 · Updated 4 years ago
xuesong39 / DAC
[CVPR 2024] Official implementation of CVPR 2024 paper: "Doubly Abductive Counterfactual Inference for Text-based Image Editing"
☆26 · Mar 8, 2024 · Updated 2 years ago
WongiPark0628 / RAL
[ICCVW'23] Robust Asymmetric Loss for Multi-Label Long-Tailed Learning
☆19 · Oct 3, 2023 · Updated 2 years ago
nyonicai / nyonic-public
Reference implementation of models from Nyonic Model Factory
☆12 · May 13, 2024 · Updated 2 years ago