☆20Oct 31, 2022Updated 3 years ago
Alternatives and similar repositories for EvoMoE
Users that are interested in EvoMoE are comparing it to the libraries listed below.
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)":☆45Feb 28, 2026Updated last month
- ☆19Sep 15, 2022Updated 3 years ago
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…☆56Feb 28, 2023Updated 3 years ago
- This package implements THOR: Transformer with Stochastic Experts.☆64Oct 7, 2021Updated 4 years ago
- Code and Results for the paper: A Revisiting Study of Appropriate Offline Evaluation for Top-𝑁 Recommendation Algorithms.☆11Mar 10, 2022Updated 4 years ago
- ☆11Dec 21, 2022Updated 3 years ago
- Code for 'Diff-MSR: A Diffusion Model Enhanced Paradigm for Cold-Start Multi-Scenario Recommendation' accepted to WSDM 2024☆13Aug 1, 2025Updated 8 months ago
- [ACL 2023 Findings] Emergent Modularity in Pre-trained Transformers☆26Jun 7, 2023Updated 2 years ago
- ☆15Jun 15, 2025Updated 9 months ago
- Code for the paper "Sparse Structure Search for Delta Tuning"☆11Oct 16, 2022Updated 3 years ago
- Implementation of AlphaZero in PyTorch.☆10Apr 19, 2019Updated 6 years ago
- ☆17Dec 9, 2022Updated 3 years ago
- Compression for Foundation Models☆35Jul 21, 2025Updated 8 months ago
- OCR Engine☆17Dec 31, 2021Updated 4 years ago
- [Findings of EMNLP 2024] AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models☆20Oct 2, 2024Updated last year
- ☆27Feb 26, 2023Updated 3 years ago
- Source code of ACL 2023 Main Conference Paper "PAD-Net: An Efficient Framework for Dynamic Networks".☆12Feb 28, 2026Updated last month
- Mixture of Attention Heads☆52Oct 10, 2022Updated 3 years ago
- Official PyTorch Implementation of EMoE: Unlocking Emergent Modularity in Large Language Models [main conference @ NAACL2024]☆39May 28, 2024Updated last year
- This PyTorch package implements MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation (NAACL 2022).☆114May 2, 2022Updated 3 years ago
- A comprehensive overview of Data Distillation and Condensation (DDC). DDC is a data-centric task where a representative (i.e., small but …☆13Dec 1, 2022Updated 3 years ago
- ☆21Dec 8, 2022Updated 3 years ago
- ☆18May 26, 2020Updated 5 years ago
- ☆12Sep 29, 2019Updated 6 years ago
- ☆10Apr 2, 2024Updated 2 years ago
- A life countdown tool written a long time ago; it could not run inside the blog, so it was pulled out into its own repository☆11Jun 9, 2022Updated 3 years ago
- PyTorch implementation for "Variational Autoencoder with Implicit Optimal Priors".☆11Oct 12, 2020Updated 5 years ago
- RankFormer: Listwise Learning-to-Rank Using Listwide Labels (KDD 2023).☆27Sep 12, 2023Updated 2 years ago
- SFC: Shared Feature Calibration in Weakly Supervised Semantic Segmentation (AAAI24)☆25Jul 2, 2024Updated last year
- Code and Model for NeurIPS 2024 Spotlight Paper "Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training…☆44Oct 16, 2024Updated last year
- Sentiment analysis meets music☆11Nov 23, 2014Updated 11 years ago
- [ACL'25 Main] Graph of Records: Boosting Retrieval Augmented Generation for Long-context Summarization with Graphs☆41May 26, 2025Updated 10 months ago
- Code for "Training Neural Networks with Fixed Sparse Masks" (NeurIPS 2021).☆59Jan 14, 2022Updated 4 years ago
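- JD.com (Jingdong) deal-hunting scripts: automatic check-in, task completion, and so on, with one-click startup via Docker. For usage questions, join QQ group 644989387. [The above is the original author's description]☆10Feb 8, 2022Updated 4 years ago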
- ☆29Apr 22, 2024Updated last year
- ☆11Sep 26, 2022Updated 3 years ago
- ☆12Oct 17, 2022Updated 3 years ago
- AutoPEFT: Automatic Configuration Search for Parameter-Efficient Fine-Tuning (Zhou et al.; TACL 2024)☆51Mar 17, 2024Updated 2 years ago
- [ICCV 2021] Multimodal Knowledge Expansion☆10Aug 28, 2021Updated 4 years ago