☆20Oct 31, 2022Updated 3 years ago
Alternatives and similar repositories for EvoMoE
Users that are interested in EvoMoE are comparing it to the libraries listed below.
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)":☆46Feb 28, 2026Updated 2 months ago
- ☆19Sep 15, 2022Updated 3 years ago
- [IJCAI2023] An automated parallel training system that combines the advantages from both data and model parallelism. If you have any inte…☆52May 31, 2023Updated 2 years ago
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…☆56Feb 28, 2023Updated 3 years ago
- This package implements THOR: Transformer with Stochastic Experts.☆64Oct 7, 2021Updated 4 years ago
- This repository contains the PyTorch implementation of the WWW 2023 research paper: Optimizing Feature Set for Click-through Rate Prediction.☆12Oct 23, 2023Updated 2 years ago
- Benchmarking and Analyzing Generative Data for Visual Recognition☆26Jul 25, 2023Updated 2 years ago
- A high-performance distributed deep learning system targeting large-scale and automated distributed training. If you have any interests, …☆124Dec 18, 2023Updated 2 years ago
- [ACL 2023 Findings] Emergent Modularity in Pre-trained Transformers☆26Jun 7, 2023Updated 2 years ago
- GNURadio out-of-tree (OOT) module for optical wireless communications.☆13May 4, 2026Updated 2 weeks ago
- code for paper Sparse Structure Search for Delta Tuning☆11Oct 16, 2022Updated 3 years ago
- Implementation of AlphaZero in PyTorch.☆10Apr 19, 2019Updated 7 years ago
- [MIDL 2023] Official Implementation of "Making Your First Choice: To Address Cold Start Problem in Vision Active Learning"☆36Aug 3, 2023Updated 2 years ago
- Compression for Foundation Models☆35Jul 21, 2025Updated 9 months ago
- This project implements a CNN in PyTorch for people detection, plus a Vue.js front end for a live demo☆10Jan 3, 2023Updated 3 years ago
- A high-performance distributed deep learning system targeting large-scale and automated distributed training.☆335Dec 13, 2025Updated 5 months ago
- Python library for scientific computing☆18Oct 30, 2019Updated 6 years ago
- [Findings of EMNLP 2024] AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models☆20Oct 2, 2024Updated last year
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models☆45Jun 14, 2024Updated last year
- ☆28Feb 26, 2023Updated 3 years ago
- An OpenAI API compatible images server to generate or manipulate images.☆18Feb 2, 2025Updated last year
- Accelerating Distributed Machine Learning with Data Sketches☆17Nov 12, 2018Updated 7 years ago
- control theory repo☆17Apr 20, 2021Updated 5 years ago
- ☆14Jul 19, 2018Updated 7 years ago
- This PyTorch package implements MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation (NAACL 2022).☆114May 2, 2022Updated 4 years ago
- Examples for MS-AMP package.☆30Jul 17, 2025Updated 10 months ago
- Mixture of Attention Heads☆52Oct 10, 2022Updated 3 years ago
- A comprehensive overview of Data Distillation and Condensation (DDC). DDC is a data-centric task where a representative (i.e., small but …☆13Dec 1, 2022Updated 3 years ago
- Galvatron is an automatic distributed training system designed for Transformer models, including Large Language Models (LLMs). If you hav…☆24Oct 22, 2025Updated 6 months ago
- ☆11Nov 14, 2021Updated 4 years ago
- Open-source software platform for cognitive radio waveforms☆24Aug 2, 2016Updated 9 years ago
- ☆10Apr 2, 2024Updated 2 years ago
- Accommodating Large Language Model Training over Heterogeneous Environment.☆29Mar 13, 2025Updated last year
- Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution☆27Mar 18, 2021Updated 5 years ago
- [TCSS 2026] TF4CTR: Twin Focus Framework for CTR Prediction via Adaptive Sample Differentiation☆17Mar 20, 2026Updated 2 months ago
- [ACL'25 Main] Graph of Records: Boosting Retrieval Augmented Generation for Long-context Summarization with Graphs☆42May 26, 2025Updated 11 months ago
- Sentiment analysis meets music☆11Nov 23, 2014Updated 11 years ago
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)☆83Oct 5, 2023Updated 2 years ago
- ☆28Apr 22, 2024Updated 2 years ago