A MoE impl for PyTorch, [ATC'23] SmartMoE
☆72Jul 11, 2023Updated 2 years ago
Alternatives and similar repositories for SmartMoE
Users who are interested in SmartMoE are comparing it to the libraries listed below.
- ATC23 AE☆45May 11, 2023Updated 2 years ago
- ☆22Mar 2, 2025Updated last year
- A fast MoE impl for PyTorch☆1,847Feb 10, 2025Updated last year
- Official code repository for PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation, EMNLP 2023☆12Dec 13, 2023Updated 2 years ago
- Distributed IO-aware Attention algorithm☆24Sep 24, 2025Updated 6 months ago
- ☆20Aug 14, 2025Updated 8 months ago
- ☆89Apr 2, 2022Updated 4 years ago
- Keyformer proposes KV Cache reduction through key tokens identification and without the need for fine-tuning☆57Mar 26, 2024Updated 2 years ago
- PyTorch implementation of Language model compression with weighted low-rank factorization☆13Jun 28, 2023Updated 2 years ago
- optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052☆479Mar 15, 2024Updated 2 years ago
- ☆275Oct 31, 2023Updated 2 years ago
- ☆59Feb 11, 2026Updated 2 months ago
- ☆11Mar 23, 2022Updated 4 years ago
- ☆17Oct 12, 2023Updated 2 years ago
- ☆13Jan 22, 2025Updated last year
- PerFlow-AI is a programmable performance analysis, modeling, and prediction tool for AI systems.☆32Apr 1, 2026Updated 2 weeks ago
- code for EMNLP 2024 paper: Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis☆12Nov 17, 2024Updated last year
- [OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable☆213Sep 21, 2024Updated last year
- ⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)☆1,000Dec 6, 2024Updated last year
- [ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference☆380Jul 10, 2025Updated 9 months ago
- This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models"☆51Oct 31, 2024Updated last year
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"☆17Feb 22, 2024Updated 2 years ago
- ☆39Aug 27, 2024Updated last year
- A family of open-sourced Mixture-of-Experts (MoE) Large Language Models☆1,675Mar 8, 2024Updated 2 years ago
- [ICLR 2022] "PipeGCN: Efficient Full-Graph Training of Graph Convolutional Networks with Pipelined Feature Communication" by Cheng Wan, Y…☆34Mar 15, 2023Updated 3 years ago
- ☆10Apr 29, 2023Updated 2 years ago
- GPU-accelerated vector query processing system that supports large vector datasets beyond GPU memory.☆41Mar 24, 2024Updated 2 years ago
- ☆15Sep 24, 2023Updated 2 years ago
- Large Multimodal Model☆15Apr 8, 2024Updated 2 years ago
- A parallelism VAE avoids OOM for high resolution image generation☆89Mar 12, 2026Updated last month
- The first 100B protein language model from biomap☆23Mar 17, 2025Updated last year
- InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)☆182Jul 10, 2024Updated last year
- Open-source Large Language Models are Strong Zero-shot Query Likelihood Models for Document Ranking☆17Oct 26, 2023Updated 2 years ago
- MG-WFBP: Merging Gradients Wisely for Efficient Communication in Distributed Deep Learning☆12Apr 26, 2021Updated 4 years ago
- Traffic Prediction in PaddlePaddle (ASC17 Deep Learning Application)☆17Apr 27, 2017Updated 8 years ago
- [NAACL 2025] Representing Rule-based Chatbots with Transformers☆23Feb 9, 2025Updated last year
- ☆310Jul 10, 2025Updated 9 months ago
- [ACL 2026] Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments☆49Apr 6, 2026Updated last week
- A library for mechanistic anomaly detection☆22Jan 9, 2025Updated last year