AdamG012/moe-paper-models

A summary of MoE experimental setups across a number of different papers.

☆16

Alternatives and similar repositories for moe-paper-models

Users that are interested in moe-paper-models are comparing it to the libraries listed below.

lsj2408 / URPE
[NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation)
☆35 · Aug 6, 2023 · Updated 2 years ago
apple / ml-vfi-smiff
☆14 · Nov 5, 2025 · Updated 8 months ago
ghadiaravi13 / Untied-Ulysses
☆24 · May 23, 2026 · Updated 2 months ago
bigcode-project / opt-out-v2
Repository for opt-out requests.
☆10 · Mar 25, 2024 · Updated 2 years ago
bicici / FDA
Feature Decay Algorithms
☆11 · Mar 5, 2014 · Updated 12 years ago
kelechi-c / dit_flow
DiT (training + flow matching) in Jax
☆12 · Jan 5, 2025 · Updated last year
bigcode-project / pii-lib
Code for PII detection and redaction in code datasets
☆15 · Jan 24, 2023 · Updated 3 years ago
yusugomori / dl-book-generative
Detailed Explanation of Deep Learning (Generative Models Edition)
☆11 · May 15, 2019 · Updated 7 years ago
AniZpZ / smoothquant
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
☆11 · Dec 13, 2023 · Updated 2 years ago
vaaaaanquish / docker-UTH-BERT
docker for UTH-BERT: https://ai-health.m.u-tokyo.ac.jp/uth-bert
☆14 · Mar 24, 2023 · Updated 3 years ago
RobertBiehl / multimodal-instruct
Instruction tuning dataset generation inspired by LLaVA-Instruct-158k via any LLM, also for commercial use.
☆13 · Mar 13, 2024 · Updated 2 years ago
Machine-Learning-Tokyo / Agritech
☆12 · Oct 22, 2019 · Updated 6 years ago
thevasudevgupta / transformers-adapters
This repository hosts my experiments for the project I did with OffNote Labs.
☆10 · Apr 12, 2021 · Updated 5 years ago
1a3orn / very-simple-moe
Extremely simple MoE implementation, mostly based off Switch Transformer
☆13 · Feb 26, 2024 · Updated 2 years ago
p1ass / kuee-thesis-markdown
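A complete kit for writing a graduation thesis in Markdown for the Department of Electrical and Electronic Engineering, Faculty of Engineering, Kyoto University
☆12 · Jan 13, 2021 · Updated 5 years ago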
zbh2047 / clipping-algorithms
Code for the NeurIPS 2020 paper "Improved analysis of clipping algorithms for non-convex optimization", including various clipping algori…
☆10 · Feb 17, 2021 · Updated 5 years ago
nodewee / Typing-faster-on-macOS
Typing more efficiently on macOS: input-method text replacement lists and Alfred Snippets
☆14 · Jul 18, 2024 · Updated 2 years ago
mitkotak / fast_flops
FLOPS counter for all your GPU benchmarking needs
☆13 · Aug 8, 2024 · Updated last year
jquesnelle / sat-reading
☆18 · Feb 20, 2023 · Updated 3 years ago
instance-wise-ordered-transformer / IOT
☆20 · Feb 26, 2021 · Updated 5 years ago
yusanish / docker-jumanpp-knp
Makes JUMAN++ and KNP usable with Docker.
☆17 · Jan 19, 2019 · Updated 7 years ago
joemasilotti / TailwindCSS-SwiftUI
TailwindCSS colors for SwiftUI.
☆27 · Aug 18, 2020 · Updated 5 years ago
microsoft / factored-segmenter
Unsupervised factor-based text tokenizer for natural-language processing applications
☆17 · Jul 24, 2020 · Updated 6 years ago
josephrocca / ChatVRM-js
A JS conversion/adaptation of parts of the ChatVRM (TypeScript) code for standalone use in OpenCharacters and elsewhere
☆21 · Jul 30, 2023 · Updated 2 years ago
mt-upc / logit-explanations
☆18 · Jun 19, 2023 · Updated 3 years ago
deep-spin / lmt_hallucinations
☆19 · Jun 13, 2023 · Updated 3 years ago
yanshanjing / learning-from-imbalanced-classes
Learning From Imbalanced Classes
☆14 · Aug 25, 2016 · Updated 9 years ago
yang-zhang / labse-pytorch
Language-agnostic BERT Sentence Embedding (LaBSE) Pytorch Model
☆21 · Sep 2, 2020 · Updated 5 years ago
enakai00 / colab_jaxbook
Colab Notebooks for JAX/Flax/Optax ML Book
☆12 · Feb 25, 2023 · Updated 3 years ago
InfrHQ / Replay
An Infr app that helps you replay & talk to everything you've ever seen.
☆15 · Sep 19, 2023 · Updated 2 years ago
mingruimingrui / fast-mosestokenizer
c++ mosestokenizer
☆18 · Mar 13, 2024 · Updated 2 years ago
yusuke84 / webrtc-handson-2016
☆16 · Feb 19, 2019 · Updated 7 years ago
Rimkomatic / Re-Nvim
Re-Nvim is a fast, modular, and highly customizable Neovim configuration designed to optimize your development environment.
☆12 · Mar 27, 2025 · Updated last year
Oztobuzz / Vista
This is the official repository for Vista dataset - A Vietnamese multimodal dataset contains more than 700,000 samples of conversations a…
☆26 · May 14, 2024 · Updated 2 years ago
janhq / model-converter
☆23 · Dec 14, 2023 · Updated 2 years ago
Physical-Intelligence / pi-data-sharing
☆21 · Jun 22, 2026 · Updated last month
ctlllll / gpt-oss-reverse-engineering
☆72 · Aug 6, 2025 · Updated 11 months ago
ngovinhtn / JaViCorpus
☆16 · Aug 23, 2022 · Updated 3 years ago
AudranDoublet / rtx_opr
Photorealistic Minecraft-like game using NVIDIA RTX in Rust
☆15 · May 1, 2021 · Updated 5 years ago