Anonymous1252022/Megatron-DeepSpeed

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/Anonymous1252022/Megatron-DeepSpeed)

Anonymous1252022 / Megatron-DeepSpeed

☆18

Alternatives and similar repositories for Megatron-DeepSpeed

Users that are interested in Megatron-DeepSpeed are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

akhilkedia / TranformersGetStable
View on GitHub
[ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models"
☆11Jul 19, 2024Updated 2 years ago
FreedomIntelligence / S2S-Arena
View on GitHub
☆21Jun 4, 2026Updated last month
SmerkyG / GoldFinch-paper
View on GitHub
GoldFinch and other hybrid transformer components
☆16Dec 9, 2025Updated 7 months ago
Anonymous1252022 / fp4-all-the-way
View on GitHub
☆52May 20, 2025Updated last year
amodaresi / MemLLM
View on GitHub
☆13Aug 13, 2024Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
RUCBM / ICLEval
View on GitHub
☆14Jun 24, 2024Updated 2 years ago
Sanster / padding_free_llm_train
View on GitHub
☆16Feb 6, 2024Updated 2 years ago
TianjinYellow / SPAM-Optimizer
View on GitHub
☆36Mar 12, 2025Updated last year
bohanzhuang / Towards-Effective-Low-bitwidth-Convolutional-Neural-Networks
View on GitHub
This repository implements the paper "Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations"
☆20Aug 30, 2021Updated 4 years ago
Hon-Wong / ByteVideoLLM
View on GitHub
[ICCV 2025] Dynamic-VLM
☆28Dec 16, 2024Updated last year
FreedomIntelligence / MTalk-Bench
View on GitHub
MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols
☆20Nov 19, 2025Updated 8 months ago
IST-DASLab / Quartet
View on GitHub
☆127Mar 18, 2026Updated 4 months ago
phuocphn / uniq
View on GitHub
Pytorch implementation of our UniQ method, IEEE Access -- Training Multi-bit Quantized and Binarized Networks with A Learnable Symmetric …
☆11Apr 7, 2021Updated 5 years ago
tim-lawson / skip-middle
View on GitHub
Learning to Skip the Middle Layers of Transformers
☆17Aug 7, 2025Updated 11 months ago
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
IST-DASLab / QuEST
View on GitHub
Work in progress.
☆80Nov 25, 2025Updated 7 months ago
wlfeng0509 / Q-VDiT
View on GitHub
(ICML-2025) Q-VDiT: Towards Accurate Quantization and Distillation of Video-Generation Diffusion Transformers
☆21Aug 13, 2025Updated 11 months ago
2187Nick / ADAS
View on GitHub
Automated Design of Agentic Systems
☆10Sep 7, 2024Updated last year
GAIR-NLP / LIMOPro
View on GitHub
☆15May 27, 2025Updated last year
psunlpgroup / VisOnlyQA
View on GitHub
This repository contains the code and data for the paper "VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception o…
☆29Jul 9, 2025Updated last year
teamforus / Forus
View on GitHub
Platform Forus is an Open SaaS solution that facilitates the management and issuance of social benefits, offering a collaborative platfor…
☆10Jul 9, 2026Updated last week
linkedin / ControlLLM
View on GitHub
Control LLM
☆23Apr 6, 2025Updated last year
hhnqqq / GemmaLongText
View on GitHub
☆15Apr 7, 2024Updated 2 years ago
IST-DASLab / MatGPTQ
View on GitHub
Code for MatGPTQ: Accurate and Efficient Post-Training Matryoshka Quantization
☆22Feb 18, 2026Updated 5 months ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
real-absolute-AI / RAPID
View on GitHub
[ICML 2025 Spotlight] RAPID: Long-Context Inference with Retrieval-Augmented Speculative Decoding
☆23Mar 2, 2025Updated last year
martin-marek / batch-size
View on GitHub
📄Small Batch Size Training for Language Models
☆82Mar 18, 2026Updated 4 months ago
RUCAIBox / MPOP
View on GitHub
☆13Jun 16, 2021Updated 5 years ago
choidami / inductive-oocr
View on GitHub
☆16Mar 22, 2025Updated last year
EastTower16 / LLMDataDistill
View on GitHub
distill large scale web page text
☆12Jul 29, 2023Updated 2 years ago
NyanCatTW1 / RedMetaClassAnalyzer
View on GitHub
Does all kind of cool stuff to make analyzing meta classes easier. Now featuring WRedLogger.py, the previous backend of NetDbg
☆10Jun 7, 2023Updated 3 years ago
psunlpgroup / FoVer
View on GitHub
This repository includes code and materials for the paper "Efficient PRM Training Data Synthesis via Formal Verification" (ACL 2026 Findi…
☆18Apr 7, 2026Updated 3 months ago
VITA-Group / TAPE
View on GitHub
[ICML'25] "Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding" by Jiajun Zhu, Peihao Wang, Ruisi…
☆15Jun 6, 2025Updated last year
IST-DASLab / FP-Quant
View on GitHub
☆114Feb 26, 2026Updated 4 months ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
Dao-AILab / grouped-latent-attention
View on GitHub
☆135May 29, 2025Updated last year
XueruiSu / Trust-Region-Preference-Approximation
View on GitHub
Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning
☆15Jun 28, 2025Updated last year
DualityRL / multi-attempt
View on GitHub
☆19Mar 10, 2025Updated last year
yuzhaouoe / pretraining-data-packing
View on GitHub
[ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training
☆24Aug 18, 2024Updated last year
Snektron / exaregex
View on GitHub
Zig regex experiment
☆13Nov 6, 2025Updated 8 months ago
DSL-Lab / aops
View on GitHub
☆43Feb 7, 2025Updated last year
thomasahle / kanmlps
View on GitHub
KANs and MLPs
☆12Jun 7, 2024Updated 2 years ago