Intel Gaudi's Megatron DeepSpeed for training Large Language Models
☆18Dec 19, 2024Updated last year
Alternatives and similar repositories for Megatron-DeepSpeed
Users who are interested in Megatron-DeepSpeed are comparing it to the libraries listed below.
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.☆14Jan 8, 2026Updated last month
- [PACT'24] GraNNDis. A fast and unified distributed graph neural network (GNN) training framework for both full-batch (full-graph) and min…☆10Aug 13, 2024Updated last year
- ☆11May 8, 2025Updated 9 months ago
- GATSBI: Generative Adversarial Training for Simulation-Based Inference☆19Jul 13, 2023Updated 2 years ago
- This is a repository with examples to run inference endpoints on various ALCF clusters☆27Feb 3, 2026Updated last month
- ☆21Jun 6, 2024Updated last year
- Codebase for ICML'24 paper: Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs☆27Jun 25, 2024Updated last year
- Revision of official yolov7-pose to support custom dataset for keypoint detection☆11Nov 12, 2023Updated 2 years ago
- [ICML2025] LoRA fine-tune directly on the quantized models.☆39Nov 25, 2024Updated last year
- ☆28Nov 29, 2024Updated last year
- HealthiVert-GAN, a novel deep-learning framework designed to generate pseudo-healthy vertebral images. These images simulate the pre-frac…☆11Nov 3, 2025Updated 4 months ago
- A quarto extension for writing teaching practicals☆12Sep 21, 2025Updated 5 months ago
- [Developmental] Quarto Extension to Enable Google Colaboratory Links with Quarto Documents☆15May 18, 2025Updated 9 months ago
- MATLAB code for Stein Point Markov Chain Monte Carlo.☆13Jul 3, 2019Updated 6 years ago
- Code for the experiments in the ACL 2020 paper "Estimating predictive uncertainty for rumour verification models"☆11May 15, 2020Updated 5 years ago
- Canonical normalizing flows☆10Apr 30, 2019Updated 6 years ago
- This repository is a reimplementation of the paper (BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model: htt…☆11Nov 14, 2019Updated 6 years ago
- Modern normalizing flows in Python. Simple to use and easily extensible.☆12Feb 11, 2026Updated 2 weeks ago
- Multimodal SuperCon: Classifier for Drivers of Deforestation in Indonesia☆10Nov 18, 2023Updated 2 years ago
- Two-stream remote sensing model for water quality mapping: 2SeaColor☆10Feb 2, 2021Updated 5 years ago
- Argonne Leadership Computing Facility OpenCL tutorial☆10Aug 22, 2025Updated 6 months ago
- ☆44Dec 20, 2023Updated 2 years ago
- Code for the paper Normalizing Flows are Capable Models for RL☆18Jun 3, 2025Updated 9 months ago
- ☆10Jun 26, 2023Updated 2 years ago
- ☆11May 20, 2022Updated 3 years ago
- ☆11May 12, 2023Updated 2 years ago
- A Python package designed for research in diffusion-based generative modeling☆31Nov 26, 2025Updated 3 months ago
- ☆10Jun 1, 2022Updated 3 years ago
- Continuously tempered Hamiltonian Monte Carlo☆12Apr 12, 2017Updated 8 years ago
- MIPS I simulator☆20Dec 28, 2018Updated 7 years ago
- Code Llama GGUF Demo☆10Aug 28, 2023Updated 2 years ago
- Matplotlib style sheets that make it easy to stylize plots.☆13Jun 10, 2024Updated last year
- Weekly data science workshops for SIG AIDA at UIUC.☆12Jun 22, 2022Updated 3 years ago
- Quarto Extension filter that adds a full screen button in the code blocks in RevealJs slides and html documents.☆14Apr 18, 2023Updated 2 years ago
- 🚀 Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models.☆13Jan 30, 2026Updated last month
- Exact diagonalization for the Bose Hubbard model in Julia☆11Dec 3, 2019Updated 6 years ago
- VaniDL is a tool for analyzing I/O patterns and behavior in Deep Learning Applications.☆10Jul 8, 2022Updated 3 years ago
- ☆11Feb 25, 2025Updated last year
- Full End-to-End examples showing how to use First-gen Gaudi and Gaudi2 in common use cases☆13Dec 2, 2024Updated last year