imoneoi/bf16_fused_adam

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/imoneoi/bf16_fused_adam)

imoneoi / bf16_fused_adam

BFloat16 Fused Adam Operator for PyTorch

☆20

Alternatives and similar repositories for bf16_fused_adam

Users that are interested in bf16_fused_adam are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

imoneoi / RSP_JAX
View on GitHub
[AAAI'25 Oral] Are Expressive Models Truly Necessary for Offline RL?
☆15Dec 10, 2024Updated last year
imoneoi / mistral-tokenizer
View on GitHub
☆20Apr 1, 2024Updated 2 years ago
imoneoi / EvolvingConnectivity
View on GitHub
Code for paper Evolving Connectivity for Spiking Neural Networks
☆22Oct 23, 2023Updated 2 years ago
lernapparat / torchhacks
View on GitHub
Hacks for PyTorch
☆19Apr 18, 2023Updated 3 years ago
HuyNguyen-hust / hopper-gemm-101
View on GitHub
☆13Dec 22, 2024Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
yaof20 / DenseMixer
View on GitHub
Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient
☆68Aug 3, 2025Updated 11 months ago
fattorib / ZeRO-transformer
View on GitHub
Two implementations of ZeRO-1 optimizer sharding in JAX
☆14Jun 11, 2023Updated 3 years ago
huyphan168 / PEER
View on GitHub
Mixture of A Million Experts
☆56Jul 30, 2024Updated last year
kuterd / opal_ptx
View on GitHub
Experimental GPU language with meta-programming
☆31Sep 6, 2024Updated last year
aadityasingh / icl-dynamics
View on GitHub
☆26Feb 20, 2026Updated 5 months ago
AlxSp / t-jepa
View on GitHub
☆12Apr 26, 2024Updated 2 years ago
Farseer-Scaling-Law / Farseer
View on GitHub
☆21Jun 12, 2025Updated last year
HoTT / HoTT-2019
View on GitHub
Conference on Homotopy Type Theory 2019
☆16Sep 18, 2019Updated 6 years ago
jfischoff / svd-inpainting
View on GitHub
An attempt at a SVD inpainting pipeline
☆50Dec 24, 2023Updated 2 years ago
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
opooladz / Preconditioned-Stochastic-Gradient-Descent
View on GitHub
A repo based on XiLin Li's PSGD repo that extends some of the experiments.
☆14Oct 7, 2024Updated last year
HuyNguyen-hust / flash-attn-101
View on GitHub
☆22Sep 3, 2024Updated last year
tspeterkim / mixed-precision-from-scratch
View on GitHub
Mixed precision training from scratch with Tensors and CUDA
☆30May 14, 2024Updated 2 years ago
dame-cell / Triformer
View on GitHub
Transformers components but in Triton
☆34May 9, 2025Updated last year
anujanegi / VQA
View on GitHub
Visual Question Answering System
☆11Nov 13, 2019Updated 6 years ago
imoneoi / multipack
View on GitHub
Multipack distributed sampler for fast padding-free training of LLMs
☆207Aug 10, 2024Updated last year
LaurentMazare / tboard-rs
View on GitHub
Read and write tensorboard data using Rust
☆23Feb 4, 2024Updated 2 years ago
meta-pytorch / export-python
View on GitHub
Conveniently export torch.compile compiled products into self-contained Python files
☆34Jun 5, 2026Updated last month
shreyansh26 / Attention-Mask-Patterns
View on GitHub
Using FlexAttention to compute attention with different masking patterns
☆47Sep 22, 2024Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
MekkCyber / TritonAcademy
View on GitHub
A repository to unravel the language of GPUs, making their kernel conversations easy to understand
☆208Jun 1, 2025Updated last year
ethansmith2000 / MathRock-Diffusion
View on GitHub
☆16Jan 3, 2023Updated 3 years ago
lilakk / BLEUBERI
View on GitHub
Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following"
☆32Jun 5, 2025Updated last year
kthielen / stlcc
View on GitHub
Simply-typed lambda calculus (plus several features) -> x86 executables for Linux and Windows
☆25Jan 25, 2013Updated 13 years ago
kyle8581 / LanguageModelsasCompilers
View on GitHub
Official implementation of Language Models as Compilers: Simulating the Execution Of Pseudocode Improves Algorithmic Reasoning in Languag…
☆23Apr 8, 2024Updated 2 years ago
dnaihao / slurm-cheatsheet
View on GitHub
Cheatsheet for slurm command lines
☆12Mar 6, 2026Updated 4 months ago
TIGER-AI-Lab / VisualWebInstruct
View on GitHub
The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search" [EMNLP25]
☆39Feb 1, 2026Updated 5 months ago
chadbrewbaker / ChadGPT
View on GitHub
☆10Dec 19, 2024Updated last year
LambdaLabsML / llama
View on GitHub
Inference code for LLaMA models
☆42Mar 13, 2023Updated 3 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
brightjade / SimCKP
View on GitHub
Source code for "SimCKP: Simple Contrastive Learning of Keyphrase Representations", Findings of EMNLP 2023
☆12Jun 20, 2025Updated last year
wizzard0 / neural-graphics-from-scratch
View on GitHub
Instant Neural Graphics Primitives from scratch, zero dependencies. Learning by doing.
☆10Aug 18, 2023Updated 2 years ago
vaguenebula / AlpacaDataReflect
View on GitHub
An experiment to see if chatgpt can improve the output of the stanford alpaca dataset
☆12Mar 29, 2023Updated 3 years ago
official-elinas / zeus-llm-trainer
View on GitHub
Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models
☆69Aug 27, 2023Updated 2 years ago
gau-nernst / kokoro
View on GitHub
https://hf.co/hexgrad/Kokoro-82M
☆14Jan 14, 2026Updated 6 months ago
mgmalek / efficient_cross_entropy
View on GitHub
☆124May 28, 2024Updated 2 years ago
JINO-ROHIT / nano-paged-attention
View on GitHub
a minimal paged attention implementation
☆20Jan 30, 2026Updated 5 months ago