main-horse / hnet

H-Net Dynamic Hierarchical Architecture

☆79

Alternatives and similar repositories for hnet

Users who are interested in hnet are comparing it to the repositories listed below.

VatsaDev / NanoPoor
NanoGPT-speedrunning for the poor T4 enjoyers
☆71 Updated 4 months ago
ethansmith2000 / fsdp_optimizers
supporting pytorch FSDP for optimizers
☆84 Updated 9 months ago
dvruette / barrel-rec-pytorch
☆53 Updated last year
evanatyourservice / llm-jax
Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers.
☆19 Updated last month
ethansmith2000 / TransformerExperiments
☆19 Updated 4 months ago
JoeLi12345 / nGPT
an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆105 Updated 6 months ago
ClashLuke / SOAP
☆21 Updated 10 months ago
bloc97 / DeMo
DeMo: Decoupled Momentum Optimization
☆190 Updated 9 months ago
fal-ai-community / nano-mdm
Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun
☆56 Updated 6 months ago
zaydzuhri / softpick-attention
Landing repository for the paper "Softpick: No Attention Sink, No Massive Activations with Rectified Softmax"
☆84 Updated last week
joey00072 / ohara
Collection of autoregressive model implementation
☆86 Updated 4 months ago
kyleliang919 / Super_Muon
☆64 Updated 6 months ago
fal-ai-community / NativeSparseAttention
research impl of Native Sparse Attention (2502.11089)
☆61 Updated 7 months ago
Zyphra / tree_attention
Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters
☆129 Updated 9 months ago
martin-marek / batch-size
📄Small Batch Size Training for Language Models
☆60 Updated 3 weeks ago
epfml / DenseFormer
☆82 Updated last year
okarthikb / state-space-models
☆28 Updated last year
cloneofsimo / min-fsdp
☆88 Updated last year
RobertCsordas / moeut
☆85 Updated last year
RWKV / ZeroCoT
https://x.com/BlinkDL_AI/status/1884768989743882276
☆28 Updated 4 months ago
cloneofsimo / min-max-gpt
Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training
☆132 Updated last year
edwardmilsom / function-space-learning-rates-paper
Code for the paper "Function-Space Learning Rates"
☆23 Updated 3 months ago
apple / ml-ademamix
☆67 Updated 10 months ago
EleutherAI / nanoGPT-mup
The simplest, fastest repository for training/finetuning medium-sized GPTs.
☆160 Updated 2 months ago
BlinkDL / modded-nanogpt-rwkv
RWKV-7: Surpassing GPT
☆95 Updated 10 months ago
dayal-kalra / low-memory-adam
☆12 Updated 6 months ago
euclaise / supertrainer2000
☆49 Updated last year
lucidrains / PEER-pytorch
Pytorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at Deepmind
☆128 Updated last year
OpenEvaByte / evabyte
EvaByte: Efficient Byte-level Language Models at Scale
☆109 Updated 4 months ago
cloneofsimo / zeroshampoo
☆34 Updated last year