main-horse / hnet-old
H-Net Dynamic Hierarchical Architecture
☆81Sep 11, 2025Updated 5 months ago
Alternatives and similar repositories for hnet-old
Users that are interested in hnet-old are comparing it to the libraries listed below
- ☆19Dec 4, 2025Updated 2 months ago
- ☆67Mar 21, 2025Updated 10 months ago
- Code for the paper "Function-Space Learning Rates"☆25Jun 3, 2025Updated 8 months ago
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models"☆10Jul 19, 2024Updated last year
- Minimal Transformer base in JAX. A single backbone for language modelling, diffusion, classification, etc...☆14May 28, 2025Updated 8 months ago
- smolLM with Entropix sampler on PyTorch☆149Oct 31, 2024Updated last year
- Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models)☆34Feb 27, 2025Updated 11 months ago
- An efficient implementation of the NSA (Native Sparse Attention) kernel☆129Jun 24, 2025Updated 7 months ago
- PathPiece tokenizer☆13Nov 10, 2024Updated last year
- Training Models Daily☆16Dec 19, 2023Updated 2 years ago
- MLX binary vectors and associated algorithms.☆14Mar 13, 2025Updated 11 months ago
- look how they massacred my boy☆63Oct 16, 2024Updated last year
- Schedule free optimiser implemented in JAX using Optimistix☆15May 29, 2024Updated last year
- Minimal implementation of VCRec (2024) for collapse prevention.☆18Jan 28, 2025Updated last year
- Official code for the paper "Attention as a Hypernetwork"☆47Jun 22, 2024Updated last year
- High quality implementations of imitation and inverse reinforcement learning algorithms☆21Aug 19, 2025Updated 5 months ago
- ROSA+: RWKV's ROSA implementation with fallback statistical predictor☆32Oct 13, 2025Updated 4 months ago
- Fork of Flame repo for training of some new stuff in development☆19Jan 5, 2026Updated last month
- ☆24Dec 11, 2024Updated last year
- Custom Triton kernels for training Karpathy's nanoGPT.☆19Oct 21, 2024Updated last year
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"☆89Oct 30, 2024Updated last year
- MB-X.01 · Logical Origin Node (L.O.N.) — TruthΩ → Co⁺ → Score⁺. Verifiable demo and spec. https://massimiliano.neocities.org/☆54Feb 3, 2026Updated last week
- DeMo: Decoupled Momentum Optimization☆198Dec 2, 2024Updated last year
- LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence☆61Feb 21, 2022Updated 3 years ago
- Next-gen Foundation Model for Embodied AI☆25Nov 21, 2025Updated 2 months ago
- ☆131May 29, 2025Updated 8 months ago
- Where we keep our notes about model training runs.☆16Mar 12, 2023Updated 2 years ago
- Mapping out the "memory" of neural nets with data attribution☆39Updated this week
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)☆109Mar 7, 2025Updated 11 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs.☆187Jan 19, 2026Updated 3 weeks ago
- ☆53May 20, 2024Updated last year
- ☆121Feb 4, 2026Updated last week
- Stick-breaking attention☆62Jul 1, 2025Updated 7 months ago
- ☆63Oct 3, 2024Updated last year
- WIP☆93Aug 13, 2024Updated last year
- [EMNLP 2023] Context Compression for Auto-regressive Transformers with Sentinel Tokens☆25Nov 6, 2023Updated 2 years ago
- ☆32Dec 2, 2024Updated last year
- Cost aware hyperparameter tuning algorithm☆180Jun 27, 2024Updated last year
- Tools to simplify life with AI☆29Apr 4, 2025Updated 10 months ago