LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence
☆61 · Feb 21, 2022 · Updated 4 years ago
Alternatives and similar repositories for SmallInitEmb
Users that are interested in SmallInitEmb are comparing it to the libraries listed below.
- High-performance tokenized language data-loader for Python C++ extension ☆14 · Jul 22, 2024 · Updated last year
- Efficient optimizers ☆310 · Updated this week
- ☆23 · Jul 30, 2025 · Updated 8 months ago
- ☆54 · Jul 16, 2025 · Updated 8 months ago
- ☆22 · Nov 9, 2024 · Updated last year
- High-performance, semantic turn detection for conversational AI ☆36 · Oct 1, 2025 · Updated 6 months ago
- Code for the paper: How Much Context Does My Attention-Based ASR System Need? ☆11 · Mar 8, 2026 · Updated last month
- RWKV, in easy to read code ☆73 · Mar 25, 2025 · Updated last year
- PathPiece tokenizer ☆14 · Nov 10, 2024 · Updated last year
- recipe for training fully-featured self supervised image jepa models ☆12 · Jun 4, 2025 · Updated 10 months ago
- gzip Predicts Data-dependent Scaling Laws ☆35 · May 28, 2024 · Updated last year
- ☆23 · Aug 26, 2023 · Updated 2 years ago
- Implementation of Kronecker Attention in Pytorch ☆19 · Sep 12, 2020 · Updated 5 years ago
- Text-To-Speech for NotebookLM ☆39 · Jul 20, 2025 · Updated 8 months ago
- Implementation of LogAvgExp for Pytorch ☆37 · Apr 10, 2025 · Updated last year
- This repository contains source codes for SoftCTC. Original paper can be found here: https://arxiv.org/abs/2212.02135 ☆19 · Mar 7, 2023 · Updated 3 years ago
- Code for Zero-Shot Tokenizer Transfer ☆144 · Jan 14, 2025 · Updated last year
- Implementation of Hyena Hierarchy in JAX ☆10 · Apr 30, 2023 · Updated 2 years ago
- Speech Resynthesis and Language Modeling ☆27 · Jun 11, 2025 · Updated 10 months ago
- JAX/Flax implementation of the Hyena Hierarchy ☆34 · Apr 27, 2023 · Updated 2 years ago
- Triton Implementation of HyperAttention Algorithm ☆48 · Dec 11, 2023 · Updated 2 years ago
- [ICML 24 NGSM workshop] Associative Recurrent Memory Transformer implementation and scripts for training and evaluation ☆62 · Mar 12, 2026 · Updated 3 weeks ago
- Efficient World Models with Context-Aware Tokenization. ICML 2024 ☆119 · Sep 22, 2024 · Updated last year
- Transformers at any scale ☆42 · Jan 18, 2024 · Updated 2 years ago
- [ICASSP'26] Real-time streaming voice anonymization & voice conversion ☆64 · Mar 16, 2026 · Updated 3 weeks ago
- Course Project for COMP4471 on RWKV ☆17 · Feb 11, 2024 · Updated 2 years ago
- NanoGPT-speedrunning for the poor T4 enjoyers ☆73 · Apr 22, 2025 · Updated 11 months ago
- H-Net Dynamic Hierarchical Architecture ☆81 · Sep 11, 2025 · Updated 6 months ago
- PyTorch implementation of simplified neural source filter model (s-nsf) ☆14 · Aug 4, 2021 · Updated 4 years ago
- utilities for loading and running text embeddings with onnx ☆45 · Aug 16, 2025 · Updated 7 months ago
- More than Just Words: Modeling Non-textual Characteristics of Podcasts ☆26 · Nov 6, 2019 · Updated 6 years ago
- Interface Design for Self-Supervised Speech Models, Accepted to Interspeech2024 ☆16 · Nov 19, 2024 · Updated last year
- ☆92 · Jul 5, 2024 · Updated last year
- Audio-JEPA is an adaptation of the Joint-Embedding Predictive Architecture (JEPA) for self-supervised audio representation learning. Buil… ☆49 · Mar 19, 2026 · Updated 3 weeks ago
- https://x.com/BlinkDL_AI/status/1884768989743882276 ☆28 · May 4, 2025 · Updated 11 months ago
- ☆44 · Sep 19, 2024 · Updated last year
- Memory-efficient transformer. Work in progress. ☆19 · Sep 17, 2022 · Updated 3 years ago
- A simple Kanban board built with HTML, CSS and JavaScript ☆17 · Jun 14, 2023 · Updated 2 years ago
- Training code for Sparse Autoencoders on Embedding models ☆39 · Updated this week