hdmquan / torch_activation
Torch-activation, a library of activation functions for PyTorch
☆26Updated 3 months ago
Alternatives and similar repositories for torch_activation
Users who are interested in torch_activation are comparing it to the libraries listed below.
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"☆101Updated 7 months ago
- Alice in Wonderland code base for experiments and raw experiments data☆131Updated last week
- ☆134Updated 11 months ago
- RWKV-7: Surpassing GPT☆94Updated 8 months ago
- ☆38Updated last year
- ☆81Updated last year
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens☆146Updated 5 months ago
- Exploration into the proposed architecture from Sapient Intelligence of Singapore 🇸🇬☆47Updated last week
- ☆57Updated last month
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers.☆19Updated 2 weeks ago
- ☆64Updated last month
- An implementation of PSGD Kron second-order optimizer for PyTorch☆94Updated 2 weeks ago
- ☆48Updated 6 months ago
- EvaByte: Efficient Byte-level Language Models at Scale☆104Updated 3 months ago
- ☆49Updated last year
- Implementation of Mamba in Rust☆88Updated last year
- σ-GPT: A New Approach to Autoregressive Models☆67Updated 11 months ago
- Train an adapter for any embedding model in under a minute☆110Updated 4 months ago
- ☆100Updated 2 weeks ago
- Lego for GRPO☆28Updated 2 months ago
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs.☆41Updated last year
- A byte-level decoder architecture that matches the performance of tokenized Transformers.☆65Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite.☆34Updated last year
- ☆53Updated 9 months ago
- An introduction to LLM Sampling☆79Updated 7 months ago
- https://x.com/BlinkDL_AI/status/1884768989743882276☆28Updated 3 months ago
- Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models)☆33Updated 5 months ago
- QLoRA with Enhanced Multi GPU Support☆37Updated 2 years ago
- Simple GRPO scripts and configurations.☆59Updated 6 months ago
- ☆56Updated 3 months ago