hdmquan / torch_activation
Torch-activation, a collection of activation functions for the PyTorch library
☆24 · Updated this week
Alternatives and similar repositories for torch_activation:
Users who are interested in torch_activation are comparing it to the libraries listed below.
- ☆49 · Updated last year
- ☆126 · Updated 7 months ago
- [WIP] Transformer to embed Danbooru labelsets ☆13 · Updated last year
- Lego for GRPO ☆25 · Updated last week
- QLoRA for Masked Language Modeling ☆21 · Updated last year
- ☆19 · Updated 7 months ago
- Alice in Wonderland code base for experiments and raw experiments data ☆128 · Updated last month
- entropix style sampling + GUI ☆25 · Updated 5 months ago
- ☆38 · Updated last month
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆39 · Updated last month
- EvaByte: Efficient Byte-level Language Models at Scale ☆85 · Updated last week
- ☆48 · Updated 4 months ago
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆87 · Updated 3 months ago
- Repo hosting codes and materials related to speeding LLMs' inference using token merging. ☆35 · Updated 11 months ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite. ☆33 · Updated last year
- Tokun to can tokens ☆16 · Updated last month
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI … ☆49 · Updated last month
- Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models) ☆32 · Updated last month
- QLoRA with Enhanced Multi GPU Support ☆36 · Updated last year
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" ☆98 · Updated 3 months ago
- 🤝 Trade any tensors over the network ☆30 · Updated last year
- ☆52 · Updated 7 months ago
- Set of scripts to finetune LLMs ☆37 · Updated last year
- ☆27 · Updated 8 months ago
- JAX Scalify: end-to-end scaled arithmetics ☆16 · Updated 5 months ago
- ☆27 · Updated 4 months ago
- An introduction to LLM Sampling ☆77 · Updated 3 months ago
- Code for ExploreTom ☆79 · Updated 3 months ago
- Collection of autoregressive model implementation ☆83 · Updated last month
- ☆43 · Updated last year