A place to store reusable transformer components of my own creation or found on the interwebs
☆77 · May 22, 2026 · Updated this week
Alternatives and similar repositories for transformer_nuggets
Users that are interested in transformer_nuggets are comparing it to the libraries listed below.
- Mixtral finetuning ☆19 · Feb 2, 2024 · Updated 2 years ago
- Manage ML configuration with pydantic ☆16 · Mar 18, 2026 · Updated 2 months ago
- Full finetuning of large language models without large memory requirements ☆94 · Sep 22, 2025 · Updated 8 months ago
- This repository contains the experimental PyTorch native float8 training UX ☆226 · Aug 1, 2024 · Updated last year
- extensible collectives library in triton ☆98 · Mar 31, 2025 · Updated last year
- Personal solutions to the Triton Puzzles ☆21 · Jul 18, 2024 · Updated last year
- [WIP] Transformer to embed Danbooru labelsets ☆13 · Mar 31, 2024 · Updated 2 years ago
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines ☆197 · May 6, 2024 · Updated 2 years ago
- ☆178 · Feb 3, 2024 · Updated 2 years ago
- ☆93 · Jul 5, 2024 · Updated last year
- ☆17 · Jul 28, 2023 · Updated 2 years ago
- Hugging Face Jobs ☆20 · Jul 11, 2025 · Updated 10 months ago
- ☆29 · Jan 17, 2025 · Updated last year
- Utilities for PyTorch distributed ☆25 · Feb 27, 2025 · Updated last year
- Solidity contracts for the decentralized Prime Network protocol ☆26 · Jul 6, 2025 · Updated 10 months ago
- An unofficial jax/haiku implementation of Crystal Graph Convolutional Neural Networks (CGCNN) ☆10 · Dec 17, 2022 · Updated 3 years ago
- [ICLR 2026] Autoregressive Image Generation with Randomized Parallel Decoding ☆90 · Feb 16, 2026 · Updated 3 months ago
- Embedding Recycling for Language models ☆38 · Jul 11, 2023 · Updated 2 years ago
- GPTQ inference Triton kernel ☆322 · May 18, 2023 · Updated 3 years ago
- Tensors and Dynamic neural networks in Python with strong GPU acceleration ☆26 · Apr 20, 2023 · Updated 3 years ago
- ☆20 · Jul 12, 2023 · Updated 2 years ago
- ☆607 · Aug 23, 2024 · Updated last year
- A Quirky Assortment of CuTe Kernels ☆985 · Updated this week
- Odysseus: Playground of LLM Sequence Parallelism ☆78 · Jun 17, 2024 · Updated last year
- High-speed GEMV kernels, at most 2.7x speedup compared to pytorch baseline. ☆129 · Jul 13, 2024 · Updated last year
- Jupyter Notebook corresponding to 'Going with the Flow: An Introduction to Normalizing Flows' ☆27 · Apr 22, 2021 · Updated 5 years ago
- Repository for the QUIK project, enabling the use of 4bit kernels for generative inference - EMNLP 2024 ☆186 · Apr 16, 2024 · Updated 2 years ago
- Monitor parameter and gradient statistics during neural network training with Chainer ☆13 · Jan 24, 2017 · Updated 9 years ago
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub ☆161 · Sep 26, 2023 · Updated 2 years ago
- Faster Pytorch bitsandbytes 4bit fp4 nn.Linear ops ☆30 · Mar 16, 2024 · Updated 2 years ago
- No more certifi! System trust store at hand. In Pure Python. ☆24 · May 10, 2026 · Updated 2 weeks ago
- Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction" ☆59 · Oct 22, 2023 · Updated 2 years ago
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆53 · Oct 22, 2023 · Updated 2 years ago
- QLoRA with Enhanced Multi GPU Support ☆38 · Aug 8, 2023 · Updated 2 years ago
- Fast AI Practical Deep Learning for Coders experiments in Stable Diffusion ☆24 · Nov 10, 2022 · Updated 3 years ago
- Make triton easier ☆50 · Jun 12, 2024 · Updated last year
- Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes. ☆83 · Sep 10, 2023 · Updated 2 years ago
- This repository contains code for cleaning your training data of benchmark data to help combat data snooping. ☆28 · Apr 21, 2023 · Updated 3 years ago
- Annotated version of the Mamba paper ☆501 · Feb 27, 2024 · Updated 2 years ago