drisspg / transformer_nuggets
A place to store reusable transformer components of my own creation or found on the interwebs
☆72 · Feb 6, 2026 · Updated last week
Alternatives and similar repositories for transformer_nuggets
Users who are interested in transformer_nuggets compare it to the libraries listed below.
- Personal solutions to the Triton Puzzles ☆20 · Jul 18, 2024 · Updated last year
- (unofficial) - customized fork of DETR, optimized for intelligent obj detection on 'real world' custom datasets ☆12 · Aug 22, 2020 · Updated 5 years ago
- Mixtral finetuning ☆19 · Feb 2, 2024 · Updated 2 years ago
- An unofficial jax/haiku implementation of Crystal Graph Convolutional Neural Networks (CGCNN) ☆10 · Dec 17, 2022 · Updated 3 years ago
- ☆177 · Feb 3, 2024 · Updated 2 years ago
- This repository contains the experimental PyTorch native float8 training UX ☆226 · Aug 1, 2024 · Updated last year
- LiteGPT: A 124M Small Language Model (SLM) pre-trained on FineWeb and fine-tuned on Alpaca. ☆34 · Dec 16, 2025 · Updated last month
- ☆28 · Jan 17, 2025 · Updated last year
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines ☆196 · May 6, 2024 · Updated last year
- [CVPR 2025] DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for Text-rich Document Understanding ☆26 · Dec 18, 2025 · Updated last month
- Jupyter Notebooks from implementing machine learning models and trying out various methods ☆14 · Dec 7, 2017 · Updated 8 years ago
- CLI for Recursive Language Models ☆42 · Jan 28, 2026 · Updated 2 weeks ago
- [WIP] Transformer to embed Danbooru labelsets ☆13 · Mar 31, 2024 · Updated last year
- ☆92 · Jul 5, 2024 · Updated last year
- Repository for the QUIK project, enabling the use of 4bit kernels for generative inference - EMNLP 2024 ☆184 · Apr 16, 2024 · Updated last year
- GPTQ inference Triton kernel ☆321 · May 18, 2023 · Updated 2 years ago
- Python binding of primitiv. ☆17 · Sep 12, 2022 · Updated 3 years ago
- Manage ML configuration with pydantic ☆16 · Jan 25, 2026 · Updated 2 weeks ago
- Markov Decision Processes in Python ☆15 · Jan 3, 2019 · Updated 7 years ago
- ☆593 · Aug 23, 2024 · Updated last year
- ☆17 · Jul 28, 2023 · Updated 2 years ago
- Various transformers for FSDP research ☆38 · Nov 11, 2022 · Updated 3 years ago
- Embedding Recycling for Language models ☆38 · Jul 11, 2023 · Updated 2 years ago
- Triton-based Symmetric Memory operators and examples ☆81 · Jan 15, 2026 · Updated 3 weeks ago
- [ICLR 2026 🔥] Dr.LLM: Dynamic Layer Routing in LLMs ☆41 · Oct 15, 2025 · Updated 3 months ago
- Research Paper: "Graph Contrastive Learning as a Versatile Foundation for Advanced scRNA-seq Data Analysis" ☆10 · Nov 20, 2024 · Updated last year
- Use Actions to acquire those precious lambda GPUs ☆19 · Sep 7, 2023 · Updated 2 years ago
- QLoRA with Enhanced Multi GPU Support ☆37 · Aug 8, 2023 · Updated 2 years ago
- Implementation of MixCE method described in ACL 2023 paper by Zhang et al. ☆20 · May 29, 2023 · Updated 2 years ago
- Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes. ☆83 · Sep 10, 2023 · Updated 2 years ago
- Make triton easier ☆50 · Jun 12, 2024 · Updated last year
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆595 · Aug 12, 2025 · Updated 6 months ago
- ☆45 · Oct 13, 2023 · Updated 2 years ago
- What would you do with 1000 H100s... ☆1,151 · Jan 10, 2024 · Updated 2 years ago
- ☆20 · Jul 12, 2023 · Updated 2 years ago
- Annotated version of the Mamba paper ☆496 · Feb 27, 2024 · Updated last year
- ImageNet-12k subset of ImageNet-21k (fall11) ☆21 · Jun 13, 2023 · Updated 2 years ago
- ☆22 · Apr 22, 2024 · Updated last year
- PyTorch implementation of NMT models along with custom tokenizers, models, and datasets ☆21 · Aug 1, 2022 · Updated 3 years ago