abacaj / train-with-fsdp
☆94 · Oct 5, 2023 · Updated 2 years ago
Alternatives and similar repositories for train-with-fsdp
Users that are interested in train-with-fsdp are comparing it to the libraries listed below
- [WIP] Transformer to embed Danbooru labelsets ☆13 · Mar 31, 2024 · Updated last year
- Fine-tune mistral-7B on 3090s, a100s, h100s ☆725 · Oct 11, 2023 · Updated 2 years ago
- Utilities for Training Very Large Models ☆58 · Sep 25, 2024 · Updated last year
- batched loras ☆349 · Sep 6, 2023 · Updated 2 years ago
- Blocker Hacks ☆14 · Apr 8, 2022 · Updated 3 years ago
- Code for generating colinraffel.com and my CV ☆16 · Jan 28, 2026 · Updated 2 weeks ago
- Re-implementation of local descriptor HardNet training in fasta2+kornia ☆21 · Apr 6, 2020 · Updated 5 years ago
- ☆23 · Jul 10, 2023 · Updated 2 years ago
- Utilities for PyTorch distributed ☆25 · Feb 27, 2025 · Updated 11 months ago
- Object recognition with Pepper using a deep learning model ☆10 · Sep 16, 2021 · Updated 4 years ago
- Full finetuning of large language models without large memory requirements ☆94 · Sep 22, 2025 · Updated 4 months ago
- Course repository for the Spring 2023 COMP664 course "Deep Learning" at UNC ☆14 · Apr 17, 2023 · Updated 2 years ago
- A chat implementation for FastHTML ☆11 · Sep 14, 2025 · Updated 5 months ago
- See https://github.com/cuda-mode/triton-index/ instead! ☆11 · May 8, 2024 · Updated last year
- ☆10 · Apr 21, 2024 · Updated last year
- ☆48 · Aug 29, 2024 · Updated last year
- ☆74 · Sep 5, 2023 · Updated 2 years ago
- 📃 A curated list of all possible resources (tools, tutorials, platforms, etc) an andrew email can get you ☆13 · Nov 15, 2024 · Updated last year
- PharML is a framework for predicting compound affinity for protein structures. It utilizes a novel Molecular-Highway Graph Neural Network… ☆13 · May 8, 2020 · Updated 5 years ago
- ☆14 · Oct 18, 2023 · Updated 2 years ago
- This repo lets you run mistral-7b in Google Colab. ☆16 · Oct 1, 2023 · Updated 2 years ago
- A simple uv workspace ☆19 · Apr 5, 2025 · Updated 10 months ago
- ☆19 · Aug 10, 2024 · Updated last year
- Multipack distributed sampler for fast padding-free training of LLMs ☆204 · Aug 10, 2024 · Updated last year
- ☆24 · Sep 2, 2022 · Updated 3 years ago
- ☆17 · Jul 28, 2023 · Updated 2 years ago
- An introduction to DSPy ☆33 · Aug 30, 2025 · Updated 5 months ago
- Various handy scripts to quickly setup new Linux and Windows sandboxes, containers and WSL. ☆40 · Feb 6, 2026 · Updated last week
- The repository implements a set of algorithms from the book "Methods in Computational Science", written by Johan Hoffman and published by… ☆19 · Jan 10, 2025 · Updated last year
- ☆17 · Apr 7, 2022 · Updated 3 years ago
- ☆16 · Jun 4, 2016 · Updated 9 years ago
- ☆19 · Oct 2, 2023 · Updated 2 years ago
- Experiment of using Tangent to autodiff triton ☆82 · Jan 22, 2024 · Updated 2 years ago
- Exploring Applications of GRPO ☆251 · Aug 25, 2025 · Updated 5 months ago
- Compiling useful links, papers, benchmarks, ideas, etc. ☆46 · Mar 16, 2025 · Updated 10 months ago
- Port of Facebook's LLaMA model in C/C++ ☆21 · Nov 6, 2023 · Updated 2 years ago
- Sparsify transformers with SAEs and transcoders ☆692 · Updated this week
- 🤖 A PyTorch library of curated Transformer models and their composable components ☆894 · Apr 17, 2024 · Updated last year
- ☆54 · Apr 13, 2025 · Updated 10 months ago