iwiwi / epochraft-hf-fsdp
Example of using Epochraft to train HuggingFace transformers models with PyTorch FSDP
☆11Jan 29, 2024Updated 2 years ago
Alternatives and similar repositories for epochraft-hf-fsdp
Users who are interested in epochraft-hf-fsdp are comparing it to the libraries listed below
- Support Continual pre-training & Instruction Tuning forked from llama-recipes☆34Feb 17, 2024Updated 2 years ago
- Checkpointable dataset utilities for foundation model training☆32Jan 29, 2024Updated 2 years ago
- Ongoing research training Mixture of Expert models.☆21Sep 16, 2024Updated last year
- ☆20Aug 28, 2024Updated last year
- Code for the paper: https://arxiv.org/pdf/2309.06979.pdf☆21Jul 29, 2024Updated last year
- ☆53May 20, 2024Updated last year
- ☆22Sep 18, 2023Updated 2 years ago
- The tool facilitates debugging convergence issues and testing new algorithms and recipes for training LLMs using Nvidia libraries such as…☆18Sep 17, 2025Updated 5 months ago
- The robust text processing pipeline framework enabling customizable, efficient, and metric-logged text preprocessing.☆125Nov 13, 2025Updated 3 months ago
- ☆35Feb 26, 2024Updated last year
- Advanced block device testing/file system testing, targeting SNIA-compatible reporting☆12Oct 15, 2025Updated 4 months ago
- Mamba training library developed by kotoba technologies☆71Feb 11, 2024Updated 2 years ago
- Python wrappers for the FirecREST API☆12Dec 23, 2025Updated last month
- Lustre Repository with MS patches☆13Updated this week
- ☆48Jan 5, 2026Updated last month
- A framework for few-shot evaluation of autoregressive language models.☆154Sep 13, 2024Updated last year
- Information about Wantedly internships and new-graduate hiring☆11Apr 5, 2022Updated 3 years ago
- ☆25Jan 18, 2026Updated 3 weeks ago
- Lustre HSM tools☆10Feb 19, 2024Updated last year
- ☆12Jul 7, 2022Updated 3 years ago
- This repository holds all material related to the Ory Summit, specifically the presentations.☆12Oct 22, 2025Updated 3 months ago
- Statistical discontinuous constituent parsing☆11Feb 15, 2018Updated 8 years ago
- Auto detection of apt proxies in the LAN, caching and checking status☆10Feb 13, 2025Updated last year
- Cloyster HPC is a turnkey HPC cluster solution with a user-friendly installer☆10Oct 2, 2025Updated 4 months ago
- extended benchmarking automation tool for HPC applications☆16Updated this week
- Crawl & Visualize NeurIPS 2022 Data from OpenReview☆14Nov 8, 2022Updated 3 years ago
- ☆13Mar 3, 2025Updated 11 months ago
- [ICLR'25] "Understanding Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing" by Peihao Wang, Ruisi Cai, Yue…☆17Mar 21, 2025Updated 10 months ago
- Collection of my personal helper scripts.☆14Dec 23, 2025Updated last month
- Tool to profile usage of HPC resources by regularly probing processes.☆11Updated this week
- Assignments of CSCE-642: Deep Reinforcement Learning offered at Texas A&M University.☆10Aug 31, 2025Updated 5 months ago
- Large language models to diffusion finetuning code☆23Jun 2, 2025Updated 8 months ago
- Advanced Formal Language Theory (263-5352-00L; Spring 2023)☆10Feb 21, 2023Updated 2 years ago
- Ongoing Research Project for continual pre-training of LLMs (dense mode)☆44Mar 3, 2025Updated 11 months ago
- Collection of Singularity build files and scripts to create them for popular Linux Distributions☆10Jun 23, 2022Updated 3 years ago
- Telegram bot which knows IPv6 excuses.☆11Mar 24, 2018Updated 7 years ago
- Script for doing Slurm Calculations☆12Mar 21, 2025Updated 10 months ago
- CLI tools for Slurm clusters☆13Dec 19, 2025Updated last month
- Source code for "N-ary Constituent Tree Parsing with Recursive Semi-Markov Model" published at ACL 2021☆10May 27, 2021Updated 4 years ago