HFAiLab / ffrecord
FireFlyer Record file format, writer and reader for DL training samples.
☆121Updated 2 years ago
Alternatives and similar repositories for ffrecord:
Users who are interested in ffrecord are comparing it to the libraries listed below.
- HFAI deep learning models☆99Updated last year
- Dynamic Tensor Rematerialization prototype (modified PyTorch) and simulator. Paper: https://arxiv.org/abs/2006.09616☆132Updated last year
- ☆211Updated last year
- Zero Bubble Pipeline Parallelism☆309Updated 2 months ago
- A high-performance distributed deep learning system targeting large-scale and automated distributed training. If you have any interests, …☆106Updated last year
- Tutorials for writing high-performance GPU operators in AI frameworks.☆126Updated last year
- Megvii FILE Library - Work with files in Python the same way as with the standard library☆134Updated this week
- ☆76Updated last year
- The test of different distributed-training methods on High-Flyer AIHPC☆22Updated 2 years ago
- Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.☆266Updated last year
- Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large …☆63Updated 2 years ago
- ☆35Updated last month
- OneFlow models for benchmarking.☆105Updated 5 months ago
- LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training☆397Updated 2 months ago
- Compare different hardware platforms via the Roofline Model for LLM inference tasks.☆88Updated 10 months ago
- A high-performance deep learning training platform with task-level time-sharing scheduling of GPU compute☆365Updated last year
- Automated Parallelization System and Infrastructure for Multiple Ecosystems☆76Updated 2 months ago
- ATC23 AE☆44Updated last year
- Odysseus: Playground of LLM Sequence Parallelism☆64Updated 7 months ago
- A collection of memory efficient attention operators implemented in the Triton language.☆230Updated 7 months ago
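- PyTorch model training code for single precision, half precision, mixed precision, single GPU, multi-GPU (DP / DDP), FSDP, and DeepSpeed, comparing the training speed and GPU memory usage of the different methods☆87Updated 10 months ago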
- Datasets, Transforms and Models specific to Computer Vision☆84Updated last year
- A Python library that transfers PyTorch tensors between CPU and NVMe☆102Updated last month
- Official implementation of ICML 2024 paper "ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking".☆44Updated 6 months ago
- pytorch-profiler☆50Updated last year
- The pure and clear PyTorch Distributed Training Framework.☆275Updated 11 months ago
- PyTorch bindings for CUTLASS grouped GEMM.☆84Updated 2 weeks ago
- [IJCAI2023] An automated parallel training system that combines the advantages from both data and model parallelism. If you have any inte…☆51Updated last year
- ☆72Updated 5 months ago
- FlagScale is a large model toolkit based on open-sourced projects.☆208Updated this week