FireFlyer Record file format, writer and reader for DL training samples.
☆246Dec 1, 2022Updated 3 years ago
Alternatives and similar repositories for ffrecord
Users that are interested in ffrecord are comparing it to the libraries listed below.
- HFAI deep learning models☆163May 25, 2023Updated 2 years ago
- A high-performance distributed file system designed to address the challenges of AI training and inference workloads.☆9,847Mar 30, 2026Updated last month
- Official repository for "IPDPS' 24 QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices".☆20Feb 23, 2024Updated 2 years ago
- A lightweight data processing framework built on DuckDB and 3FS.☆4,951Mar 5, 2025Updated last year
- Dynamic Tensor Rematerialization prototype (modified PyTorch) and simulator. Paper: https://arxiv.org/abs/2006.09616☆133Jul 6, 2023Updated 2 years ago
- Benchmarking Attention Mechanism in Vision Transformers.☆20Oct 10, 2022Updated 3 years ago
- Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.☆5,242Updated this week
- ☆28Jan 20, 2026Updated 3 months ago
- CVFusion is an open-source deep learning compiler to fuse the OpenCV operators.☆33Aug 31, 2022Updated 3 years ago
- MLPerf® Storage Benchmark Suite☆179Apr 28, 2026Updated last week
- DeepEP: an efficient expert-parallel communication library☆9,589Updated this week
- ☆15Jan 21, 2023Updated 3 years ago
- An external memory allocator example for PyTorch.☆16Aug 10, 2025Updated 8 months ago
- m3fs (Make 3FS) is a toolset designed to deploy a 3FS cluster.☆59Jan 16, 2026Updated 3 months ago
- Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs☆1,009Mar 3, 2026Updated 2 months ago
- [ICLR 2020] Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma, "I Am Going MAD: Maximum Discrepancy Competition for Comparing Classifie…☆20Dec 30, 2021Updated 4 years ago
- PyTorch Sphinx Theme☆13Apr 25, 2023Updated 3 years ago
- ☆22Apr 22, 2025Updated last year
- Deploy ChatGLM on Modelz☆16Mar 20, 2023Updated 3 years ago
- Some commonly used functions and modules☆10Jan 15, 2024Updated 2 years ago
- RFUSE: Modernizing Userspace Filesystem Framework through Scalable Kernel-Userspace Communication☆77May 8, 2025Updated 11 months ago
- Construction Grammar based BERT☆14Dec 5, 2020Updated 5 years ago
- GHive: Accelerating Analytical Query Processing in Apache Hive via CPU-GPU Heterogeneous Computing.☆14Nov 8, 2023Updated 2 years ago
- ☆36Oct 21, 2022Updated 3 years ago
- A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.☆2,951Jan 14, 2026Updated 3 months ago
- Some microbenchmarks and design docs before commencement☆11Feb 1, 2021Updated 5 years ago
- TiledLower is a Dataflow Analysis and Codegen Framework written in Rust.☆13Nov 23, 2024Updated last year
- FlashMLA: Efficient Multi-head Latent Attention Kernels☆12,614Updated this week
- SJTU HPC open-source project: Spackenv (Spack ENVironment) switches environments between sysadmins, users and developers.☆22Jan 4, 2022Updated 4 years ago
- TVMScript kernel for deformable attention☆25Dec 15, 2021Updated 4 years ago
- This repo consists of some experimental results on the bdd100k dataset using different object detection algorithms (Faster-RCNN, FCOS, ATSS)☆11Jun 27, 2020Updated 5 years ago
- Depict GPU memory footprint during DNN training of PyTorch☆11Nov 17, 2022Updated 3 years ago
- A comprehensive overview of Data Distillation and Condensation (DDC). DDC is a data-centric task where a representative (i.e., small but …☆13Dec 1, 2022Updated 3 years ago
- Ongoing research training transformer models at scale☆16,203Updated this week
- GPU-scheduler-for-deep-learning☆209Nov 5, 2020Updated 5 years ago
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling☆7,144Apr 24, 2026Updated last week
- Custom Scheduler to deploy ML models to TRTIS for GPU Sharing☆12Apr 1, 2020Updated 6 years ago
- Collections of self-supervised methods, based on cvpods.☆58Aug 21, 2021Updated 4 years ago
- Datasets, Transforms and Models specific to Computer Vision☆91Nov 17, 2023Updated 2 years ago