mlcommons / training_results_v4.0

This repository contains the results and code for the MLPerf™ Training v4.0 benchmark.

☆12

Alternatives and similar repositories for training_results_v4.0:

Users that are interested in training_results_v4.0 are comparing it to the libraries listed below

HabanaAI / Model-References
Reference models for Intel(R) Gaudi(R) AI Accelerator
☆158Updated last week
mlcommons / training_results_v3.1
This repository contains the results and code for the MLPerf™ Training v3.1 benchmark.
☆17Updated this week
NVIDIA / mlperf-common
NVIDIA's launch, startup, and logging scripts used by our MLPerf Training and HPC submissions
☆24Updated last week
IBM / dolomite-engine
Dolomite Engine is a library for pretraining/finetuning LLMs
☆27Updated this week
facebookresearch / FAMBench
Benchmarks to capture important workloads.
☆29Updated this week
intel / torch-ccl
oneCCL Bindings for Pytorch*
☆87Updated 2 weeks ago
HabanaAI / vllm-fork
A high-throughput and memory-efficient inference and serving engine for LLMs
☆47Updated this week
mlcommons / training_results_v2.1
This repository contains the results and code for the MLPerf™ Training v2.1 benchmark.
☆15Updated last year
NVIDIA / nephele
Tools to deploy GPU clusters in the Cloud
☆30Updated last year
mlcommons / logging
MLPerf™ logging library
☆32Updated last week
NVIDIA / LDDL
Distributed preprocessing and data loading for language datasets
☆39Updated 9 months ago
graphcore-research / out-of-the-box-fp8-training
Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8.
☆43Updated 6 months ago
aws-neuron / aws-neuron-reference-for-megatron-lm
☆14Updated last year
hpcaitech / TensorNVMe
A Python library transfers PyTorch tensors between CPU and NVMe
☆102Updated last month
HabanaAI / Megatron-DeepSpeed
Intel Gaudi's Megatron DeepSpeed Large Language Models for training
☆13Updated last month
microsoft / DeepSpeed-Kernels
☆57Updated 7 months ago
intel / intel-extension-for-deepspeed
Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note…
☆58Updated last month
NVIDIA / cloudai
CloudAI Benchmark Framework
☆47Updated this week
mlcommons / hpc
Reference implementations of MLPerf™ HPC training benchmarks
☆44Updated 7 months ago
ray-project / distml
Distributed ML Optimizer
☆30Updated 3 years ago
cybertronai / bflm
☆16Updated 5 years ago
facebookresearch / fairring
Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large …
☆63Updated 2 years ago
huggingface / optimum-habana
Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)
☆165Updated this week
zasdfgbnm / autonvtx
Automatically insert nvtx ranges to PyTorch models
☆17Updated 3 years ago
anyscale / llm-continuous-batching-benchmarks
☆114Updated 10 months ago
axonn-ai / axonn
A parallel framework for training deep neural networks
☆49Updated this week
ROCm / pytorch-micro-benchmarking
☆18Updated last month
mlcommons / training_results_v1.0
This repository contains the results and code for the MLPerf™ Training v1.0 benchmark.
☆37Updated 10 months ago
NVIDIA / apt-packaging-fabric-manager
Fabric Manager packaging for Debian
☆14Updated 3 years ago
huggingface / tgi-gaudi
Large Language Model Text Generation Inference on Habana Gaudi
☆29Updated this week