aime-team / pytorch-benchmarks

A benchmark framework for PyTorch

☆31

Alternatives and similar repositories for pytorch-benchmarks

Users interested in pytorch-benchmarks are comparing it to the libraries listed below.

groq / mlagility
Machine Learning Agility (MLAgility) benchmark and benchmarking tools
☆40 Updated 4 months ago
HabanaAI / Megatron-DeepSpeed
Intel Gaudi's Megatron DeepSpeed Large Language Models for training
☆15 Updated last year
axonn-ai / axonn
Parallel framework for training and fine-tuning deep neural networks
☆70 Updated last month
EmbeddedLLM / vllm
vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs
☆94 Updated this week
mlcommons / logging
MLPerf™ logging library
☆37 Updated this week
deepspeedai / DeepSpeed-Kernels
☆71 Updated 8 months ago
Libraries-Openly-Fused / FusedKernelLibrary
Implementation of a methodology that allows all sorts of user defined GPU kernel fusion, for non CUDA programmers.
☆32 Updated last week
facebookresearch / any4
Quantize transformers to any learned arbitrary 4-bit numeric format
☆50 Updated 5 months ago
determined-ai / determined-examples
Example ML projects that use the Determined library.
☆32 Updated last year
FrancescoSaverioZuppichini / pytorch-2.0-benchmark
Benchmarking PyTorch 2.0 different models
☆20 Updated 2 years ago
graphcore-research / unit-scaling-demo
Unit Scaling demo and experimentation code
☆16 Updated last year
sambanova / tutorials
☆13 Updated last year
at-aaims / forge
☆15 Updated 7 months ago
Michaelvll / llm-ie-benchmarks
A collection of reproducible inference engine benchmarks
☆38 Updated 7 months ago
gevtushenko / llm.c
LLM training in simple, raw C/CUDA
☆108 Updated last year
reger-men / HPL_GPU
High-Performance Linpack Benchmark adopted version for GPU backend
☆12 Updated 3 years ago
AMDResearch / hpcfund
AMD HPC Research Fund Cloud
☆17 Updated 3 weeks ago
fw-ai / llama-cuda-graph-example
Example of applying CUDA graphs to LLaMA-v2
☆12 Updated 2 years ago
mlcommons / training_results_v2.0
This repository contains the results and code for the MLPerf™ Training v2.0 benchmark.
☆29 Updated last year
at-aaims / OpenMxP
This is the open source version of HPL-MXP. The code performance has been verified on Frontier
☆18 Updated 5 months ago
foundation-model-stack / fms-acceleration
🚀 Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models.
☆13 Updated last week
foundation-model-stack / bamba
Train, tune, and infer Bamba model
☆137 Updated 6 months ago
mag- / gpu_benchmark
Gpu benchmark
☆73 Updated 10 months ago
UmerHA / triton_util
Make triton easier
☆49 Updated last year
NVIDIA / mlperf-common
NVIDIA's launch, startup, and logging scripts used by our MLPerf Training and HPC submissions
☆35 Updated 3 months ago
intel / llm-on-ray
Pretrain, finetune and serve LLMs on Intel platforms with Ray
☆131 Updated 2 months ago
meta-pytorch / superblock
A block oriented training approach for inference time optimization.
☆34 Updated last year
ROCm / pytorch-micro-benchmarking
☆22 Updated last month
rasbt / pytorch-memory-optim
This code repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po…
☆92 Updated 2 years ago
7shoe / AdaParse
Adaptive Parallel PDF Parsing and Resource Scaling Engine
☆62 Updated last month