aime-team / pytorch-benchmarks
A benchmark framework for PyTorch
☆31 · Updated 9 months ago
Alternatives and similar repositories for pytorch-benchmarks
Users interested in pytorch-benchmarks are comparing it to the libraries listed below.
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools ☆40 · Updated 4 months ago
- Intel Gaudi's Megatron DeepSpeed Large Language Models for training ☆15 · Updated last year
- Parallel framework for training and fine-tuning deep neural networks ☆70 · Updated last month
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆94 · Updated this week
- MLPerf™ logging library ☆37 · Updated this week
- ☆71 · Updated 8 months ago
- Implementation of a methodology that enables user-defined GPU kernel fusion for non-CUDA programmers ☆32 · Updated last week
- Quantize transformers to any learned arbitrary 4-bit numeric format ☆50 · Updated 5 months ago
- Example ML projects that use the Determined library ☆32 · Updated last year
- Benchmarking different models with PyTorch 2.0 ☆20 · Updated 2 years ago
- Unit Scaling demo and experimentation code ☆16 · Updated last year
- ☆13 · Updated last year
- ☆15 · Updated 7 months ago
- A collection of reproducible inference engine benchmarks ☆38 · Updated 7 months ago
- LLM training in simple, raw C/CUDA ☆108 · Updated last year
- High-Performance Linpack benchmark adapted for GPU backends ☆12 · Updated 3 years ago
- AMD HPC Research Fund Cloud ☆17 · Updated 3 weeks ago
- Example of applying CUDA graphs to LLaMA-v2 ☆12 · Updated 2 years ago
- This repository contains the results and code for the MLPerf™ Training v2.0 benchmark ☆29 · Updated last year
- This is the open-source version of HPL-MXP; its performance has been verified on Frontier ☆18 · Updated 5 months ago
- 🚀 Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models ☆13 · Updated last week
- Train, tune, and run inference with the Bamba model ☆137 · Updated 6 months ago
- GPU benchmark ☆73 · Updated 10 months ago
- Make Triton easier ☆49 · Updated last year
- NVIDIA's launch, startup, and logging scripts used by our MLPerf Training and HPC submissions ☆35 · Updated 3 months ago
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆131 · Updated 2 months ago
- A block-oriented training approach for inference-time optimization ☆34 · Updated last year
- ☆22 · Updated last month
- This repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po… ☆92 · Updated 2 years ago
- Adaptive Parallel PDF Parsing and Resource Scaling Engine ☆62 · Updated last month