facebookresearch / FAMBench
Benchmarks to capture important workloads.
☆28 Updated 5 months ago
Related projects
Alternatives and complementary repositories for FAMBench
- ☆55 Updated 5 months ago
- MLPerf™ logging library ☆30 Updated last week
- An IR for efficiently simulating distributed ML computation. ☆25 Updated 10 months ago
- ☆48 Updated 8 months ago
- This repository contains the results and code for the MLPerf™ Training v1.0 benchmark. ☆37 Updated 8 months ago
- oneCCL Bindings for Pytorch* ☆86 Updated 3 weeks ago
- ☆12 Updated last month
- Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large … ☆63 Updated 2 years ago
- Issues related to MLPerf™ Inference policies, including rules and suggested changes ☆57 Updated 2 weeks ago
- OpenAI Triton backend for Intel® GPUs ☆143 Updated this week
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note… ☆57 Updated 2 months ago
- extensible collectives library in triton ☆71 Updated last month
- Benchmark code for the "Online normalizer calculation for softmax" paper ☆59 Updated 6 years ago
- Python bindings for NVTX ☆66 Updated last year
- RCCL Performance Benchmark Tests ☆50 Updated 3 weeks ago
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆270 Updated this week
- A tracing JIT for PyTorch ☆17 Updated 2 years ago
- ☆29 Updated this week
- Home for OctoML PyTorch Profiler ☆107 Updated last year
- An experimental CPU backend for Triton (https://github.com/openai/triton) ☆35 Updated 6 months ago
- ☆67 Updated last year
- FTPipe and related pipeline model parallelism research. ☆41 Updated last year
- Research and development for optimizing transformers ☆125 Updated 3 years ago
- benchmarking some transformer deployments ☆26 Updated last year
- PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for… ☆124 Updated this week
- ROCm Tracer Callback/Activity Library for Performance tracing AMD GPUs ☆75 Updated last week
- High-speed GEMV kernels, up to 2.7x speedup compared to the PyTorch baseline. ☆90 Updated 4 months ago
- Memory Optimizations for Deep Learning (ICML 2023) ☆60 Updated 8 months ago
- oneAPI Collective Communications Library (oneCCL) ☆206 Updated this week
- AMD's graph optimization engine. ☆186 Updated this week