intel / e2eAIOK
Intel® End-to-End AI Optimization Kit
☆31 · Updated last year
Alternatives and similar repositories for e2eAIOK
Users interested in e2eAIOK are comparing it to the libraries listed below.
- Benchmarks to capture important workloads. ☆32 · Updated 2 weeks ago
- Issues related to MLPerf® Inference policies, including rules and suggested changes. ☆63 · Updated this week
- oneCCL bindings for PyTorch* (deprecated). ☆104 · Updated last month
- ☆71 · Updated 10 months ago
- Deadline-based hyperparameter tuning on RayTune. ☆32 · Updated 6 years ago
- This repository contains the results and code for the MLPerf™ Training v1.0 benchmark. ☆36 · Updated last year
- Simple distributed deep learning on TensorFlow. ☆134 · Updated 7 months ago
- A Python library that transfers PyTorch tensors between CPU and NVMe. ☆125 · Updated last year
- PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models (ICML 2021). ☆55 · Updated 4 years ago
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on the Intel GPU (XPU) device. Note… ☆64 · Updated 7 months ago
- Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large … ☆65 · Updated 3 years ago
- Computation using data flow graphs for scalable machine learning. ☆68 · Updated this week
- Home for the OctoML PyTorch Profiler. ☆113 · Updated 2 years ago
- A memory-efficient DLRM training solution using ColossalAI. ☆105 · Updated 3 years ago
- Summary of system papers/frameworks/codes/tools on training or serving large models. ☆57 · Updated 2 years ago
- Intel Gaudi's Megatron DeepSpeed large language models for training. ☆18 · Updated last year
- Reference models for the Intel® Gaudi® AI Accelerator. ☆170 · Updated 3 weeks ago
- Pretrain, finetune, and serve LLMs on Intel platforms with Ray. ☆131 · Updated 4 months ago
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆164 · Updated 3 weeks ago
- Fast and memory-efficient exact attention. ☆111 · Updated last week
- MLPerf™ logging library. ☆38 · Updated last month
- Decoding Attention is specially optimized for MHA, MQA, GQA, and MLA using CUDA cores for the decoding stage of LLM inference. ☆46 · Updated 7 months ago
- Runtime tracing library for TensorFlow. ☆43 · Updated 7 years ago
- Standalone Flash Attention v2 kernel without the libtorch dependency. ☆113 · Updated last year
- Distributed preprocessing and data loading for language datasets. ☆40 · Updated last year
- ☆79 · Updated last year
- Inference framework for MoE layers based on TensorRT with Python bindings. ☆41 · Updated 4 years ago
- ☆125 · Updated last year
- Distributed AI/HPC monitoring framework. ☆29 · Updated 9 months ago
- LLM-Inference-Bench. ☆58 · Updated 6 months ago