lucadiliello / pytorch-apple-silicon-benchmarks
Performance of PyTorch on Apple Silicon
☆49 Updated last year
Alternatives and similar repositories for pytorch-apple-silicon-benchmarks:
Users who are interested in pytorch-apple-silicon-benchmarks are comparing it to the libraries listed below.
- PyTorch's full-scratch build and install for Apple Silicon ☆29 Updated last year
- 3X speedup over Apple’s TensorFlow plugin by using Apache TVM on M1 ☆136 Updated 3 years ago
- ☆15 Updated 3 years ago
- Benchmark of Apple MLX operations on all Apple Silicon chips (GPU, CPU) + MPS and CUDA. ☆174 Updated 2 weeks ago
- Nod.ai 🦈 version of 👻. You probably want to start at https://github.com/nod-ai/shark for the product and the upstream IREE repository … ☆106 Updated 3 months ago
- ☆56 Updated 2 years ago
- The correct way to resize images or tensors. For Numpy or Pytorch (differentiable). ☆16 Updated 2 years ago
- Model compression for ONNX ☆91 Updated 5 months ago
- ☆50 Updated 3 years ago
- A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend. ☆109 Updated this week
- Implementation of Flash Attention in Jax ☆206 Updated last year
- MLX support for the Open Neural Network Exchange (ONNX) ☆48 Updated last year
- No-GIL Python environment featuring NVIDIA Deep Learning libraries. ☆57 Updated last week
- FlashAttention (Metal Port) ☆479 Updated 7 months ago
- Graph Neural Network library made for Apple Silicon ☆188 Updated 6 months ago
- A place to store reusable transformer components of my own creation or found on the interwebs ☆49 Updated last week
- ☆80 Updated last year
- New operators for the ReferenceEvaluator, new kernels for onnxruntime, CPU, CUDA ☆32 Updated last month
- benchmarking some transformer deployments ☆26 Updated 2 years ago
- Tensors and Dynamic neural networks in Python with strong GPU acceleration ☆17 Updated this week
- A dashboard for exploring timm learning rate schedulers ☆19 Updated 5 months ago
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ☆45 Updated 9 months ago
- AdamW optimizer for bfloat16 models in pytorch 🔥. ☆32 Updated 10 months ago
- CUDA implementation of autoregressive linear attention, with all the latest research findings ☆44 Updated last year
- Efficient framework-agnostic data loading ☆419 Updated 3 weeks ago
- Memory Efficient Attention (O(sqrt(n))) for Jax and PyTorch ☆183 Updated 2 years ago
- Tutorial on how to convert machine learned models into ONNX ☆16 Updated 2 years ago
- In-depth code associated with my Medium blog post, "How to Load PyTorch Models 340 Times Faster with Ray" ☆26 Updated 2 years ago
- Study and Implementations of Numerical Algorithms on Apple M1 and A* Devices ☆139 Updated 2 years ago
- C API for MLX ☆106 Updated this week