jax-ml / ml_dtypes
A stand-alone implementation of several NumPy dtype extensions used in machine learning.
☆306 · Updated 2 weeks ago
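ml_dtypes registers low-precision machine-learning types such as bfloat16 and several float8 variants as NumPy scalar types. A quick sketch of how they are used (assuming ml_dtypes and NumPy are installed):

```python
import numpy as np
import ml_dtypes  # pip install ml_dtypes

# Once imported, bfloat16 behaves like any other NumPy dtype.
x = np.array([1.0, 2.5, -3.75], dtype=ml_dtypes.bfloat16)
print(x.dtype)  # bfloat16

# These values fit exactly in bfloat16's mantissa, so a round
# trip through float32 recovers them unchanged.
y = x.astype(np.float32)
print(y.tolist())  # [1.0, 2.5, -3.75]

# Dtype metadata is exposed through ml_dtypes' finfo.
print(ml_dtypes.finfo(ml_dtypes.bfloat16).bits)  # 16
```

Values that are not exactly representable (e.g. 0.1) are rounded on conversion, which is the usual trade-off for the reduced bit width.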
Alternatives and similar repositories for ml_dtypes
Users interested in ml_dtypes are comparing it to the libraries listed below.
- ☆337 · Updated last week
- A user-friendly toolchain that enables the seamless execution of ONNX models using JAX as the backend. ☆124 · Updated last month
- TorchFix - a linter for PyTorch-using code with autofix support. ☆148 · Updated 2 months ago
- jax-triton contains integrations between JAX and OpenAI Triton. ☆432 · Updated 3 weeks ago
- OpTree: Optimized PyTree Utilities. ☆195 · Updated last week
- JAX-Toolbox. ☆359 · Updated last week
- ☆190 · Updated 2 weeks ago
- A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate. ☆585 · Updated this week
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆161 · Updated last month
- Named Tensors for Legible Deep Learning in JAX. ☆212 · Updated this week
- A library for unit scaling in PyTorch. ☆132 · Updated 4 months ago
- This repository contains the experimental PyTorch-native float8 training UX. ☆223 · Updated last year
- Orbax provides common checkpointing and persistence utilities for JAX users. ☆449 · Updated this week
- torchax is a PyTorch frontend for JAX. It gives JAX users the ability to author JAX programs using familiar PyTorch syntax. It also provides JA… ☆117 · Updated last week
- Implementation of Flash Attention in Jax. ☆220 · Updated last year
- ☆181 · Updated last year
- This repository hosts code that supports the testing infrastructure for the PyTorch organization. For example, this repo hosts the logic … ☆103 · Updated this week
- Home for OctoML PyTorch Profiler. ☆114 · Updated 2 years ago
- Experiment of using Tangent to autodiff Triton. ☆79 · Updated last year
- torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i… ☆181 · Updated 2 months ago
- Small-scale distributed training of sequential deep learning models, built on NumPy and MPI. ☆148 · Updated 2 years ago
- An extensible collectives library in Triton. ☆90 · Updated 7 months ago
- ☆53 · Updated last year
- High-Performance SGEMM on CUDA devices. ☆109 · Updated 9 months ago
- Minimal yet performant LLM examples in pure JAX. ☆198 · Updated last month
- An open-source efficient deep learning framework/compiler, written in Python. ☆733 · Updated 2 months ago
- ☆21 · Updated 8 months ago
- MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvement… ☆400 · Updated this week
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser"). ☆360 · Updated this week
- Tokamax: a GPU and TPU kernel library. ☆104 · Updated this week