sayakpaul / keras-xla-benchmarks
Presents comprehensive benchmarks of XLA-compatible pre-trained models in Keras.
☆37 · Updated 2 years ago
Alternatives and similar repositories for keras-xla-benchmarks
Users interested in keras-xla-benchmarks are comparing it to the repositories listed below.
- ☆75 · Updated 3 years ago
- Cyclemoid implementation for PyTorch ☆90 · Updated 3 years ago
- ☆24 · Updated 3 years ago
- Implementation of CaiT models in TensorFlow, with ImageNet-1k checkpoints. Includes code for inference and fine-tuning. ☆12 · Updated 2 years ago
- A miniature AI training framework for PyTorch ☆42 · Updated 11 months ago
- ML/DL Math and Method notes ☆65 · Updated 2 years ago
- ☆59 · Updated last year
- Spio (SPEE-oh) - Experimental CUDA kernel framework unifying typed dimensions, NVRTC JIT specialization, and ML-guided tuning. ☆46 · Updated this week
- Various transformers for FSDP research ☆38 · Updated 3 years ago
- This repository shows various ways of deploying a vision model (TensorFlow) from 🤗 Transformers. ☆30 · Updated 3 years ago
- Code for NeurIPS LLM Efficiency Challenge ☆59 · Updated last year
- Deep learning library implemented from scratch in numpy. Mixtral, Mamba, LLaMA, GPT, ResNet, and other experiments. ☆54 · Updated last year
- This repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog post ☆92 · Updated 2 years ago
- ☆133 · Updated 2 years ago
- Contains my experiments with the `big_vision` repo to train ViTs on ImageNet-1k. ☆22 · Updated 2 years ago
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google Deepmind ☆72 · Updated last year
- Lightning HPO & Training Studio App ☆19 · Updated 2 years ago
- ☆16 · Updated 2 years ago
- Context Manager to profile the forward and backward times of PyTorch's nn.Module ☆83 · Updated 2 years ago
- ☆125 · Updated last year
- This project shows how to derive the total number of training tokens from a large text dataset from 🤗 datasets with Apache Beam and Data… ☆27 · Updated 3 years ago
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts ☆121 · Updated last year
- A case study of efficient training of large language models using commodity hardware. ☆68 · Updated 3 years ago
- Official PyTorch code for "APP: Anytime Progressive Pruning" (DyNN @ ICML 2022; CLL @ ACML 2022; SNN @ ICML 2022; SlowDNN 2023) ☆16 · Updated 3 years ago
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets. ☆160 · Updated last year
- [WIP] A 🔥 interface for running code in the cloud ☆86 · Updated 2 years ago
- JAX implementation of Black Forest Labs' Flux.1 family of models ☆39 · Updated last month
- Implements MLP-Mixer (https://arxiv.org/abs/2105.01601) with the CIFAR-10 dataset. ☆59 · Updated 3 years ago
- Large-scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆86 · Updated 2 years ago
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf) ☆81 · Updated 2 years ago