AI-Hypercomputer / ray-tpu
☆15 Updated 6 months ago
Alternatives and similar repositories for ray-tpu
Users who are interested in ray-tpu are comparing it to the libraries listed below.
- torchprime is a reference model implementation for PyTorch on TPU. ☆41 Updated last month
- ☆16 Updated 6 months ago
- ☆20 Updated 2 years ago
- ☆16 Updated last year
- Experimental playground for benchmarking language model (LM) architectures, layers, and tricks on smaller datasets. Designed for flexible… ☆87 Updated 2 weeks ago
- ☆121 Updated last year
- ☆21 Updated 8 months ago
- A set of Python scripts that makes your experience on TPU better ☆54 Updated 2 months ago
- Machine Learning eXperiment Utilities ☆46 Updated 4 months ago
- Blazing fast data loading with HuggingFace Dataset and Ray Data ☆16 Updated last year
- some common Huggingface transformers in maximal update parametrization (µP) ☆87 Updated 3 years ago
- Various transformers for FSDP research ☆38 Updated 3 years ago
- A place to store reusable transformer components of my own creation or found on the interwebs ☆62 Updated last week
- ☆62 Updated 3 years ago
- Fast, Modern, and Low Precision PyTorch Optimizers ☆116 Updated 2 months ago
- DPO, but faster 🚀 ☆46 Updated 11 months ago
- Griffin MQA + Hawk Linear RNN Hybrid ☆89 Updated last year
- Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels. ☆74 Updated this week
- Implementation of a Light Recurrent Unit in Pytorch ☆49 Updated last year
- An implementation of the Llama architecture, to instruct and delight ☆21 Updated 6 months ago
- Two implementations of ZeRO-1 optimizer sharding in JAX ☆14 Updated 2 years ago
- ☆68 Updated last year
- Randomized Positional Encodings Boost Length Generalization of Transformers ☆83 Updated last year
- JAX/Flax implementation of the Hyena Hierarchy ☆34 Updated 2 years ago
- MiSS is a novel PEFT method that features a low-rank structure but introduces a new update mechanism distinct from LoRA, achieving an exc… ☆25 Updated last month
- ☆50 Updated last year
- (EasyDel Former) is a utility library designed to simplify and enhance the development in JAX ☆28 Updated this week
- Official code release for "SuperBPE: Space Travel for Language Models" ☆76 Updated last week
- Experiment of using Tangent to autodiff triton ☆80 Updated last year
- Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆18 Updated 4 months ago