See details in https://github.com/pytorch/xla/blob/r1.12/torch_xla/distributed/fsdp/README.md
☆25 · Dec 22, 2022 · Updated 3 years ago
Alternatives and similar repositories for vit_10b_fsdp_example
Users that are interested in vit_10b_fsdp_example are comparing it to the libraries listed below.
- (EasyDel Former) is a utility library designed to simplify and enhance the development in JAX ☆30 · Updated this week
- ☆16 · Apr 10, 2022 · Updated 4 years ago
- Google TPU optimizations for transformers models ☆136 · Jan 23, 2026 · Updated 2 months ago
- Pytorch/XLA SPMD Test code in Google TPU ☆23 · Apr 3, 2024 · Updated 2 years ago
- PyTorch distributed training acceleration framework ☆54 · Aug 13, 2025 · Updated 8 months ago
- ☆19 · Jul 1, 2018 · Updated 7 years ago
- Machine Learning eXperiment Utilities ☆48 · Jul 29, 2025 · Updated 8 months ago
- An implementation of DreamerV2 written in JAX, with support for running multiple random seeds of an experiment on a single GPU. ☆18 · Jan 16, 2023 · Updated 3 years ago
- This repository contains example code to build models on TPUs ☆30 · Feb 17, 2023 · Updated 3 years ago
- Spectre variant 1 exploitation via PRIME+PROBE ☆10 · May 22, 2019 · Updated 6 years ago
- Google DeepMind: Mixture of Depths Unofficial Implementation. ☆12 · May 29, 2024 · Updated last year
- [3DV 2025] CoE: Deep Coupled Embedding for Non-Rigid Point Cloud Correspondences ☆19 · Jan 5, 2026 · Updated 3 months ago
- Block-Recurrent Dynamics in ViTs 🦖 ☆34 · Dec 24, 2025 · Updated 3 months ago
- The goal of this project is to develop a program for planetary soft landings using lossless convexification of non convex control bounds. ☆12 · Mar 25, 2022 · Updated 4 years ago
- ☆17 · Mar 8, 2020 · Updated 6 years ago
- Aemi WordPress theme ☆12 · Jul 2, 2021 · Updated 4 years ago
- A compiler written in Java to compile a subset of instructions called MiniJava. ☆10 · Apr 20, 2015 · Updated 10 years ago
- Keras implementation of CycleGAN ☆13 · Dec 11, 2017 · Updated 8 years ago
- ☆14 · Sep 28, 2020 · Updated 5 years ago
- PhoneGap NFC peer to peer demo ☆22 · Jan 6, 2017 · Updated 9 years ago
- ☆11 · Jan 18, 2024 · Updated 2 years ago
- Using MLflow with a PostgreSQL Database Tracking URI and a Minio Artifact URI, and MLflow Registry ☆14 · Sep 17, 2020 · Updated 5 years ago
- Unofficial entropix impl for Gemma2 and Llama and Qwen2 and Mistral ☆17 · Jan 12, 2025 · Updated last year
- A simple library for scaling up JAX programs ☆146 · Nov 4, 2025 · Updated 5 months ago
- Alloy models for automatic synthesis of memory model litmus test suites (from ASPLOS 2017) ☆16 · Jan 26, 2024 · Updated 2 years ago
- An unsorted collection of little tools and scripts I've made that don't fit anywhere else ☆19 · Jul 15, 2022 · Updated 3 years ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆16 · Jun 16, 2024 · Updated last year
- JAX implementation of Large Language Models. You can train GPT-2-like model with 青空文庫 (aozora bunko-clean dataset) or any other text dat… ☆13 · Aug 5, 2024 · Updated last year
- A Jax/Flax implementation for Korean-language LLM inference on TPUs. ☆12 · Jun 12, 2023 · Updated 2 years ago
- Implementation of numerous Vision Transformers in Google's JAX and Flax. ☆22 · Aug 30, 2022 · Updated 3 years ago
- PyTorch centric eager mode debugger ☆48 · Dec 16, 2024 · Updated last year
- These are my quick and dirty notes trying to follow main Machine Learning, Computer Vision & Deep Learning references ☆14 · Apr 12, 2022 · Updated 4 years ago
- Pile Deduplication Code ☆18 · May 15, 2023 · Updated 2 years ago
- ☆11 · Dec 20, 2022 · Updated 3 years ago
- Instruction Following Eval ☆16 · Jan 16, 2025 · Updated last year
- Please visit https://github.com/HKUSTDial/NL2SQL360 to get the official code! ☆10 · Sep 1, 2024 · Updated last year
- ☆57 · Apr 23, 2024 · Updated last year
- ☆10 · Sep 1, 2021 · Updated 4 years ago
- Like word2vec, except for letters of the alphabet. ☆17 · May 29, 2017 · Updated 8 years ago