yixiaoer / mistral-v0.2-jax
JAX implementation of the Mistral 7b v0.2 model
☆35 · Jul 3, 2024 · Updated last year
Alternatives and similar repositories for mistral-v0.2-jax
Users who are interested in mistral-v0.2-jax are comparing it to the libraries listed below.
- ☆16 · Jul 8, 2024 · Updated last year
- Einsum-like high-level array sharding API for JAX ☆34 · Jul 16, 2024 · Updated last year
- A set of Python scripts that makes your experience on TPU better ☆56 · Sep 18, 2025 · Updated 4 months ago
- JAX implementation of LLaMA, aiming to train LLaMA on Google Cloud TPU ☆14 · Jul 22, 2023 · Updated 2 years ago
- JAX implementation of the Llama 2 model ☆216 · Feb 2, 2024 · Updated 2 years ago
- An implementation of the Llama architecture, to instruct and delight ☆21 · May 31, 2025 · Updated 8 months ago
- seqax = sequence modeling + JAX ☆170 · Jul 23, 2025 · Updated 6 months ago
- Implementation of various equivariant models in JAX ☆12 · Apr 12, 2024 · Updated last year
- JAX implementation of the Mistral 7b v0.1 model ☆13 · Mar 27, 2024 · Updated last year
- ☆15 · Oct 30, 2025 · Updated 3 months ago
- FlexAttention w/ FlashAttention3 Support ☆27 · Oct 5, 2024 · Updated last year
- ☆42 · Jan 24, 2026 · Updated 3 weeks ago
- Flexibly track outputs and grad-outputs of torch.nn.Module. ☆13 · Oct 6, 2023 · Updated 2 years ago
- ☆292 · Jul 15, 2024 · Updated last year
- JMP is a Mixed Precision library for JAX. ☆211 · Jan 30, 2025 · Updated last year
- Annotated implementations of equivariant (graph) neural networks in Jax: EGNN, SEGNN, NequIP. ☆41 · Mar 1, 2025 · Updated 11 months ago
- A Top-Down Profiler for GPU Applications ☆22 · Feb 29, 2024 · Updated last year
- PTX-Tutorial Written Purely By AIs (Deep Research of Openai and Claude 3.7) ☆66 · Mar 24, 2025 · Updated 10 months ago
- JAX implementation of the T5 model: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer ☆24 · Jun 10, 2023 · Updated 2 years ago
- ☆32 · Jul 2, 2025 · Updated 7 months ago
- An experimental implementation of compiler-driven automatic sharding of models across a given device mesh. ☆52 · Feb 10, 2026 · Updated last week
- supporting pytorch FSDP for optimizers ☆84 · Dec 8, 2024 · Updated last year
- Test suite for probing the numerical behavior of NVIDIA tensor cores ☆43 · Jul 24, 2024 · Updated last year
- A simple implementation of Hamiltonian Monte Carlo in JAX. ☆20 · Feb 8, 2024 · Updated 2 years ago
- JAX Synergistic Memory Inspector ☆184 · Jul 16, 2024 · Updated last year
- Steerable E(3) GNN in jax ☆24 · Oct 1, 2023 · Updated 2 years ago
- Supplemental materials for The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning ☆25 · May 12, 2025 · Updated 9 months ago
- GPU Performance Advisor ☆65 · Jul 25, 2022 · Updated 3 years ago
- A toolkit for scaling law research ⚖ ☆57 · Jan 27, 2025 · Updated last year
- An experimental communicating attention kernel based on DeepEP. ☆35 · Jul 29, 2025 · Updated 6 months ago
- A Lorentz-Equivariant Transformer for All of the LHC ☆29 · Jun 11, 2025 · Updated 8 months ago
- Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods. ☆32 · Jun 5, 2025 · Updated 8 months ago
- TritonParse: A Compiler Tracer, Visualizer, and Reproducer for Triton Kernels ☆194 · Updated this week
- ☆922 · Jan 29, 2026 · Updated 2 weeks ago
- ☆26 · Dec 3, 2025 · Updated 2 months ago
- A fusion of a linear layer and a cross entropy loss, written for pytorch in triton. ☆75 · Aug 2, 2024 · Updated last year
- Stainless neural networks in JAX ☆34 · Feb 3, 2026 · Updated 2 weeks ago
- Awesome Triton Resources ☆39 · Apr 27, 2025 · Updated 9 months ago
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ☆476 · Feb 3, 2026 · Updated 2 weeks ago