yixiaoer / mistral-jax
JAX implementation of the Mistral 7B v0.1 model
☆13 Updated last year
Alternatives and similar repositories for mistral-jax
Users that are interested in mistral-jax are comparing it to the libraries listed below
- flexible meta-learning in jax ☆16 Updated 2 years ago
- General Modules for JAX ☆72 Updated 3 months ago
- Building blocks for productive research ☆67 Updated this week
- ☆89 Updated last year
- ☆35 Updated last year
- CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL ☆120 Updated last year
- GPT implementation in Flax ☆18 Updated 4 years ago
- An implementation of MuZero in JAX. ☆58 Updated 3 years ago
- Accelerated replay buffers in JAX ☆46 Updated 3 years ago
- Standardized Minecraft Diamond Environment for Reinforcement Learning ☆32 Updated 2 years ago
- JAX implementation of VQVAE/VQGAN autoencoders (+FSQ) ☆40 Updated last year
- Jax implementation of Proximal Policy Optimization (PPO) specifically tuned for Procgen, with benchmarked results and saved model weights… ☆59 Updated 3 years ago
- ☆19 Updated 2 years ago
- Minimal but scalable implementation of large language models in JAX ☆35 Updated last month
- Official code for "Can Wikipedia Help Offline Reinforcement Learning?" by Machel Reid, Yutaro Yamada and Shixiang Shane Gu ☆106 Updated 3 years ago
- ☆57 Updated last year
- Docker containers of baseline agents for the Crafter environment ☆30 Updated 4 years ago
- Benchmarking RL for POMDPs in Pure JAX [Code for "Structured State Space Models for In-Context Reinforcement Learning" (NeurIPS 2023)] ☆112 Updated 2 years ago
- Baselines for gymnax 🤖 ☆74 Updated 2 years ago
- Codebase for "Uni[MASK]: Unified Inference in Sequential Decision Problems" ☆57 Updated last year
- Code for Discovered Policy Optimisation (NeurIPS 2022) ☆12 Updated 2 years ago
- Recall to Imagine, a model-based RL algorithm with superhuman memory. Oral (1.2%) @ ICLR 2024 ☆79 Updated last year
- ☆31 Updated 3 years ago
- JAX implementations of core Deep RL algorithms ☆82 Updated 3 years ago
- Drop-in environment replacements that make your RL algorithm train faster. ☆21 Updated last year
- Scalable Opponent Shaping Experiments in JAX ☆25 Updated last year
- Implementations of Temporal Difference InfoNCE (TD InfoNCE) ☆34 Updated 2 years ago
- An Open-Ended Agentic Simulator ☆58 Updated last year
- ☆16 Updated last year
- SkillHack: A Benchmark for Skill Transfer in Open-Ended Reinforcement Learning ☆17 Updated 3 years ago