J-Rosser-UK / Torch2Jax-DeepSeek-R1-Distill-Qwen-1.5B
Flax (JAX) implementation of DeepSeek-R1-Distill-Qwen-1.5B with weights ported from Hugging Face.
☆16 Updated last month
Alternatives and similar repositories for Torch2Jax-DeepSeek-R1-Distill-Qwen-1.5B:
Users who are interested in Torch2Jax-DeepSeek-R1-Distill-Qwen-1.5B are comparing it to the repositories listed below.
- ☆74 Updated 4 months ago
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆76 Updated 3 weeks ago
- An Open-Ended Agentic Simulator ☆45 Updated 7 months ago
- 🪐 The Sebulba architecture to scale reinforcement learning on Cloud TPUs in JAX ☆57 Updated last year
- Benchmarking Agentic LLM and VLM Reasoning On Games ☆126 Updated this week
- Benchmarking RL for POMDPs in Pure JAX [Code for "Structured State Space Models for In-Context Reinforcement Learning" (NeurIPS 2023)] ☆99 Updated last year
- ☆75 Updated last week
- ☆87 Updated 2 weeks ago
- Official Implementation of "Can Learned Optimization Make Reinforcement Learning Less Difficult" ☆22 Updated 4 months ago
- MR.Q is a general-purpose model-free reinforcement learning algorithm. ☆80 Updated last month
- Efficient baselines for autocurricula in JAX. ☆186 Updated 7 months ago
- ⚡ Flashbax: Accelerated Replay Buffers in JAX ☆229 Updated this week
- Minimal but scalable implementation of large language models in JAX ☆34 Updated 5 months ago
- Simple single-file baselines for Q-Learning in a pure-GPU setting ☆150 Updated 2 weeks ago
- ☆215 Updated 8 months ago
- ☆193 Updated 3 months ago
- JAX/Flax rewrite of Karpathy's nanoGPT ☆57 Updated 2 years ago
- seqax = sequence modeling + JAX ☆151 Updated 2 weeks ago
- 🧱 Modula software package ☆187 Updated this week
- Highly scalable 2D JAX physics engine. ☆53 Updated 3 weeks ago
- CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL ☆109 Updated 7 months ago
- Accelerated minigrid environments with JAX ☆132 Updated 8 months ago
- Comparison between GFlowNets & Maximum Entropy RL ☆16 Updated last year
- Cost-aware hyperparameter tuning algorithm ☆149 Updated 9 months ago
- JAX implementations of algorithms for inverse reinforcement learning ☆71 Updated 7 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆32 Updated last year
- 🏛️ A research-friendly codebase for fast experimentation with single-agent reinforcement learning in JAX • End-to-End JAX RL ☆304 Updated this week
- Implementation of Diffusion Transformer (DiT) in JAX ☆270 Updated 9 months ago
- Goal-Conditioned Reinforcement Learning with JAX ☆137 Updated this week
- (Crafter + NetHack) in JAX. ICML 2024 Spotlight. ☆295 Updated last month