dhyaneesh / awesome-jax-flax-llms
A collection of open-source large language model (LLM) implementations in JAX & Flax
☆23Updated 5 months ago
Alternatives and similar repositories for awesome-jax-flax-llms
Users who are interested in awesome-jax-flax-llms are comparing it to the repositories listed below
- ☆28Updated last year
- Automated Capability Discovery via Foundation Model Self-Exploration☆64Updated 7 months ago
- Code for the paper Don't Pay Attention☆49Updated this week
- Lego for GRPO☆29Updated 4 months ago
- Clue-inspired puzzles for testing LLM deduction abilities☆43Updated 6 months ago
- Train your own SOTA deductive reasoning model☆106Updated 6 months ago
- [ICML 2024] Official code release accompanying the paper "diff History for Neural Language Agents" (Piterbarg, Pinto, Fergus)☆20Updated last year
- Story-based implementation of sequential thinking☆14Updated 2 weeks ago
- ☆62Updated 2 months ago
- NanoGPT-speedrunning for the poor T4 enjoyers☆72Updated 5 months ago
- This repository has code for fine-tuning LLMs with GRPO specifically for Rust Programming using cargo as feedback☆104Updated 6 months ago
- A pure and fast NumPy implementation of Mamba with cache support.☆17Updated last year
- The Prime Intellect CLI provides a powerful command-line interface for managing GPU resources across various providers☆91Updated this week
- A single interface around speech-to-speech foundation models☆25Updated 3 months ago
- The official repository of ALE-Bench☆114Updated this week
- Clean RL implementation using MLX☆33Updated last year
- An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace.☆32Updated last year
- A high-performance attention mechanism that computes softmax normalization in a single streaming pass using running accumulators (online …☆26Updated 3 weeks ago
- Functional local implementations of main model parallelism approaches☆96Updated 2 years ago
- PyTorch script hot swap: change code without unloading your LLM from VRAM☆125Updated 5 months ago
- Code for minimum-entropy coupling.☆32Updated last year
- Training-Ready RL Environments + Evals☆111Updated this week
- OmegaViT (ΩViT) is a cutting-edge vision transformer architecture that combines multi-query attention, rotary embeddings, state space mod…☆14Updated this week
- ☆68Updated 4 months ago
- lossily compress representation vectors using product quantization☆59Updated 5 months ago
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization, with PyTorch/CUDA☆40Updated last year
- Framework-Agnostic RL Environments for LLM Fine-Tuning☆36Updated last week
- Enable MoE for nanoGPT.☆34Updated last year
- TensorRT-LLM server with Structured Outputs (JSON) built with Rust☆58Updated 5 months ago
- Asterisk Model Context Protocol (MCP) server.☆25Updated 6 months ago