Jaykef / ai-algorithms
First-principle implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting research papers.
☆162Updated last month
Alternatives and similar repositories for ai-algorithms
Users who are interested in ai-algorithms are comparing it to the libraries listed below
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"☆168Updated last month
- minimal GRPO implementation from scratch☆88Updated 2 months ago
- nanoGRPO is a lightweight implementation of Group Relative Policy Optimization (GRPO)☆103Updated this week
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM☆54Updated last year
- An extension of the nanoGPT repository for training small MOE models.☆140Updated 2 months ago
- The official implementation of Tensor ProducT ATTenTion Transformer (T6)☆368Updated 3 weeks ago
- ☆151Updated last week
- Training small GPT-2 style models using Kolmogorov-Arnold networks.☆116Updated 11 months ago
- LoRA and DoRA from Scratch Implementations☆202Updated last year
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars…☆323Updated 5 months ago
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling☆191Updated last month
- Exploring Applications of GRPO☆206Updated last week
- Build your own visual reasoning model☆359Updated this week
- Naively combining transformers and Kolmogorov-Arnold Networks to learn and experiment☆35Updated 9 months ago
- TransMLA: Multi-Head Latent Attention Is All You Need☆247Updated this week
- ☆30Updated last week
- An open source implementation of LFMs from Liquid AI: Liquid Foundation Models☆165Updated 3 weeks ago
- RL significantly improves the reasoning capability of Qwen2.5-1.5B-Instruct☆28Updated 2 months ago
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"☆91Updated this week
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in PyTorch☆166Updated 4 months ago
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)☆154Updated last month
- Working implementation of DeepSeek MLA☆41Updated 4 months ago
- ☆176Updated 5 months ago
- From scratch implementation of a vision language model in pure PyTorch☆214Updated last year
- Official repo of paper LM2☆39Updated 3 months ago
- Tina: Tiny Reasoning Models via LoRA☆192Updated 3 weeks ago
- Implementation of MoE Mamba from the paper: "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in Pytorch and Ze…☆104Updated last month
- An easy, reliable, fluid template for Python packages complete with docs, testing suites, READMEs, GitHub workflows, linting and much muc…☆172Updated last month
- MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning☆356Updated 9 months ago
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs.☆42Updated 11 months ago