Jaykef / ai-algorithms
First-principle implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting research papers and demos.
☆178Updated last month
Alternatives and similar repositories for ai-algorithms
Users that are interested in ai-algorithms are comparing it to the libraries listed below
- Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation☆420Updated 3 weeks ago
- An extension of the nanoGPT repository for training small MOE models.☆178Updated 5 months ago
- minimal GRPO implementation from scratch☆96Updated 5 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM☆56Updated last year
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"☆183Updated last week
- Tina: Tiny Reasoning Models via LoRA☆275Updated 2 weeks ago
- ☆194Updated 8 months ago
- Survey: A collection of AWESOME papers and resources on the latest research in Mixture of Experts.☆129Updated last year
- nanoGRPO is a lightweight implementation of Group Relative Policy Optimization (GRPO)☆116Updated 3 months ago
- An open source implementation of LFMs from Liquid AI: Liquid Foundation Models☆185Updated last week
- RL significantly improves the reasoning capability of Qwen2.5-1.5B-Instruct☆30Updated 6 months ago
- ☆173Updated 3 weeks ago
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models☆220Updated 2 months ago
- ☆44Updated 3 months ago
- LoRA and DoRA from Scratch Implementations☆209Updated last year
- Training teacher models with reinforcement learning so they can teach LLMs how to reason for test-time scaling.☆329Updated 2 months ago
- From scratch implementation of a vision language model in pure PyTorch☆235Updated last year
- ☆405Updated this week
- An open source implementation of LFMs from Liquid AI: Liquid Foundation Models☆109Updated 10 months ago
- Working implementation of DeepSeek MLA☆43Updated 7 months ago
- ☆294Updated 4 months ago
- Exploring Applications of GRPO☆246Updated last month
- Code repository for Black Mamba☆254Updated last year
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"☆105Updated last week
- PyTorch implementation of models from the Zamba2 series.☆184Updated 7 months ago
- LongRoPE is a novel method that extends the context window of pre-trained LLMs to 2048k tokens.☆241Updated last year
- The official implementation of TPA: Tensor ProducT ATTenTion Transformer (T6) (https://arxiv.org/abs/2501.06425)☆381Updated this week
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling☆202Updated 2 weeks ago
- A simplified implementation for experimenting with RLVR on GSM8K. This repository provides a starting point for exploring reasoning.☆121Updated 6 months ago
- A compact LLM pretrained in 9 days by using high quality data☆322Updated 4 months ago