Jaykef / ai-algorithms
First-principle implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting research papers and demos.
☆180 Updated 5 months ago
Alternatives and similar repositories for ai-algorithms
Users who are interested in ai-algorithms are comparing it to the libraries listed below.
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆61 Updated last year
- minimal GRPO implementation from scratch ☆100 Updated 9 months ago
- An extension of the nanoGPT repository for training small MOE models. ☆219 Updated 9 months ago
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model" ☆201 Updated last month
- nanoGRPO is a lightweight implementation of Group Relative Policy Optimization (GRPO) ☆138 Updated 7 months ago
- ☆185 Updated last month
- Tina: Tiny Reasoning Models via LoRA ☆310 Updated 3 months ago
- From scratch implementation of a vision language model in pure PyTorch ☆254 Updated last year
- ☆205 Updated last year
- An open source implementation of LFMs from Liquid AI: Liquid Foundation Models ☆198 Updated this week
- Survey: A collection of AWESOME papers and resources on the latest research in Mixture of Experts. ☆139 Updated last year
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models ☆226 Updated last month
- dLLM: Simple Diffusion Language Modeling ☆1,504 Updated this week
- ☆241 Updated 2 months ago
- Exploring Applications of GRPO ☆250 Updated 4 months ago
- Open-source release accompanying Gao et al. 2025 ☆450 Updated 2 weeks ago
- Training small GPT-2 style models using Kolmogorov-Arnold networks. ☆121 Updated last year
- Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation (NeurIPS 2025) ☆526 Updated 3 months ago
- RL significantly improves the reasoning capability of Qwen2.5-1.5B-Instruct ☆31 Updated 10 months ago
- Working implementation of DeepSeek MLA ☆45 Updated 11 months ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆108 Updated 9 months ago
- Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling. ☆353 Updated 6 months ago
- Lightweight toolkit package to train and fine-tune 1.58bit Language models ☆104 Updated 7 months ago
- Code repository for Black Mamba ☆260 Updated last year
- A compact LLM pretrained in 9 days by using high quality data ☆337 Updated 8 months ago
- [NeurIPS 2025 Spotlight] TPA: Tensor ProducT ATTenTion Transformer (T6) (https://arxiv.org/abs/2501.06425) ☆435 Updated last week
- ☆303 Updated 8 months ago
- LoRA and DoRA from Scratch Implementations ☆214 Updated last year
- Official implementation of "Continuous Autoregressive Language Models" ☆677 Updated 3 weeks ago
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024) ☆162 Updated 8 months ago