Jaykef / ai-algorithms
First-principle implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting research papers and demos.
☆177Updated 2 months ago
Alternatives and similar repositories for ai-algorithms
Users that are interested in ai-algorithms are comparing it to the libraries listed below
- minimal GRPO implementation from scratch☆98Updated 6 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM☆59Updated last year
- nanoGRPO is a lightweight implementation of Group Relative Policy Optimization (GRPO)☆121Updated 5 months ago
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"☆190Updated this week
- An extension of the nanoGPT repository for training small MOE models.☆196Updated 7 months ago
- LoRA and DoRA from Scratch Implementations☆211Updated last year
- Survey: A collection of AWESOME papers and resources on the latest research in Mixture of Experts.☆134Updated last year
- RL significantly improves the reasoning capability of Qwen2.5-1.5B-Instruct☆30Updated 7 months ago
- Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation (NeurIPS 2025)☆468Updated 2 weeks ago
- working implementation of DeepSeek MLA☆44Updated 9 months ago
- ☆199Updated 9 months ago
- A compact LLM pretrained in 9 days by using high quality data☆328Updated 6 months ago
- From scratch implementation of a vision language model in pure PyTorch☆243Updated last year
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling☆206Updated last month
- Training small GPT-2 style models using Kolmogorov-Arnold networks.☆120Updated last year
- my attempts at implementing various bits of Sepp Hochreiter's new xLSTM architecture☆131Updated last year
- ☆177Updated 2 months ago
- Lightweight toolkit package to train and fine-tune 1.58bit Language models☆90Updated 4 months ago
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433☆102Updated 10 months ago
- An open source implementation of LFMs from Liquid AI: Liquid Foundation Models☆191Updated 2 weeks ago
- A simplified implementation for experimenting with RLVR on GSM8K. This repository provides a starting point for exploring reasoning.☆129Updated 8 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers.☆322Updated 11 months ago
- An easy, reliable, fluid template for Python packages complete with docs, testing suites, READMEs, GitHub workflows, linting and much muc…☆186Updated this week
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)☆105Updated 7 months ago
- PyTorch implementation of models from the Zamba2 series.☆185Updated 8 months ago
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"☆106Updated last week
- Tina: Tiny Reasoning Models via LoRA☆290Updated 2 weeks ago
- Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling.☆343Updated 3 months ago
- [NeurIPS 2025 Spotlight] TPA: Tensor ProducT ATTenTion Transformer (T6) (https://arxiv.org/abs/2501.06425)☆396Updated 2 weeks ago
- ☆119Updated last year