Jaykef / ai-algorithms
First-principle implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting research papers.
☆122 · Updated last week
Alternatives and similar repositories for ai-algorithms:
Users interested in ai-algorithms are comparing it to the libraries listed below.
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model" ☆155 · Updated this week
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling ☆182 · Updated this week
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆50 · Updated 9 months ago
- ☆243 · Updated 4 months ago
- Naively combining transformers and Kolmogorov-Arnold Networks to learn and experiment ☆35 · Updated 6 months ago
- Training small GPT-2 style models using Kolmogorov-Arnold networks. ☆113 · Updated 8 months ago
- Explorations into improving ViTArc with Slot Attention ☆37 · Updated 3 months ago
- LoRA and DoRA from Scratch Implementations (see the LoRA sketch after this list) ☆195 · Updated 10 months ago
- ☆122 · Updated 8 months ago
- Normalized Transformer (nGPT) ☆146 · Updated 2 months ago
- Code for Adam-mini: Use Fewer Learning Rates To Gain More (https://arxiv.org/abs/2406.16793; see the optimizer sketch after this list) ☆383 · Updated last month
- Implementation of the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" (see the routing sketch after this list) ☆79 · Updated this week
- My attempts at implementing various bits of Sepp Hochreiter's new xLSTM architecture ☆129 · Updated 8 months ago
- (Unofficial) PyTorch implementation of grouped-query attention (GQA) from "GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints" (see the attention sketch after this list) ☆149 · Updated 8 months ago
- Build high-performance AI models with modular building blocks ☆459 · Updated this week
- Awesome list of papers that extend Mamba to various applications. ☆129 · Updated last month
- Attempt to make the multiple residual streams from ByteDance's Hyper-Connections paper accessible to the public ☆65 · Updated last week
- The official implementation of Tensor ProducT ATTenTion Transformer (T6) ☆261 · Updated this week
- Notes on Mamba and the S4 model (Mamba: Linear-Time Sequence Modeling with Selective State Spaces) ☆159 · Updated last year
- PyTorch implementation of the Differential-Transformer architecture for sequence modeling, specifically tailored as a decoder-only model … ☆51 · Updated 3 months ago
- PyTorch implementation of models from the Zamba2 series. ☆173 · Updated this week
- When it comes to optimizers, it's always better to be safe than sorry ☆166 · Updated last week
- A Triton kernel for incorporating bi-directionality in Mamba2 ☆60 · Updated last month
- Official PyTorch implementation of "The Hidden Attention of Mamba Models" ☆211 · Updated 8 months ago
- An easy, reliable, fluid template for Python packages, complete with docs, testing suites, READMEs, GitHub workflows, linting and much more ☆156 · Updated this week
- Minimal Mamba-2 implementation in PyTorch ☆166 · Updated 7 months ago
- Implementation of Agent Attention in PyTorch ☆89 · Updated 6 months ago
- Implementation of Griffin from the paper "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models" ☆51 · Updated this week
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, sparsely activated memory layers complement compute-heavy dense feed-forward layers … (see the memory-layer sketch after this list) ☆288 · Updated last month
- My fork of Allen AI's OLMo for educational purposes. ☆30 · Updated last month
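
For the LoRA/DoRA entry above, here is a minimal LoRA sketch in PyTorch, assuming a frozen pretrained `nn.Linear`. The class name `LoRALinear` and the rank/alpha defaults are illustrative choices, not taken from the linked repository. (DoRA additionally decomposes the weight into magnitude and direction; only LoRA is shown.)

```python
# Minimal LoRA sketch: the frozen base weight W is augmented with a trainable
# low-rank update, y = Wx + (alpha/r) * B(Ax). Names and defaults are illustrative.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze the pretrained weight
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.scale = alpha / rank
        # A gets small random init, B starts at zero, so the adapter is a
        # no-op at step 0 and training only learns the delta.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(512, 512))
out = layer(torch.randn(2, 512))  # (2, 512)
```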
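For the Adam-mini entry, a hedged sketch of the core idea: keep per-coordinate momentum, but replace Adam's per-coordinate second moment with a single scalar per parameter block. The toy optimizer below uses the crude "one block per tensor" partition; the paper partitions parameters more carefully (e.g. per attention head), so treat this as an illustration rather than the official algorithm.

```python
# Toy Adam-mini-style optimizer: one second-moment scalar per parameter
# tensor instead of one per coordinate. Block partitioning is simplified.
import torch

class ToyAdamMini(torch.optim.Optimizer):
    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8):
        super().__init__(params, dict(lr=lr, betas=betas, eps=eps))

    @torch.no_grad()
    def step(self):
        for group in self.param_groups:
            beta1, beta2 = group["betas"]
            for p in group["params"]:
                if p.grad is None:
                    continue
                state = self.state[p]
                if not state:
                    state["step"] = 0
                    state["m"] = torch.zeros_like(p)               # per-coordinate momentum
                    state["v"] = torch.zeros((), device=p.device)  # ONE scalar per block
                state["step"] += 1
                t = state["step"]
                state["m"].mul_(beta1).add_(p.grad, alpha=1 - beta1)
                # single second-moment estimate: mean squared gradient over the block
                state["v"].mul_(beta2).add_(p.grad.pow(2).mean(), alpha=1 - beta2)
                m_hat = state["m"] / (1 - beta1 ** t)
                v_hat = state["v"] / (1 - beta2 ** t)
                p.add_(m_hat / (v_hat.sqrt() + group["eps"]), alpha=-group["lr"])
```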
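For the Mixture-of-Depths entry, a minimal routing sketch under stated assumptions: a learned router picks a top-k subset of tokens per sequence to receive the block's compute, while the remaining tokens ride the residual stream unchanged. The paper's causal-sampling details for autoregressive inference are omitted; `MoDWrapper` and `capacity` are illustrative names.

```python
# Mixture-of-Depths-style token routing sketch: only a capacity fraction of
# tokens is processed by the wrapped block; the rest pass through untouched.
import torch
import torch.nn as nn

class MoDWrapper(nn.Module):
    def __init__(self, block: nn.Module, dim: int, capacity: float = 0.5):
        super().__init__()
        self.block = block        # any (B, T, D) -> (B, T, D) module
        self.router = nn.Linear(dim, 1)
        self.capacity = capacity  # fraction of tokens that get compute

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, D = x.shape
        k = max(1, int(T * self.capacity))
        scores = self.router(x).squeeze(-1)         # (B, T)
        topk = scores.topk(k, dim=-1).indices       # (B, k) tokens to process
        idx = topk.unsqueeze(-1).expand(-1, -1, D)  # (B, k, D)
        selected = x.gather(1, idx)                 # gather only routed tokens
        # scale the block output by the router score so the router gets gradients
        weight = torch.sigmoid(scores.gather(1, topk)).unsqueeze(-1)
        processed = selected + weight * self.block(selected)
        return x.scatter(1, idx, processed)         # unrouted tokens pass through

mod = MoDWrapper(nn.TransformerEncoderLayer(64, 4, batch_first=True), dim=64)
y = mod(torch.randn(2, 16, 64))  # (2, 16, 64)
```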
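For the GQA entry, a minimal grouped-query attention sketch: H query heads share G < H key/value heads, shrinking the KV projections (and KV cache) relative to multi-head attention while keeping more capacity than multi-query attention's single KV head. Dimensions and names below are illustrative assumptions, not the linked repository's API.

```python
# Grouped-query attention sketch: 8 query heads share 2 KV heads, so each
# group of 4 query heads attends to the same shared key/value head.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GQA(nn.Module):
    def __init__(self, dim: int, n_heads: int = 8, n_kv_heads: int = 2):
        super().__init__()
        assert n_heads % n_kv_heads == 0
        self.h, self.g = n_heads, n_kv_heads
        self.hd = dim // n_heads
        self.wq = nn.Linear(dim, n_heads * self.hd, bias=False)
        self.wk = nn.Linear(dim, n_kv_heads * self.hd, bias=False)  # fewer KV heads
        self.wv = nn.Linear(dim, n_kv_heads * self.hd, bias=False)
        self.wo = nn.Linear(n_heads * self.hd, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, _ = x.shape
        q = self.wq(x).view(B, T, self.h, self.hd).transpose(1, 2)  # (B, H, T, hd)
        k = self.wk(x).view(B, T, self.g, self.hd).transpose(1, 2)  # (B, G, T, hd)
        v = self.wv(x).view(B, T, self.g, self.hd).transpose(1, 2)
        # replicate each KV head across its group of H/G query heads
        k = k.repeat_interleave(self.h // self.g, dim=1)            # (B, H, T, hd)
        v = v.repeat_interleave(self.h // self.g, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.wo(out.transpose(1, 2).reshape(B, T, self.h * self.hd))

attn = GQA(dim=256)
y = attn(torch.randn(2, 10, 256))  # (2, 10, 256)
```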
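For the memory-layers entry, a toy sparse memory layer matching the description above: a large trainable key/value table queried per token, with only the top-k matching slots contributing to the output. Production designs (e.g. product-key memories) factorize the key search so lookup stays cheap even for huge tables; this naive version scores every key for clarity, and all names and sizes are assumptions.

```python
# Toy sparse memory layer: trainable key/value tables, per-token top-k
# lookup, output = residual + softmax-weighted sum of the selected values.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMemoryLayer(nn.Module):
    def __init__(self, dim: int, n_slots: int = 4096, topk: int = 4):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.keys = nn.Parameter(torch.randn(n_slots, dim) * 0.02)    # trainable keys
        self.values = nn.Parameter(torch.randn(n_slots, dim) * 0.02)  # trainable values
        self.topk = topk

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q = self.query(x)                            # (B, T, D)
        scores = q @ self.keys.T                     # (B, T, n_slots)
        top, idx = scores.topk(self.topk, dim=-1)    # sparse: only k slots fire per token
        w = F.softmax(top, dim=-1)                   # (B, T, k)
        v = self.values[idx]                         # (B, T, k, D)
        return x + (w.unsqueeze(-1) * v).sum(dim=-2) # residual add of retrieved memory

mem = ToyMemoryLayer(dim=128)
y = mem(torch.randn(2, 8, 128))  # (2, 8, 128)
```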