computerhistory / AlexNet-Source-Code
This package contains the original 2012 AlexNet code.
☆2,658 · Updated 3 months ago
Alternatives and similar repositories for AlexNet-Source-Code
Users interested in AlexNet-Source-Code are comparing it to the repositories listed below
- Efficient Triton Kernels for LLM Training ☆5,275 · Updated this week
- NanoGPT (124M) in 3 minutes ☆2,721 · Updated last week
- New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos ☆8,028 · Updated 3 weeks ago
- Implementing DeepSeek R1's GRPO algorithm from scratch ☆1,451 · Updated 2 months ago
- The official repository for The Hundred-Page Language Models Book by Andriy Burkov ☆1,806 · Updated last month
- s1: Simple test-time scaling ☆6,468 · Updated this week
- Democratizing Reinforcement Learning for LLMs ☆3,411 · Updated last month
- Sky-T1: Train your own O1 preview model within $450 ☆3,286 · Updated last month
- Muon is Scalable for LLM Training ☆1,087 · Updated 3 months ago
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling ☆5,483 · Updated last week
- nanoGPT-style version of Llama 3.1 ☆1,389 · Updated 10 months ago
- The official repo of MiniMax-Text-01 and MiniMax-VL-01, a large language model and a vision-language model based on linear attention ☆2,957 · Updated 2 weeks ago
- DataComp for Language Models ☆1,315 · Updated 3 months ago
- Simple RL training for reasoning ☆3,650 · Updated 2 months ago
- Nano vLLM ☆4,678 · Updated this week
- A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training ☆2,815 · Updated 3 months ago
- Official PyTorch implementation for "Large Language Diffusion Models" ☆2,435 · Updated 2 weeks ago
- The n-gram Language Model ☆1,428 · Updated 10 months ago
- FlashMLA: Efficient MLA decoding kernels ☆11,631 · Updated 2 months ago
- Muon: An optimizer for hidden layers in neural networks ☆939 · Updated last week
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models ☆2,790 · Updated last year
- ☆575 · Updated 3 weeks ago
- DeepEP: an efficient expert-parallel communication library ☆8,225 · Updated this week
- MoBA: Mixture of Block Attention for Long-Context LLMs ☆1,808 · Updated 2 months ago
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models ☆1,735 · Updated last year
- Qwen2.5-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud ☆11,207 · Updated last month
- Minimalistic 4D-parallelism distributed training framework for educational purposes ☆1,554 · Updated last month
- Analyze computation-communication overlap in V3/R1 ☆1,068 · Updated 3 months ago
- ☆1,159 · Updated 2 months ago
- Code for the BLT research paper ☆1,690 · Updated last month