computerhistory / AlexNet-Source-Code
This package contains the original 2012 AlexNet code.
☆2,625 · Updated 2 months ago
Alternatives and similar repositories for AlexNet-Source-Code
Users who are interested in AlexNet-Source-Code are comparing it to the libraries listed below
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling ☆5,376 · Updated last week
- The simplest, fastest repository for training/finetuning small-sized VLMs. ☆3,003 · Updated this week
- Simple RL training for reasoning ☆3,584 · Updated last month
- DeepEP: an efficient expert-parallel communication library ☆7,701 · Updated last week
- Democratizing Reinforcement Learning for LLMs ☆3,291 · Updated 2 weeks ago
- 🚀 Train a 26M-parameter multimodal vision-language model (VLM) from scratch in just 1 hour! ☆3,655 · Updated last month
- Implementing DeepSeek R1's GRPO algorithm from scratch ☆1,372 · Updated last month
- Muon is Scalable for LLM Training ☆1,049 · Updated 2 months ago
- Minimal reproduction of DeepSeek R1-Zero ☆11,811 · Updated last month
- FlashMLA: Efficient MLA decoding kernels ☆11,570 · Updated last month
- CUDA Python: Performance meets Productivity ☆2,704 · Updated this week
- MoBA: Mixture of Block Attention for Long-Context LLMs ☆1,777 · Updated last month
- DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding ☆4,854 · Updated 3 months ago
- Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities ☆868 · Updated last month
- Scalable RL solution for advanced reasoning of language models ☆1,587 · Updated 2 months ago
- Qwen2.5-Omni is an end-to-end multimodal model by the Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and pe… ☆3,042 · Updated this week
- Witness the aha moment of VLM with less than $3. ☆3,688 · Updated last week
- 🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton ☆2,438 · Updated this week
- A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training. ☆2,784 · Updated 2 months ago
- verl: Volcano Engine Reinforcement Learning for LLMs ☆8,593 · Updated this week
- Qwen2.5-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud. ☆10,709 · Updated 2 weeks ago
- ☆3,342 · Updated 2 months ago
- Sky-T1: Train your own O1 preview model within $450 ☆3,254 · Updated 2 weeks ago
- An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & RFT & Dynamic Sampling & Asy… ☆6,880 · Updated this week
- Analyze computation-communication overlap in V3/R1. ☆1,040 · Updated 2 months ago
- [CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents ☆1,665 · Updated last week
- Solve Visual Understanding with Reinforced VLMs ☆4,990 · Updated 2 weeks ago
- VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and clou… ☆3,280 · Updated last week
- s1: Simple test-time scaling ☆6,394 · Updated last week
- Fully open data curation for reasoning models ☆1,793 · Updated last week