computerhistory / AlexNet-Source-Code
This package contains the original 2012 AlexNet code.
☆2,689 Updated 5 months ago
Alternatives and similar repositories for AlexNet-Source-Code
Users who are interested in AlexNet-Source-Code compare it to the repositories listed below.
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling ☆5,589 Updated last week
- DeepEP: an efficient expert-parallel communication library ☆8,375 Updated this week
- A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training. ☆2,841 Updated 5 months ago
- PyTorch code and models for VJEPA2 self-supervised learning from video. ☆2,001 Updated last month
- Sky-T1: Train your own O1 preview model within $450 ☆3,320 Updated last month
- The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention ☆3,098 Updated last month
- Implementing DeepSeek R1's GRPO algorithm from scratch ☆1,517 Updated 3 months ago
- DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding ☆5,007 Updated 5 months ago
- FlashMLA: Efficient MLA kernels ☆11,683 Updated last week
- AlphaFold 3 inference pipeline. ☆6,838 Updated last week
- ☆1,179 Updated 3 weeks ago
- Code for BLT research paper ☆1,765 Updated 2 months ago
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models ☆1,762 Updated last year
- s1: Simple test-time scaling ☆6,527 Updated last month
- ☆972 Updated 3 weeks ago
- Official PyTorch implementation for "Large Language Diffusion Models" ☆2,687 Updated last week
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models ☆2,849 Updated last year
- Large Concept Models: Language modeling in a sentence representation space ☆2,257 Updated 6 months ago
- Muon is Scalable for LLM Training ☆1,258 Updated last week
- Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and pe… ☆3,452 Updated 2 months ago
- Analyze computation-communication overlap in V3/R1. ☆1,088 Updated 4 months ago
- MoBA: Mixture of Block Attention for Long-Context LLMs ☆1,857 Updated 4 months ago
- The simplest, fastest repository for training/finetuning small-sized VLMs. ☆3,855 Updated last week
- Expert Parallelism Load Balancer ☆1,244 Updated 4 months ago
- Everything about the SmolLM and SmolVLM family of models ☆3,108 Updated last week
- Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation ☆7,889 Updated 2 months ago
- Democratizing Reinforcement Learning for LLMs ☆3,979 Updated this week
- Simple RL training for reasoning ☆3,714 Updated last week
- This is the official repository for The Hundred-Page Language Models Book by Andriy Burkov ☆1,881 Updated 2 months ago
- MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining ☆1,524 Updated 2 months ago