A PyTorch implementation of the GPT-OSS-20B architecture. All components are coded from scratch: RoPE with YaRN, RMSNorm, SwiGLU with clamping and residual connection, Mixture-of-Experts (MoE), Self-Attention with learned sinks, banded attention, GQA, and KV-cache.
☆234 · Dec 2, 2025 · Updated 6 months ago
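As an illustration of one component named in the description, RMSNorm can be sketched in a few lines. This is a minimal sketch in plain Python rather than PyTorch, and it is not taken from the repository; the function name `rms_norm`, the `eps` default, and the sample values are assumptions chosen for the example.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Divide a vector by its root mean square, then apply a per-element gain.

    `eps` is a small constant (assumed value) that avoids division by zero.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


# Hypothetical hidden vector and an all-ones gain, for illustration only.
hidden = [3.0, -4.0, 0.0, 0.0]
gain = [1.0, 1.0, 1.0, 1.0]
normed = rms_norm(hidden, gain)
```

Unlike LayerNorm, this normalization does not subtract the mean; with an all-ones gain, the output has a mean square close to 1.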
Alternatives and similar repositories for gpt-oss-20B
Users who are interested in gpt-oss-20B are comparing it to the repositories listed below.
- The Structure and Interpretation of Tensor Programs: The Hacker's Accelerated Introduction to Deep Learning and Deep Learning Systems ☆80 · Updated this week
- Research on training an LLM with DeepSeek & Kimi architecture ☆50 · Sep 30, 2025 · Updated 8 months ago
- 🏆 Ambassador Paper for Innovative Use of NLP for Building Educational Applications 2023: Is ChatGPT a Good Teacher Coach? Measuring Zero… ☆14 · Jul 21, 2024 · Updated last year
- This repository hosts the code to port NumPy model weights of BiT-ResNets to TensorFlow SavedModel format. ☆14 · Dec 21, 2021 · Updated 4 years ago
- This repository hosts code for converting the original MLP Mixer models (JAX) to TensorFlow. ☆15 · Sep 29, 2021 · Updated 4 years ago
- Minimal implementation of Denoised Smoothing (https://arxiv.org/abs/2003.01908) in TensorFlow. ☆20 · Aug 4, 2021 · Updated 4 years ago
- Neural Arithmetic Logic Units by Trask et al. ☆12 · Apr 10, 2019 · Updated 7 years ago
- Code for the article series on building a Python compiler and interpreter ☆12 · Feb 13, 2025 · Updated last year
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆17 · Mar 11, 2026 · Updated 3 months ago
- [CVPR2026] Official codebase for the paper "Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space" ☆82 · May 12, 2026 · Updated last month
- CapsNet implementation in a minimal manner ☆11 · Nov 17, 2017 · Updated 8 years ago
- Showcases the use of deep learning to detect wheat heads from crops. The project is based on: https://www.kaggle.com/c/global-wheat-detec… ☆19 · May 30, 2020 · Updated 6 years ago
- PyTorch implementation of Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation ☆33 · Dec 29, 2021 · Updated 4 years ago
- An AI character interaction system with emotional modeling and advanced memory management ☆17 · Oct 26, 2024 · Updated last year
- Learn how to design large-scale systems. Prep for the system design interview. An update to the original system-design-primer ☆33 · Jan 12, 2026 · Updated 5 months ago
- Mixture of Experts from scratch ☆14 · Apr 12, 2024 · Updated 2 years ago
- Antenna analyzer based on RigExpert Zero II and Arduino ☆13 · Jan 25, 2024 · Updated 2 years ago
- This project shows how to derive the total number of training tokens from a large text dataset from 🤗 datasets with Apache Beam and Data… ☆27 · Oct 20, 2022 · Updated 3 years ago
- GEMM ☆10 · Aug 26, 2023 · Updated 2 years ago
- Confidential inference in enclave for OpenAI grant. Uses k3s and Triton ☆16 · Mar 20, 2025 · Updated last year
- ☆130 · Dec 9, 2025 · Updated 6 months ago
- Minimal JAX implementation unifying Diffusion and Flow Matching algorithms as alternative strategies for transporting data distributions. ☆66 · Dec 19, 2025 · Updated 6 months ago
- CUDA 8-bit Tensor Core Matrix Multiplication based on m16n16k16 WMMA API ☆37 · Sep 15, 2023 · Updated 2 years ago
- collab-dev - Collaboration Metrics for Code Reviews ☆23 · May 12, 2025 · Updated last year
- ☆11 · May 16, 2026 · Updated last month
- ☆23 · Oct 30, 2019 · Updated 6 years ago
- ☆60 · Dec 12, 2025 · Updated 6 months ago
- A std::execution style runtime context and High Performance RPC Transport for using OpenUCX. Including CUDA/ROCM/... devices with RDMA. ☆33 · May 26, 2026 · Updated last month
- SciFin is a python package for Science & Finance. ☆11 · Oct 25, 2020 · Updated 5 years ago
- A simple, generic, and flexible keyframe animation library for Rust. ☆30 · Jun 1, 2026 · Updated 3 weeks ago
- 🎓 Automatically update circult-eda-mlsys-tinyml papers daily using GitHub Actions (updates every 8 hours) ☆10 · Jun 22, 2026 · Updated last week
- ☆41 · Feb 23, 2026 · Updated 4 months ago
- My tests and experiments with some popular dl frameworks. ☆17 · Sep 11, 2025 · Updated 9 months ago
- GEMV implementation with CUTLASS ☆21 · Aug 21, 2025 · Updated 10 months ago
- Multi-heap-sort for many small arrays, quicksort with 3 pivots for one big array, CUDA acceleration, CUDA memory compression. ☆13 · Sep 29, 2024 · Updated last year
- ☆16 · Jul 7, 2025 · Updated 11 months ago
- Custom ComfyUI node that combines VSR + VFI and allows streaming processing for arbitrary video length. ☆66 · Mar 28, 2026 · Updated 3 months ago
- A multimodal live AI assistant designed to enhance the browsing experience using Gemini. ☆11 · Feb 15, 2025 · Updated last year
- Companion code for 《汇编语言一发入魂》 ☆15 · May 30, 2020 · Updated 6 years ago