HamzaElshafie / gpt-oss-20B
A PyTorch implementation of the GPT-OSS-20B architecture. All components are coded from scratch: RoPE with YaRN, RMSNorm, SwiGLU with clamping and residual connection, Mixture-of-Experts (MoE), Self-Attention with learned sinks, banded attention, GQA, and KV-cache.
☆226 · Dec 2, 2025 · Updated 4 months ago

Alternatives and similar repositories for gpt-oss-20B

Users that are interested in gpt-oss-20B are comparing it to the libraries listed below.
