VizuaraAI / truly-open-gpt-oss
A truly open version of gpt-oss which shows the entire pre-training from scratch
☆76Updated 2 months ago
Alternatives and similar repositories for truly-open-gpt-oss
Users who are interested in truly-open-gpt-oss are comparing it to the repositories listed below
- Learn the building blocks of how to build gpt-oss from scratch☆105Updated 2 months ago
- Train LLM on Hugging Face infra☆67Updated 2 weeks ago
- Simple & Scalable Pretraining for Neural Architecture Research☆302Updated last month
- ☆107Updated 5 months ago
- ☆158Updated 7 months ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)☆108Updated 8 months ago
- Luth is a state-of-the-art series of fine-tuned LLMs for French☆39Updated last month
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning☆36Updated 6 months ago
- ☆136Updated last year
- Inference, Fine Tuning and many more recipes with Gemma family of models☆274Updated 4 months ago
- Lightweight toolkit package to train and fine-tune 1.58bit Language models☆99Updated 6 months ago
- ☆62Updated 4 months ago
- The code repository of the paper: Competition and Attraction Improve Model Fusion☆166Updated 3 months ago
- All information and news with respect to Falcon-H1 series☆93Updated last month
- minimal GRPO implementation from scratch☆99Updated 8 months ago
- ☆127Updated 2 months ago
- A Reproduction of GDM's Nested Learning Paper☆212Updated last week
- An extension of the nanoGPT repository for training small MOE models.☆215Updated 8 months ago
- Fine-tunes a student LLM using teacher feedback for improved reasoning and answer quality. Implements GRPO with teacher-provided evaluati…☆47Updated 6 months ago
- ☆45Updated this week
- NanoGPT-speedrunning for the poor T4 enjoyers☆72Updated 7 months ago
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache☆133Updated 3 months ago
- ☆45Updated 6 months ago
- Collection of autoregressive model implementations☆86Updated 7 months ago
- A collection of lightweight interpretability scripts to understand how LLMs think☆66Updated last week
- ☆46Updated 7 months ago
- ☆229Updated 2 months ago
- ☆113Updated 2 months ago
- Sparse Inferencing for transformer based LLMs☆213Updated 3 months ago
- So, I trained a 130M Llama architecture that I coded from the ground up to build a small instruct model from scratch. Trained on FineWeb dataset…☆16Updated 8 months ago