YuvrajSingh-mist / SmolLlama
I trained a 130M-parameter Llama architecture, coded from the ground up, to build a small instruct model from scratch. It was trained on the FineWeb dataset from HuggingFace, consisting of 15M texts (10BT snapshot), for a total of 3 full epochs.
☆14 Updated 3 weeks ago
Alternatives and similar repositories for SmolLlama:
Users who are interested in SmolLlama are comparing it to the libraries listed below:
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning ☆31 Updated 2 months ago
- Transformers from scratch using PyTorch & NumPy. ☆22 Updated 2 months ago
- ☆45 Updated 2 weeks ago
- An open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆95 Updated last month
- ☆74 Updated 6 months ago
- ☆40 Updated 2 months ago
- A lightweight evaluation suite tailored specifically for assessing Indic LLMs across a diverse range of tasks ☆34 Updated 10 months ago
- Uses a Gradio interface to stream coding-related responses from local and cloud-based large language models. Pulls context from GitHub Re… ☆21 Updated last month
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆60 Updated 3 weeks ago
- Working implementation of DeepSeek MLA ☆40 Updated 3 months ago
- Collection of resources for RL and Reasoning ☆25 Updated 2 months ago
- Entropy Based Sampling and Parallel CoT Decoding ☆17 Updated 6 months ago
- Set of scripts to finetune LLMs ☆37 Updated last year
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆22 Updated 2 weeks ago
- Simple examples using Argilla tools to build AI ☆52 Updated 4 months ago
- Simple GRPO scripts and configurations. ☆58 Updated 2 months ago
- Fine-tunes a student LLM using teacher feedback for improved reasoning and answer quality. Implements GRPO with teacher-provided evaluati… ☆40 Updated last month
- ☆21 Updated 5 months ago
- NanoGPT-speedrunning for the poor T4 enjoyers ☆60 Updated last week
- ☆84 Updated last week
- MLX port for xjdr's entropix sampler (mimics jax implementation) ☆63 Updated 5 months ago
- An introduction to LLM Sampling ☆77 Updated 4 months ago
- Video+code lecture on building nanoGPT from scratch ☆66 Updated 10 months ago
- ☆16 Updated last month
- look how they massacred my boy ☆63 Updated 6 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 9 months ago
- ☆97 Updated this week
- LLM reads a paper and produces a working prototype ☆52 Updated this week
- Find your Twin Celebrity in Vector Space ☆34 Updated 3 months ago
- Train transformer language models with reinforcement learning. ☆18 Updated last month