kabir2505 / tiny-mixtral
☆27 Updated this week
Alternatives and similar repositories for tiny-mixtral:
Users who are interested in tiny-mixtral are comparing it to the repositories listed below.
- ☆46 Updated last month
- NanoGPT-speedrunning for the poor T4 enjoyers ☆64 Updated 2 weeks ago
- Working implementation of DeepSeek MLA ☆41 Updated 4 months ago
- Simple GRPO scripts and configurations. ☆58 Updated 3 months ago
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning ☆31 Updated 2 months ago
- Collection of autoregressive model implementations ☆85 Updated 2 weeks ago
- Minimal GRPO implementation from scratch ☆87 Updated last month
- ☆48 Updated 3 months ago
- RL significantly improves the reasoning capability of Qwen2.5-1.5B-Instruct ☆28 Updated 2 months ago
- ☆47 Updated 8 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆39 Updated 3 months ago
- Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs ☆38 Updated 6 months ago
- ☆16 Updated 2 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆54 Updated last year
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆22 Updated last month
- Lego for GRPO ☆27 Updated last month
- An open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆98 Updated 2 months ago
- Simple repository for training small reasoning models ☆27 Updated 3 months ago
- Official repository for the paper "NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks". This rep… ☆58 Updated 6 months ago
- Multi-Layer Key-Value sharing experiments on Pythia models ☆32 Updated 10 months ago
- So, I trained a 130M Llama architecture I coded from the ground up to build a small instruct model from scratch. Trained on FineWeb dataset… ☆14 Updated last month
- Repository containing awesome resources regarding Hugging Face tooling. ☆47 Updated last year
- PTX-Tutorial Written Purely By AIs (Deep Research of OpenAI and Claude 3.7) ☆66 Updated last month
- Testing paligemma2 finetuning on reasoning dataset ☆18 Updated 4 months ago
- Repo hosting code and materials related to speeding up LLM inference using token merging. ☆36 Updated last year
- ☆91 Updated last month
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 Updated last year
- A list of language models with permissive licenses such as MIT or Apache 2.0 ☆24 Updated 2 months ago
- Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels. ☆56 Updated last month
- Fine-tunes a student LLM using teacher feedback for improved reasoning and answer quality. Implements GRPO with teacher-provided evaluati… ☆41 Updated 2 months ago