evintunador / minGemma
A simplified version of Google's Gemma model, intended for learning
☆26 · Updated last year
Alternatives and similar repositories for minGemma
Users interested in minGemma are comparing it to the libraries listed below.
- Video+code lecture on building nanoGPT from scratch ☆69 · Updated last year
- ☆135 · Updated last year
- Collection of autoregressive model implementations ☆86 · Updated 5 months ago
- An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆105 · Updated 6 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆59 · Updated last year
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs. ☆41 · Updated last year
- ☆67 · Updated last year
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 · Updated last year
- Inference code for mixtral-8x7b-32kseqlen ☆102 · Updated last year
- Full finetuning of large language models without large memory requirements ☆94 · Updated last week
- Maybe the new state-of-the-art vision model? We'll see 🤷‍♂️ ☆167 · Updated last year
- Set of scripts to finetune LLMs ☆38 · Updated last year
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub ☆161 · Updated 2 years ago
- Implementation of Mamba in Rust ☆88 · Updated last year
- Cerule - A Tiny Mighty Vision Model ☆68 · Updated last year
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆55 · Updated 8 months ago
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆155 · Updated 11 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆201 · Updated last year
- ☆116 · Updated 9 months ago
- A collection of notebooks for the Hugging Face blog series (https://huggingface.co/blog). ☆46 · Updated last year
- An open-source replication of the Strawberry method that leverages Monte Carlo Search with PPO and/or DPO ☆29 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆52 · Updated last year
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆232 · Updated 11 months ago
- An all-new Language Model That Processes Ultra-Long Sequences of 100,000+ Ultra-Fast ☆152 · Updated last year
- RWKV in nanoGPT style ☆193 · Updated last year
- Lightweight toolkit package to train and fine-tune 1.58-bit language models ☆89 · Updated 4 months ago
- Micro Llama is a small Llama-based model with 300M parameters, trained from scratch on a $500 budget ☆162 · Updated last month
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models ☆70 · Updated 2 years ago
- Aana SDK is a powerful framework for building AI-enabled multimodal applications. ☆52 · Updated last month
- Experiments with BitNet inference on CPU ☆54 · Updated last year