evintunador / minGemmaLinks

a simplified version of Google's Gemma model to be used for learning

☆26

Alternatives and similar repositories for minGemma

Users that are interested in minGemma are comparing it to the libraries listed below

Sorting:

nivibilla / build-nanogpt
Video+code lecture on building nanoGPT from scratch
☆69Updated last year
QuixiAI / grokadamw
☆135Updated last year
joey00072 / ohara
Collection of autoregressive model implementation
☆86Updated 5 months ago
JoeLi12345 / nGPT
an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆105Updated 6 months ago
tanaymeh / mamba-train
A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM
☆59Updated last year
jadechip / nanoXLSTM
The simplest, fastest repository for training/finetuning medium-sized xLSTMs.
☆41Updated last year
QuixiAI / kraken
☆67Updated last year
serp-ai / Parameter-Efficient-MoE
Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks
☆31Updated last year
vikhyat / mixtral-inference
inference code for mixtral-8x7b-32kseqlen
☆102Updated last year
euclaise / SlimTrainer
Full finetuning of large language models without large memory requirements
☆94Updated last week
NousResearch / Obsidian
Maybe the new state of the art vision model? we'll see 🤷‍♂️
☆167Updated last year
Pleias / Various-Finetuning
Set of scripts to finetune LLMs
☆38Updated last year
TheBlokeAI / AIScripts
Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub
☆161Updated 2 years ago
flawedmatrix / mamba-ssm
Implementation of mamba with rust
☆88Updated last year
tensoic / Cerule
Cerule - A Tiny Mighty Vision Model
☆68Updated last year
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆55Updated 8 months ago
astramind-ai / BitMat
An efficent implementation of the method proposed in "The Era of 1-bit LLMs"
☆155Updated 11 months ago
VITA-Group / Q-GaLore
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
☆201Updated last year
teknium1 / ShareGPT-Builder
☆116Updated 9 months ago
sanchit-gandhi / notebooks
A collection of notebooks for the Hugging Face blog series (https://huggingface.co/blog).
☆46Updated last year
kyegomez / OpenStrawberry
An open source replication of the stawberry method that leverages Monte Carlo Search with PPO and or DPO
☆29Updated this week
mistralai / vllm-release
A high-throughput and memory-efficient inference and serving engine for LLMs
☆52Updated last year
Locutusque / TPU-Alignment
Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free
☆232Updated 11 months ago
kyegomez / Andromeda
An all-new Language Model That Processes Ultra-Long Sequences of 100,000+ Ultra-Fast
☆152Updated last year
BlinkDL / nanoRWKV
RWKV in nanoGPT style
☆193Updated last year
tiiuae / onebitllms
Lightweight toolkit package to train and fine-tune 1.58bit Language models
☆89Updated 4 months ago
keeeeenw / MicroLlama
Micro Llama is a small Llama based model with 300M parameters trained from scratch with $500 budget
☆162Updated last month
official-elinas / zeus-llm-trainer
Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models
☆70Updated 2 years ago
mobiusml / aana_sdk
Aana SDK is a powerful framework for building AI enabled multimodal applications.
☆52Updated last month
catid / bitnet_cpu
Experiments with BitNet inference on CPU
☆54Updated last year