google / gemma.cpp
Lightweight, standalone C++ inference engine for Google's Gemma models.
☆6,565 · Updated this week
Alternatives and similar repositories for gemma.cpp
Users interested in gemma.cpp are comparing it to the libraries listed below.
- The official PyTorch implementation of Google's Gemma models ☆5,543 · Updated 3 months ago
- Gemma open-weight LLM library, from Google DeepMind ☆3,702 · Updated last week
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ☆9,922 · Updated last year
- Simple and efficient pytorch-native transformer text generation in <1000 LOC of python. ☆6,093 · Updated 3 weeks ago
- On-device AI across mobile, embedded and edge for PyTorch ☆3,221 · Updated this week
- Run Mixtral-8x7B models in Colab or consumer desktops ☆2,319 · Updated last year
- Tensor library for machine learning ☆13,170 · Updated this week
- Large World Model -- Modeling Text and Video with Millions Context ☆7,339 · Updated 10 months ago
- High-speed Large Language Model Serving for Local Deployment ☆8,329 · Updated last month
- Modeling, training, eval, and inference code for OLMo ☆5,985 · Updated last week
- An Extensible Deep Learning Library ☆2,239 · Updated last week
- Training LLMs with QLoRA + FSDP ☆1,528 · Updated 10 months ago
- PyTorch native post-training library ☆5,484 · Updated this week
- Implementation for MatMul-free LM. ☆3,032 · Updated last month
- CoreNet: A library for training deep neural networks ☆7,023 · Updated 3 weeks ago
- A simple, performant and scalable Jax LLM! ☆1,899 · Updated this week
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ☆8,741 · Updated last year
- Inference Llama 2 in one file of pure 🔥 ☆2,119 · Updated last year
- Examples in the MLX framework ☆7,838 · Updated 2 weeks ago
- Inference Llama 2 in one file of pure C ☆18,735 · Updated last year
- PyTorch code and models for V-JEPA self-supervised learning from video. ☆3,200 · Updated 6 months ago
- Official inference library for Mistral models ☆10,459 · Updated 5 months ago
- LLM training in simple, raw C/CUDA ☆27,588 · Updated 2 months ago
- Run PyTorch LLMs locally on servers, desktop and mobile ☆3,609 · Updated last week
- Blazingly fast LLM inference. ☆6,088 · Updated last week
- A machine learning compiler for GPUs, CPUs, and ML accelerators ☆3,505 · Updated this week
- A PyTorch native platform for training generative AI models ☆4,395 · Updated this week
- ☆4,092 · Updated last year
- Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch ☆1,875 · Updated last week
- Official Pytorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.p… ☆1,291 · Updated last month