AmeyaWagh / llama2.cpp

Inference Llama 2 in C++

☆45

Related projects

Alternatives and complementary repositories for llama2.cpp

UmerHA / triton_util
Make Triton easier
☆41 Updated 4 months ago
zaydzuhri / pythia-mlkv
Multi-Layer Key-Value sharing experiments on Pythia models
☆32 Updated 4 months ago
AlexBodner / How_Much_VRAM
☆93 Updated 2 months ago
SebastianBodza / EnsembleForecasting
Using multiple LLMs for ensemble forecasting
☆16 Updated 9 months ago
andrew-silva / mlx-rlhf
An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace.
☆20 Updated 4 months ago
bipul1010 / agents_tutorial
☆19 Updated 3 months ago
mzbac / mlx-moe
Scripts to create your own MoE models using MLX
☆86 Updated 8 months ago
johanndiep / language-models-trajectory-generators
This repository contains a fork of "language-models-trajectory-generators"; the goal is to test the same functionality with Mistral's LL…
☆19 Updated last month
YuchenJin / llm.c
LLM training in simple, raw C/CUDA
☆12 Updated last month
character-ai / MuKoe
☆51 Updated 6 months ago
austinsilveria / tricksy
Fast approximate inference on a single GPU with sparsity-aware offloading
☆38 Updated 10 months ago
mikex86 / tritonc
Standalone command-line tool for compiling Triton kernels
☆15 Updated last month
Jaykef / mlx-rag-gguf
Minimal, clean implementation of RAG with MLX using GGUF model weights
☆43 Updated 6 months ago
v-prgmr / mergekit
Tools for merging pretrained large language models.
☆19 Updated 4 months ago
glaive-ai / reflection_70b_training
☆16 Updated last month
deepshard / mixtral-8x7b-Inference
Eh, simple and works.
☆27 Updated 11 months ago
shivance / minbpe.c
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization, in pure C.
☆21 Updated 4 months ago
Zyphra / transformers_zamba2
☆40 Updated 3 weeks ago
diicellman / dynamite-dogs
BH hackathon
☆14 Updated 7 months ago
AtakanTekparmak / tiny_fnc_engine
tiny_fnc_engine is a minimal Python library that provides a flexible engine for calling functions extracted from an LLM.
☆37 Updated last month
Aleph-Alpha / trigrams
☆44 Updated 2 months ago
Pleias / Quest-Best-Tokens
An introduction to LLM sampling
☆18 Updated this week
frankxwang / dpo-prefix-sharing
DPO, but faster 🚀
☆20 Updated last week
joey00072 / Attention-as-graph
An alternative way of calculating self-attention
☆18 Updated 5 months ago
VikParuchuri / classified
Score LLM pretraining data with classifiers
☆55 Updated last year
mag- / gpu_benchmark
GPU benchmark
☆43 Updated last month
Cerebras / DocChat
GPT-4-level conversational QA trained in a few hours
☆55 Updated 2 months ago
drisspg / transformer_nuggets
A place to store reusable transformer components of my own creation or found on the interwebs
☆43 Updated this week