xaedes / llama.cpp

Port of Facebook's LLaMA model in C/C++

☆20

Related projects

Alternatives and complementary repositories for llama.cpp

chu-tianxiang / QuIP-for-all
QuIP quantization
☆46 Updated 7 months ago
serp-ai / Parameter-Efficient-MoE
Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks
☆31 Updated 5 months ago
zarakiquemparte / zaraki-tools
☆27 Updated last year
bjj / exllamav2-openai-server
An OpenAI API compatible LLM inference server based on ExLlamaV2.
☆22 Updated 9 months ago
Hellisotherpeople / llm_steer-oobabooga
Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto…
☆42 Updated 7 months ago
austinsilveria / tricksy
Fast approximate inference on a single GPU with sparsity aware offloading
☆38 Updated 10 months ago
xjdr-alt / muzero_sketch
☆36 Updated 3 months ago
joey00072 / ohara
Collection of autoregressive model implementations
☆66 Updated last week
Birch-san / booru-embed
[WIP] Transformer to embed Danbooru labelsets
☆13 Updated 7 months ago
Algomancer / The-Daily-Train
Training Models Daily
☆17 Updated 10 months ago
GreenBitAI / green-bit-llm
A toolkit for fine-tuning, inferencing, and evaluating GreenBitAI's LLMs.
☆73 Updated 3 weeks ago
kaiokendev / cutoff-len-is-context-len
Demonstration that fine-tuning a RoPE model on sequences longer than its pre-training context adapts the model's context limit
☆63 Updated last year
Zyphra / zcookbook
Training hybrid models for dummies.
☆15 Updated 2 weeks ago
the-crypt-keeper / the-muse
Experimental sampler to make LLMs more creative
☆30 Updated last year
Cerebras / DocChat
GPT-4 Level Conversational QA Trained In a Few Hours
☆54 Updated 2 months ago
silphendio / sliced_llama
Simple LLM inference server
☆17 Updated 5 months ago
kyegomez / SelfExtend
Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" in PyTorch and Zeta
☆13 Updated this week
NolanoOrg / SpectraSuite
☆43 Updated 3 months ago
euclaise / supertrainer2000
☆49 Updated 8 months ago
andrew-silva / mlx-rlhf
An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace.
☆21 Updated 4 months ago
nanowell / Q-Sparse-LLM
My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
☆30 Updated 3 months ago
lachlansneff / sparsellama
☆40 Updated last year
SonicCodes / subcloning
Implementation of https://arxiv.org/pdf/2312.09299
☆19 Updated 4 months ago
NousResearch / StripedHyenaTrainer
☆55 Updated 11 months ago
matthewrenze / jhu-concise-cot
The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models
☆20 Updated 9 months ago
xjdr-alt / llmri
look how they massacred my boy
☆54 Updated 3 weeks ago
cognitivecomputations / kraken
☆64 Updated 5 months ago
arcee-ai / DAM
☆40 Updated last week
Digitous / ModelREVOLVER
Model REVOLVER, a human-in-the-loop model mixing system.
☆33 Updated last year