jankais3r / LLaMA_MPS

Run LLaMA (and Stanford-Alpaca) inference on Apple Silicon GPUs.

☆585

Alternatives and similar repositories for LLaMA_MPS

Users who are interested in LLaMA_MPS are comparing it to the libraries listed below.


NouamaneTazi / bloomz.cpp
C++ implementation for BLOOM
☆809 · May 13, 2023 · Updated 2 years ago
apple / ml-ane-transformers
Reference implementation of the Transformer architecture optimized for Apple Neural Engine (ANE)
☆2,673 · Apr 25, 2023 · Updated 2 years ago
antimatter15 / alpaca.cpp
Locally run an Instruction-Tuned Chat-Style LLM
☆10,186 · Apr 19, 2023 · Updated 2 years ago
tloen / alpaca-lora
Instruct-tune LLaMA on consumer hardware
☆18,978 · Jul 29, 2024 · Updated last year
FMInference / FlexLLMGen
Running large language models on a single GPU for throughput-oriented scenarios.
☆9,384 · Oct 28, 2024 · Updated last year
tatsu-lab / stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
☆30,265 · Jul 17, 2024 · Updated last year
Lightning-AI / lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad…
☆6,087 · Jul 1, 2025 · Updated 7 months ago
tloen / llama-int8
Quantized inference code for LLaMA models
☆1,044 · Mar 17, 2023 · Updated 2 years ago
deep-diver / LLM-As-Chatbot
LLM as a Chatbot Service
☆3,332 · Nov 20, 2023 · Updated 2 years ago
qwopqwop200 / GPTQ-for-LLaMa
4 bits quantization of LLaMA using GPTQ
☆3,074 · Jul 13, 2024 · Updated last year
sahil280114 / codealpaca
☆1,505 · May 12, 2023 · Updated 2 years ago
ghgoodreau / flux-experimental
LLM Power Tool, with experimental features such as audio transcription from ElevenLabs.
☆15 · Jan 25, 2024 · Updated 2 years ago
ggml-org / llama.cpp
LLM inference in C/C++
☆94,823 · Updated this week
OpenGVLab / LLaMA-Adapter
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
☆5,936 · Mar 14, 2024 · Updated last year
apple / ml-stable-diffusion
Stable Diffusion with Core ML on Apple Silicon
☆17,790 · Jul 3, 2025 · Updated 7 months ago
cocktailpeanut / dalai
The simplest way to run LLaMA on your local machine
☆12,993 · Jun 18, 2024 · Updated last year
shawwn / llama-dl
High-speed download of LLaMA, Facebook's 65B parameter GPT model
☆4,145 · Jun 28, 2023 · Updated 2 years ago
ggml-org / ggml
Tensor library for machine learning
☆13,923 · Updated this week
teknium1 / GPTeacher
A collection of modular datasets generated by GPT-4, General-Instruct - Roleplay-Instruct - Code-Instruct - and Toolformer
☆1,630 · Sep 15, 2023 · Updated 2 years ago
lm-sys / FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
☆39,402 · Jun 2, 2025 · Updated 8 months ago
Beomi / transformers-language-modeling
Train 🤗transformers with DeepSpeed: ZeRO-2, ZeRO-3
☆23 · May 20, 2021 · Updated 4 years ago
praveen-palanisamy / webgym
WebGym: Web-browser-based tasks for RL Agents
☆23 · Feb 4, 2021 · Updated 5 years ago
Saw-mon-and-Natalie / vscode-evm-toolkit
EVM Toolkit language support for Visual Studio Code
☆18 · Sep 10, 2022 · Updated 3 years ago
philipturner / metal-flash-attention
FlashAttention (Metal Port)
☆580 · Sep 22, 2024 · Updated last year
cmp-nct / ggllm.cpp
Falcon LLM ggml framework with CPU and GPU support
☆249 · Jan 22, 2024 · Updated 2 years ago
BlinkDL / RWKV-LM
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)…
☆14,351 · Updated this week
mlc-ai / mlc-llm
Universal LLM Deployment Engine with ML Compilation
☆22,012 · Updated this week
rustformers / llm
[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models
☆6,146 · Jun 24, 2024 · Updated last year
zphang / minimal-llama
☆457 · Oct 15, 2023 · Updated 2 years ago
KyujinHan / Korean_selenium_DeepL
Code for automating Korean translation via DeepL
☆12 · Jul 27, 2023 · Updated 2 years ago
pointnetwork / point-alpaca
☆404 · Mar 22, 2023 · Updated 2 years ago
ml-explore / mlx
MLX: An array framework for Apple silicon
☆23,918 · Updated this week
songys / single_turn_dialogue
Data consisting only of dialogue example sentences extracted from dictionaries
☆16 · Apr 24, 2023 · Updated 2 years ago
aaron-wheeler / MarketGPT
MarketGPT: Developing a Pre-trained transformer (GPT) for Modeling Financial Time Series
☆17 · Sep 5, 2025 · Updated 5 months ago
artidoro / qlora
QLoRA: Efficient Finetuning of Quantized LLMs
☆10,835 · Jun 10, 2024 · Updated last year
mlc-ai / web-llm
High-performance In-browser LLM Inference Engine
☆17,258 · Updated this week
turboderp / exllama
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
☆2,911 · Sep 30, 2023 · Updated 2 years ago
gbt42 / redactAI
Private inference over your sensitive data with off-the-shelf models
☆35 · Apr 26, 2023 · Updated 2 years ago
run-llama / llama_index
LlamaIndex is the leading framework for building LLM-powered agents over your data.
☆46,977 · Updated this week
