jankais3r / LLaMA_MPS
Run LLaMA (and Stanford-Alpaca) inference on Apple Silicon GPUs.
☆584 · Updated last year
Related projects
Alternatives and complementary repositories for LLaMA_MPS
- MiniLLM is a minimal system for running modern LLMs on consumer-grade GPUs ☆866 · Updated last year
- Reference implementation of the Transformer architecture optimized for Apple Neural Engine (ANE) ☆2,559 · Updated last year
- LLaMA retrieval plugin script using OpenAI's retrieval plugin ☆324 · Updated last year
- Quantized inference code for LLaMA models ☆1,051 · Updated last year
- ☆406 · Updated last year
- C++ implementation for BLOOM ☆811 · Updated last year
- A school for camelids ☆1,208 · Updated last year
- Fork of Facebook's LLaMA model to run on CPU ☆771 · Updated last year
- A collection of modular datasets generated by GPT-4: General-Instruct, Roleplay-Instruct, Code-Instruct, and Toolformer ☆1,618 · Updated last year
- ☆453 · Updated last year
- ☆1,426 · Updated last year
- Simple UI for LLM Model Finetuning ☆2,046 · Updated 10 months ago
- ☆534 · Updated 11 months ago
- [NeurIPS 22] [AAAI 24] Recurrent Transformer-based long-context architecture. ☆755 · Updated 2 weeks ago
- LLM as a Chatbot Service ☆3,290 · Updated 11 months ago
- OpenAI-compatible Python client that can call any LLM ☆366 · Updated last year
- Finetuning Large Language Models on One Consumer GPU in 2 Bits ☆706 · Updated 5 months ago
- Locally run an Assistant-Tuned Chat-Style LLM ☆507 · Updated last year
- Salesforce open-source LLMs with 8k sequence length. ☆718 · Updated 10 months ago
- Run inference on MPT-30B using CPU ☆572 · Updated last year
- fastLLaMa: An experimental high-performance framework for running Decoder-only LLMs with 4-bit quantization in Python using a C/C++ backe… ☆410 · Updated last year
- A hyper-fast local vector database for use with LLM Agents. Now accepting SAFEs at $135M cap. ☆1,377 · Updated 2 months ago
- C++ implementation for 💫StarCoder ☆445 · Updated last year
- Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ☆2,402 · Updated 2 months ago
- Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input" ☆1,053 · Updated 8 months ago
- This repository contains code for extending the Stanford Alpaca synthetic instruction tuning to existing instruction-tuned models such as… ☆348 · Updated last year
- Chat with your favourite LLaMA models in a native macOS app ☆1,461 · Updated last year
- Inference code for LLaMA models ☆189 · Updated last year
- Alpaca dataset from Stanford, cleaned and curated ☆1,515 · Updated last year