remixer-dec / llama-mps
Experimental fork of Facebook's LLaMA model that runs with GPU acceleration on Apple Silicon M1/M2
☆86Updated last year
Alternatives and similar repositories for llama-mps
Users who are interested in llama-mps are comparing it to the libraries listed below.
- LLM plugin for running models using MLC☆187Updated last year
- A library for incremental loading of large PyTorch checkpoints☆56Updated 2 years ago
- Python notebook to run OpenAI's Whisper model with speaker identification☆80Updated 2 years ago
- Tiny inference-only implementation of LLaMA☆93Updated last year
- Visualize text embeddings☆40Updated last year
- ☆135Updated last year
- Array-Inspired Pipeline Language☆119Updated last year
- Code repo for "Most Language Models can be Poets too: An AI Writing Assistant and Constrained Text Generation Studio" at the (CAI2) wor…☆211Updated last year
- A web-app to explore topics using LLM (less typing and more clicks)☆68Updated last year
- Full finetuning of large language models without large memory requirements☆94Updated last year
- What if an HNSW index was just a file, and you could serve it from a CDN, and search it directly in the browser?☆106Updated 2 months ago
- LLaMa retrieval plugin script using OpenAI's retrieval plugin☆324Updated 2 years ago
- An implementation of bucketMul LLM inference☆217Updated 11 months ago
- Vector search dictionary definitions☆44Updated 2 years ago
- inference code for mixtral-8x7b-32kseqlen☆100Updated last year
- Preprint: Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning☆28Updated last year
- Praetor is a lightweight finetuning data and prompt management tool☆67Updated 7 months ago
- Port of Facebook's LLaMA model in C/C++☆45Updated 2 years ago
- Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA☆123Updated 2 years ago
- ☆40Updated 2 years ago
- Simple embedding -> text model trained on a small subset of Wikipedia sentences.☆152Updated last year
- LLM plugin for clustering embeddings☆76Updated last year
- For inferring and serving local LLMs using the MLX framework☆104Updated last year
- ☆38Updated last year
- Extensible AI assistant platform that bridges LLMs to tasks and actions☆38Updated 2 years ago
- hnsqlite integrates hnswlib and sqlite for simple text embedding search☆160Updated last year
- [Added T5 support to TRLX] A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)☆47Updated 2 years ago
- WebGPU LLM inference tuned by hand☆151Updated last year
- A fast minimalistic implementation of guided generation on Apple Silicon using Outlines and MLX☆53Updated last year
- an implementation of Self-Extend, to expand the context window via grouped attention☆119Updated last year