okuvshynov / slowllama
Finetune llama2-70b and codellama on MacBook Air without quantization
☆447 Updated last year
Alternatives and similar repositories for slowllama
Users that are interested in slowllama are comparing it to the libraries listed below
- Visualize the intermediate output of Mistral 7B ☆363 Updated 4 months ago
- Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale. ☆858 Updated last year
- Fine-tune mistral-7B on 3090s, a100s, h100s ☆713 Updated last year
- Agents Capable of Self-Editing Their Prompts / Python Code ☆768 Updated last year
- LLM Analytics ☆664 Updated 7 months ago
- Stateful load balancer custom-tailored for llama.cpp 🏓🦙 ☆767 Updated last week
- LLaMa retrieval plugin script using OpenAI's retrieval plugin ☆323 Updated 2 years ago
- A simple "Be My Eyes" web app with a llama.cpp/llava backend ☆488 Updated last year
- Bayesian Optimization as a Coverage Tool for Evaluating LLMs. Accurate evaluation (benchmarking) that's 10 times faster with just a few l… ☆285 Updated this week
- DataDM is your private data assistant. Slide into your data's DMs ☆385 Updated 8 months ago
- Complex LLM Workflows from Simple JSON. ☆301 Updated last year
- A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and full… ☆615 Updated 2 months ago
- A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for vario… ☆1,013 Updated 3 months ago
- Fast parallel LLM inference for MLX ☆189 Updated 10 months ago
- a small code base for training large models ☆300 Updated last month
- Build, Improve Performance, and Productionize your LLM Application with an Integrated Framework ☆339 Updated 6 months ago
- Stop messing around with finicky sampling parameters and just use DRµGS! ☆349 Updated last year
- Accelerate your Hugging Face Transformers 7.6-9x. Native to Hugging Face and PyTorch. ☆682 Updated 9 months ago
- llama3.np is a pure NumPy implementation for Llama 3 model. ☆983 Updated last month
- JS tokenizer for LLaMA 1 and 2 ☆352 Updated 11 months ago
- LLMFlows - Simple, Explicit and Transparent LLM Apps ☆695 Updated 3 months ago
- An implementation of bucketMul LLM inference ☆217 Updated 11 months ago
- Like grep but for natural language questions. Based on Mistral 7B or Mixtral 8x7B. ☆377 Updated last year
- Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML) ☆567 Updated last year
- Action library for AI Agent ☆214 Updated 2 months ago
- Fine-tune LLM agents with online reinforcement learning ☆1,191 Updated last year
- Customizable implementation of the self-instruct paper. ☆1,043 Updated last year
- ☆744 Updated last year
- ☆157 Updated 10 months ago
- LLM plugin for running models using MLC ☆186 Updated last year