Finetune llama2-70b and codellama on MacBook Air without quantization
☆450Mar 28, 2024Updated 2 years ago
Alternatives and similar repositories for slowllama
Users who are interested in slowllama are comparing it to the libraries listed below.
- Llama 2 Everywhere (L2E)☆1,526Aug 27, 2025Updated 8 months ago
- Finetune ALL LLMs with ALL Adapters on ALL Platforms!☆332Jul 23, 2025Updated 9 months ago
- Seamlessly integrate LLMs as Python functions☆2,406Mar 11, 2026Updated 2 months ago
- Horizon chart for CPU/GPU/Neural Engine utilization monitoring. Supports Apple M1-M4, Nvidia GPUs, AMD GPUs☆28Dec 30, 2025Updated 4 months ago
- A fast inference library for running LLMs locally on modern consumer-class GPUs☆4,521Mar 4, 2026Updated 2 months ago
- A simple "Be My Eyes" web app with a llama.cpp/llava backend☆495Nov 28, 2023Updated 2 years ago
- Turn expensive prompts into cheap fine-tuned models☆2,802May 25, 2024Updated last year
- Toolkit for fine-tuning, ablating and unit-testing open-source LLMs.☆871May 4, 2026Updated 2 weeks ago
- A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for vario…☆1,052Feb 27, 2025Updated last year
- An extensible, easy-to-use, and portable diffusion web UI 👨🎨☆1,670Aug 18, 2023Updated 2 years ago
- pykoi: Active learning in one unified interface☆410Sep 24, 2025Updated 7 months ago
- Simple UI for LLM Model Finetuning☆2,057Dec 21, 2023Updated 2 years ago
- Lightweight inference library for ONNX files, written in C++. It can run Stable Diffusion XL 1.0 on a RPI Zero 2 (or in 298MB of RAM) but…☆2,076Jan 20, 2026Updated 4 months ago
- Agents Capable of Self-Editing Their Prompts / Python Code☆812Mar 15, 2024Updated 2 years ago
- Experimentation with Streamlit for personal LLM tool☆15Jun 19, 2023Updated 2 years ago
- Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale.☆866Jan 15, 2024Updated 2 years ago
- Locust on k8s example for scalable load tests☆14Apr 16, 2022Updated 4 years ago
- Structured Outputs☆13,846Updated this week
- Count Tokens of Code (forked from gocloc)☆45Aug 19, 2024Updated last year
- ☆3,364Feb 25, 2024Updated 2 years ago
- ☆25Sep 19, 2023Updated 2 years ago
- CodeTF: One-stop Transformer Library for State-of-the-art Code LLM☆1,480May 1, 2025Updated last year
- AutoChain: Build lightweight, extensible, and testable LLM Agents☆1,875Dec 16, 2025Updated 5 months ago
- Create and share easy-to-make, built-to-last, innovative, and customizable experiences☆33Feb 21, 2024Updated 2 years ago
- Flacuna was developed by fine-tuning Vicuna on Flan-mini, a comprehensive instruction collection encompassing various tasks. Vicuna is al…☆112Sep 10, 2023Updated 2 years ago
- ☆614Mar 4, 2024Updated 2 years ago
- Distribute and run LLMs with a single file.☆24,451Updated this week
- Fine-tune LLM agents with online reinforcement learning☆1,251Mar 19, 2024Updated 2 years ago
- ☆1,275Oct 24, 2023Updated 2 years ago
- Examples in the MLX framework☆8,593Apr 6, 2026Updated last month
- LLMs built upon Evol Instruct: WizardLM, WizardCoder, WizardMath☆9,483Jun 7, 2025Updated 11 months ago
- 💭 Chat with AI via API☆33Oct 20, 2024Updated last year
- [ICLR 2024] Efficient Streaming Language Models with Attention Sinks☆7,229Jul 11, 2024Updated last year
- 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading☆10,131Sep 7, 2024Updated last year
- An LLM-powered advanced RAG pipeline built from scratch☆857Jan 26, 2024Updated 2 years ago
- Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.☆4,083Jan 8, 2025Updated last year
- Finetune an LLM to speak like you based on your WhatsApp Conversations☆378May 5, 2024Updated 2 years ago
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.☆2,921Sep 30, 2023Updated 2 years ago
- An LLM-based autonomous agent controlling real-world applications via RESTful APIs☆1,396Jun 7, 2024Updated last year