Fuzzy-Search / realtime-bakllava
llama.cpp with the BakLLaVA model, describing what it sees
☆384 · Updated last year
Alternatives and similar repositories for realtime-bakllava:
Users interested in realtime-bakllava are comparing it to the libraries listed below.
- An MLX project to train a base model on your WhatsApp chats using (Q)LoRA fine-tuning ☆164 · Updated last year
- Run inference on the replit-3B code-instruct model using CPU ☆154 · Updated last year
- ☆279 · Updated 7 months ago
- Generate synthetic data using OpenAI, MistralAI, or AnthropicAI ☆224 · Updated 10 months ago
- ☆136 · Updated last year
- A simple UI / web frontend for MLX mlx-lm using Streamlit ☆245 · Updated last month
- A multimodal, function-calling-powered LLM web UI ☆215 · Updated 5 months ago
- LLaVA server (llama.cpp) ☆178 · Updated last year
- Fine-tune SDXL on YouTube videos ☆175 · Updated 7 months ago
- An autonomous LLM agent that runs on Wizcoder-15B ☆336 · Updated 5 months ago
- Bespoke Automata is a GUI and deployment pipeline for building complex AI agents locally and offline ☆222 · Updated 9 months ago
- AI device template featuring Whisper, TTS, Groq, Llama3, OpenAI, and more ☆288 · Updated 8 months ago
- Fluid Database ☆114 · Updated 6 months ago
- An OpenAI-API-compatible API for chat with image input and questions about the images, i.e. multimodal ☆232 · Updated 2 weeks ago
- ☆708 · Updated last year
- ☆77 · Updated last year
- Function-calling-based LLM agents ☆283 · Updated 6 months ago
- A fast batching API to serve LLM models ☆182 · Updated 10 months ago
- Chrome extension to chat with a page using a local LLM (Llama, Mistral 7B, etc.) ☆174 · Updated last year
- ☆201 · Updated 9 months ago
- The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs), allowing users to chat with LLM … ☆545 · Updated last month
- FastMLX is a high-performance, production-ready API to host MLX models ☆272 · Updated 2 weeks ago
- Scripts to create your own MoE models using MLX ☆89 · Updated last year
- Run PaliGemma in real time ☆131 · Updated 10 months ago
- Local semantic search. Stupidly simple. ☆416 · Updated 8 months ago
- Landmark Attention: Random-Access Infinite Context Length for Transformers (QLoRA) ☆123 · Updated last year
- Edge full-stack LLM platform, written in Rust ☆377 · Updated 9 months ago
- The open-source implementation of Q*, achieved in context as a zero-shot reprogramming of the attention mechanism (synthetic data) · Updated 3 months ago
- TuneAI, or "autoFinetune", is an effortless way to fine-tune an OpenAI model based on YouTube or text input, automating transcript cleaning… ☆241 · Updated last year
- Harnessing the Memory Power of the Camelids ☆146 · Updated last year