perk11 / large-model-proxy
Large Model Proxy is designed to make it easy to run multiple resource-heavy Large Models (LMs) on the same machine with a limited amount of VRAM or other resources. It listens on a dedicated port for each proxied LM, so every model is always available to clients connecting to its port.
☆45 Updated last month
Related projects
Alternatives and complementary repositories for large-model-proxy
- ☆25 Updated last month
- Easily view and modify JSON datasets for large language models ☆62 Updated last month
- A stock market bot that automatically, once a day, rebalances your Robinhood portfolio by gathering information about each ticker in the … ☆34 Updated last week
- ☆95 Updated last week
- HTTP proxy for on-demand model loading with llama.cpp (or other OpenAI compatible backends) ☆32 Updated last week
- idea: https://github.com/nyxkrage/ebook-groupchat/ ☆81 Updated 2 months ago
- CaSIL is an advanced natural language processing system that implements a sophisticated four-layer semantic analysis architecture. It pro… ☆55 Updated this week
- ☆110 Updated 2 weeks ago
- LLM backed Fantasy Tribe Game ☆17 Updated this week
- Gradio based tool to run opensource LLM models directly from Huggingface ☆87 Updated 4 months ago
- SoftWhisper simplifies audio and video transcription using the powerful Whisper model. Easily select custom models, languages, and tasks,… ☆31 Updated last month
- A frontend for creative writing with LLMs ☆106 Updated 3 months ago
- After my server ui improvements were successfully merged, consider this repo a playground for experimenting, tinkering and hacking around… ☆56 Updated 2 months ago
- A simple experiment on letting two local LLM have a conversation about anything! ☆91 Updated 4 months ago
- LlamaCards is a web application that provides a dynamic interface for interacting with LLM models in real-time. This app allows users to … ☆35 Updated 2 months ago
- Something similar to Apple Intelligence? ☆57 Updated 4 months ago
- ☆103 Updated 7 months ago
- "a towel is about the most massively useful thing an interstellar AI hitchhiker can have" ☆46 Updated last month
- ☆30 Updated 6 months ago
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆25 Updated last week
- 5X faster 60% less memory QLoRA finetuning ☆21 Updated 5 months ago
- ☆18 Updated 2 weeks ago
- Mycomind Daemon: A mycelium-inspired, advanced Mixture-of-Memory-RAG-Agents (MoMRA) cognitive assistant that combines multiple AI models … ☆30 Updated 3 months ago
- Complex RAG backend ☆28 Updated 7 months ago
- A fast batching API to serve LLM models ☆172 Updated 6 months ago
- This small API downloads and exposes access to NeuML's txtai-wikipedia and full wikipedia datasets, taking in a query and returning full … ☆45 Updated 3 months ago
- Experience the power of AI with this free AI voice generator demo. Utilizing Deepgram and Groq, we transform text into voice seamlessly. … ☆37 Updated 4 months ago
- ☆17 Updated 2 weeks ago
- Who needs o1 anyways. Add CoT to any OpenAI compatible endpoint. ☆39 Updated last month
- Synthify: Seamlessly generate ai datasets with a no-code UI ☆44 Updated 3 months ago