vtuber-plan / olah
Self-hosted Hugging Face mirror service.
☆103Updated last week
Alternatives and similar repositories for olah:
Users who are interested in olah are comparing it to the libraries listed below.
- A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends.☆25Updated 2 weeks ago
- LM inference server implementation based on *.cpp.☆66Updated this week
- Open Source Text Embedding Models with OpenAI Compatible API☆142Updated 6 months ago
- Autoscale LLM (vLLM, SGLang, LMDeploy) inferences on Kubernetes (and others)☆248Updated last year
- Self-hosted LLM chatbot arena, with yourself as the only judge☆36Updated 11 months ago
- LLM steganography with minimum-entropy coupling - Hiding encrypted messages in natural language.☆78Updated 4 months ago
- Sentence Transformers API: An OpenAI compatible embedding API server☆41Updated 4 months ago
- An OpenAI Completions API compatible server for NLP transformers models☆62Updated last year
- Self-host LLMs with vLLM and BentoML☆79Updated last week
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.☆60Updated 9 months ago
- Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform☆82Updated 3 weeks ago
- Dagger functions to import Hugging Face GGUF models into a local ollama instance and optionally push them to ollama.com.☆113Updated 8 months ago
- Benchmarking suite for popular AI APIs☆80Updated 2 months ago
- Review/Check GGUF files and estimate the memory usage and maximum tokens per second.☆74Updated 2 weeks ago
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs☆184Updated last month
- A local implementation of the OpenAI Assistants API: myla stands for MY Local Assistant☆50Updated 4 months ago
- Formatron empowers everyone to control the format of language models' output with minimal overhead.☆177Updated 3 weeks ago
- A lightweight script for converting HTML pages to Markdown, with support for code blocks☆78Updated 9 months ago
- LLM based agents with proactive interactions, long-term memory, external tool integration, and local deployment capabilities.☆94Updated this week
- A third-party component library based on Gradio.☆73Updated this week
- OpenAI compatible API for TensorRT LLM triton backend☆187Updated 5 months ago
- Clone of https://r.jina.ai which is deployable locally☆34Updated 4 months ago
- Automatically quantize GGUF models☆151Updated this week
- Comparison of Language Model Inference Engines☆203Updated last month
- Evaluating and unaligning Chinese LLM censorship☆46Updated 4 months ago