bjj / exllamav2-openai-server
An OpenAI API compatible LLM inference server based on ExLlamaV2.
☆25 Updated last year
Alternatives and similar repositories for exllamav2-openai-server:
Users who are interested in exllamav2-openai-server are comparing it to the repositories listed below.
- ☆27 Updated last year
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ☆43 Updated last year
- All the world is a play, we are but actors in it. ☆49 Updated this week
- Model REVOLVER, a human in the loop model mixing system. ☆33 Updated last year
- Local LLM inference & management server with built-in OpenAI API ☆31 Updated last year
- ☆22 Updated 11 months ago
- Train Llama Loras Easily ☆31 Updated last year
- A guidance compatibility layer for llama-cpp-python ☆34 Updated last year
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks. ☆34 Updated last year
- Simple, Fast, Parallel Huggingface GGML model downloader written in Python ☆24 Updated last year
- ☆38 Updated last year
- ☆31 Updated last year
- A clone of OpenAI's Tokenizer page for HuggingFace Models ☆45 Updated last year
- 5X faster 60% less memory QLoRA finetuning ☆21 Updated 11 months ago
- Experimental sampler to make LLMs more creative ☆31 Updated last year
- ☆16 Updated last year
- Let's create synthetic textbooks together :) ☆74 Updated last year
- Deploy your GGML models to HuggingFace Spaces with Docker and Gradio ☆36 Updated last year
- An unsupervised model merging algorithm for Transformers-based language models. ☆105 Updated last year
- Accepts a Hugging Face model URL, automatically downloads and quantizes it using Bits and Bytes. ☆38 Updated last year
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 Updated 11 months ago
- GPT-2 small trained on phi-like data ☆66 Updated last year
- ☆20 Updated last year
- ☆50 Updated last year
- Lightweight continuous batching OpenAI compatibility using HuggingFace Transformers, including T5 and Whisper. ☆22 Updated last month
- Modified Beam Search with periodical restart ☆12 Updated 7 months ago
- Modified Stanford-Alpaca Trainer for Training Replit's Code Model ☆40 Updated last year
- entropix style sampling + GUI ☆26 Updated 6 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆22 Updated 5 months ago
- Full finetuning of large language models without large memory requirements ☆94 Updated last year