shawwn / openai-server
OpenAI API webserver
⭐188 · Updated 3 years ago
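Many of the repositories listed below expose or consume the same OpenAI-style HTTP API that openai-server implements. As a rough, stdlib-only sketch (the model name, prompt, and parameter values here are placeholders, not taken from any specific repo), the JSON body an OpenAI-compatible client POSTs to a server's `/v1/completions` endpoint looks like:

```python
import json

# Placeholder request body for an OpenAI-style /v1/completions call.
# "local-model" is a hypothetical name; a real client would also set the
# server's base URL and send this over HTTP.
payload = {
    "model": "local-model",
    "prompt": "Hello, world",
    "max_tokens": 16,
    "temperature": 0.7,
}
body = json.dumps(payload)
print(body)
```

Because the wire format is just JSON over HTTP, "drop-in replacement" servers like several of those below can be used by pointing an existing OpenAI client at a different base URL.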
Alternatives and similar repositories for openai-server
Users interested in openai-server are comparing it to the libraries listed below.
- Drop in replacement for OpenAI, but with Open models. ⭐152 · Updated 2 years ago
- 💬 Chatbot web app + HTTP and Websocket endpoints for LLM inference with the Petals client ⭐313 · Updated last year
- fastLLaMa: An experimental high-performance framework for running Decoder-only LLMs with 4-bit quantization in Python using a C/C++ backe… ⭐409 · Updated 2 years ago
- Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA ⭐123 · Updated 2 years ago
- ⭐405 · Updated 2 years ago
- Inference code for LLaMA models ⭐188 · Updated 2 years ago
- SoTA Transformers with C-backend for fast inference on your CPU. ⭐309 · Updated last year
- ⭐130 · Updated 3 years ago
- LLaMa retrieval plugin script using OpenAI's retrieval plugin ⭐324 · Updated 2 years ago
- Extend the original llama.cpp repo to support redpajama model. ⭐118 · Updated 10 months ago
- Reimplementation of the task generation part from the Alpaca paper ⭐119 · Updated 2 years ago
- Automated prompting and scoring framework to evaluate LLMs using updated human knowledge prompts ⭐110 · Updated last year
- howdoi.ai ⭐255 · Updated 2 years ago
- An easy way to host your own AI API and expose alternative models, while being compatible with "open" AI clients. ⭐331 · Updated last year
- A repository to run gpt-j-6b on low vram machines (4.2 gb minimum vram for 2000 token context, 3.5 gb for 1000 token context). Model load… ⭐114 · Updated 3 years ago
- Embeddings focused small version of Llama NLP model ⭐103 · Updated 2 years ago
- A Simple Discord Bot for the Alpaca LLM ⭐101 · Updated 2 years ago
- Falcon LLM ggml framework with CPU and GPU support ⭐246 · Updated last year
- Command-line script for inferencing from models such as MPT-7B-Chat ⭐101 · Updated 2 years ago
- Framework agnostic python runtime for RWKV models ⭐147 · Updated last year
- LLaMA Cog template ⭐307 · Updated last year
- 4 bits quantization of SantaCoder using GPTQ ⭐51 · Updated 2 years ago
- Inference code for facebook LLaMA models with Wrapyfi support ⭐129 · Updated 2 years ago
- OpenAI-compatible Python client that can call any LLM ⭐372 · Updated 2 years ago
- Simple Annotated implementation of GPT-NeoX in PyTorch ⭐110 · Updated 2 years ago
- Python bindings for llama.cpp ⭐197 · Updated 2 years ago
- C++ implementation for BLOOM ⭐810 · Updated 2 years ago
- C++ implementation for 💫StarCoder ⭐455 · Updated last year
- Prompt programming with FMs. ⭐443 · Updated 11 months ago
- Command-line script for inferencing from models such as falcon-7b-instruct ⭐75 · Updated 2 years ago