mzbac / GPTQ-for-LLaMa-API
Provides a way to use the GPTQ LLaMA model as an API
★43 · Updated last year
Alternatives and similar repositories for GPTQ-for-LLaMa-API:
Users who are interested in GPTQ-for-LLaMa-API are comparing it to the libraries listed below:
- Conduct consumer interviews with synthetic focus groups using LLMs and LangChain ★43 · Updated last year
- A Personalised AI Assistant Inspired by 'Diamond Age', Powered by SMS ★92 · Updated last year
- The open-source autonomous agent LLM initiative ★91 · Updated last year
- ★135 · Updated last year
- ★37 · Updated last year
- A discord bot that roleplays! ★148 · Updated last year
- oobabooga/text-generation-webui implementation of wafflecomposite/langchain-ask-pdf-local ★70 · Updated last year
- Build your Swarm of Internet Agents using MultiOn ★78 · Updated last year
- A langchain app to visualise a debate using Tree-of-Thought reasoning ★60 · Updated last year
- Command-line script for inferencing from models such as MPT-7B-Chat ★101 · Updated last year
- GPT-2 small trained on phi-like data ★66 · Updated last year
- Creates a LangChain agent which uses the WebUI's API and Wikipedia to work ★74 · Updated last year
- An Extension for oobabooga/text-generation-webui ★36 · Updated last year
- Deploy your GGML models to HuggingFace Spaces with Docker and gradio ★36 · Updated last year
- Simple and fast server for GPTQ-quantized LLaMA inference ★24 · Updated last year
- Harnessing the Memory Power of the Camelids ★146 · Updated last year
- ★54 · Updated last year
- Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA ★123 · Updated last year
- BabyAGI-🦙: Enhanced for Llama models (running 100% local) and persistent memory, with smart internet search based on BabyCatAGI and docu… ★88 · Updated last year
- A KoboldAI-like memory extension for oobabooga's text-generation-webui ★108 · Updated 6 months ago
- QLoRA: Efficient Finetuning of Quantized LLMs ★78 · Updated last year
- Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes… ★146 · Updated last year
- ★35 · Updated last year
- A Flask extension to manage LangChain chat memory and document stores in Flask apps. ★71 · Updated last year
- A backend API to perform search over Wikipedia using LangChain, Cohere and Weaviate ★105 · Updated last year
- Example of calling OpenRouter from a Streamlit app ★95 · Updated last year
- Model REVOLVER, a human in the loop model mixing system. ★33 · Updated last year
- An implementation of long term memory and external tools for LLMs ★73 · Updated 2 years ago
- Load local LLMs effortlessly in a Jupyter notebook for testing purposes alongside LangChain or other agents. Contains Oobabooga and Kobol… ★213 · Updated last year
- 100% Private & Simple. OSS Code Interpreter for LLMs ★35 · Updated last year