c0sogi / llama-api
An OpenAI-like LLaMA inference API
☆112 Updated last year
Alternatives and similar repositories for llama-api:
Users who are interested in llama-api are comparing it to the libraries listed below.
- Use a local LLaMA LLM or OpenAI to chat, discuss/summarize your documents, YouTube videos, and so on. ☆152 Updated 3 months ago
- Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes… ☆147 Updated last year
- An OpenAI API compatible REST server for llama. ☆205 Updated last month
- Automated prompting and scoring framework to evaluate LLMs using updated human knowledge prompts ☆113 Updated last year
- A guidance compatibility layer for llama-cpp-python ☆34 Updated last year
- Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA ☆123 Updated last year
- ☆38 Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs ☆78 Updated last year
- A fast batching API to serve LLM models ☆182 Updated 11 months ago
- ☆153 Updated 9 months ago
- Python bindings for the C++ port of GPT4All-J model. ☆38 Updated last year
- The code we currently use to fine-tune models. ☆114 Updated 11 months ago
- This repo is for handling Question Answering, especially for Multi-hop Question Answering ☆67 Updated last year
- A multimodal, function calling powered LLM webui. ☆214 Updated 6 months ago
- Local LLM ReAct Agent with Guidance ☆158 Updated last year
- Low-Rank adapter extraction for fine-tuned transformers models ☆171 Updated 11 months ago
- Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform ☆86 Updated last month
- oobabooga text-generation-webui implementation of wafflecomposite - langchain-ask-pdf-local ☆70 Updated last year
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub ☆159 Updated last year
- ☆277 Updated last year
- Visual Studio Code extension for WizardCoder ☆148 Updated last year
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆223 Updated 11 months ago
- Provide a way to use the GPT-QLLama model as an API ☆43 Updated last year
- Host the GPTQ model using AutoGPTQ as an API that is compatible with text generation UI API. ☆91 Updated last year
- Simple and fast server for GPTQ-quantized LLaMA inference ☆24 Updated last year
- Auto Data is a library designed for quick and effortless creation of datasets tailored for fine-tuning Large Language Models (LLMs). ☆99 Updated 5 months ago
- Merge Transformers language models by use of gradient parameters. ☆206 Updated 8 months ago
- ☆39 Updated last year
- An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm. ☆37 Updated last year
- An unsupervised model merging algorithm for Transformers-based language models. ☆105 Updated 11 months ago