huggingface / text-generation-inferenceLinks

Large Language Model Text Generation Inference

☆10,605

Alternatives and similar repositories for text-generation-inference

Users that are interested in text-generation-inference are comparing it to the libraries listed below

Sorting:

AutoGPTQ / AutoGPTQ
An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm.
☆4,977Updated 6 months ago
mit-han-lab / streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
☆7,094Updated last year
artidoro / qlora
QLoRA: Efficient Finetuning of Quantized LLMs
☆10,719Updated last year
skypilot-org / skypilot
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 17+ clouds, o…
☆8,884Updated this week
axolotl-ai-cloud / axolotl
Go ahead and axolotl questions
☆10,673Updated this week
abetlen / llama-cpp-python
Python bindings for llama.cpp
☆9,678Updated 2 months ago
turboderp-org / exllamav2
A fast inference library for running LLMs locally on modern consumer-class GPUs
☆4,353Updated 2 months ago
openlm-research / open_llama
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
☆7,523Updated 2 years ago
bitsandbytes-foundation / bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
☆7,687Updated last week
huggingface / text-embeddings-inference
A blazing fast inference solution for text embeddings models
☆4,131Updated 3 weeks ago
meta-pytorch / torchtune
PyTorch native post-training library
☆5,564Updated this week
togethercomputer / RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
☆4,834Updated 10 months ago
nlpxucan / WizardLM
LLMs build upon Evol Insturct: WizardLM, WizardCoder, WizardMath
☆9,459Updated 4 months ago
huggingface / trl
Train transformer language models with reinforcement learning.
☆16,012Updated last week
ggml-org / ggml
Tensor library for machine learning
☆13,332Updated last week
bentoml / OpenLLM
Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud.
☆11,883Updated this week
arcee-ai / mergekit
Tools for merging pretrained large language models.
☆6,412Updated this week
EleutherAI / lm-evaluation-harness
A framework for few-shot evaluation of language models.
☆10,488Updated this week
guidance-ai / guidance
A guidance language for controlling large language models.
☆20,882Updated 2 weeks ago
huggingface / alignment-handbook
Robust recipes to align language models with human and AI preferences
☆5,406Updated last month
Lightning-AI / lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad…
☆6,082Updated 3 months ago
huggingface / peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
☆19,900Updated last week
mistralai / mistral-inference
Official inference library for Mistral models
☆10,521Updated 7 months ago
turboderp / exllama
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
☆2,903Updated 2 years ago
sgl-project / sglang
SGLang is a fast serving framework for large language models and vision language models.
☆19,462Updated this week
bigcode-project / starcoder
Home of StarCoder: fine-tuning & inference!
☆7,466Updated last year
microsoft / LMOps
General technology for enabling AI capabilities w/ LLMs and MLLMs
☆4,157Updated 4 months ago
NVIDIA / TensorRT-LLM
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizat…
☆11,955Updated last week
InternLM / lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
☆7,199Updated last week
Lightning-AI / litgpt
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
☆12,861Updated last week