Tlntin / Qwen-TensorRT-LLM
☆602 · Updated 7 months ago
Alternatives and similar repositories for Qwen-TensorRT-LLM:
Users interested in Qwen-TensorRT-LLM are comparing it to the libraries listed below.
- Accelerate inference without tears ☆304 · Updated this week
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including … ☆237 · Updated this week
- RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications. ☆656 · Updated last month
- Export LLaMA to ONNX ☆115 · Updated 2 months ago
- ☆156 · Updated this week
- llm-export can export LLM models to ONNX. ☆270 · Updated last month
- Tongyi Qianwen (Qwen) vLLM inference deployment demo ☆543 · Updated 11 months ago
- ☆90 · Updated last year
- Best practices for training LLaMA models in Megatron-LM ☆644 · Updated last year
- ☆27 · Updated 4 months ago
- ☆319 · Updated last month
- The official repo of Pai-Megatron-Patch for large-scale LLM & VLM training, developed by Alibaba Cloud. ☆925 · Updated this week
- A streamlined and customizable framework for efficient large model evaluation and performance benchmarking ☆574 · Updated this week
- C++ implementation of Qwen-LM ☆581 · Updated 3 months ago
- Firefly Chinese LLaMA-2 large model; supports continued pre-training of Baichuan2, Llama2, Llama, Falcon, Qwen, Baichuan, InternLM, Bloom, and other large models ☆407 · Updated last year
- LLM Tuning with PEFT (SFT+RM+PPO+DPO with LoRA) ☆397 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆132 · Updated 3 months ago
- A line-by-line annotated walkthrough of the Baichuan2 code, aimed at beginners ☆212 · Updated last year
- Welcome to the "LLM-travel" repository! Explore the inner workings of large language models (LLMs) 🚀, dedicated to deeply understanding, discussing, and implementing techniques, principles, and applications related to large models ☆301 · Updated 7 months ago
- LLM Inference benchmark ☆401 · Updated 7 months ago
- A purer tokenizer with a higher compression ratio ☆470 · Updated 3 months ago
- InternEvo is an open-source lightweight training framework that aims to support model pre-training without the need for extensive dependencies… ☆363 · Updated last week
- ☆310 · Updated 9 months ago
- Optimize Qwen1.5 models with TensorRT-LLM ☆17 · Updated 9 months ago
- Use the peft library for efficient 4-bit QLoRA fine-tuning of chatGLM-6B/chatGLM2-6B, then merge the LoRA model into the base model and quantize to 4-bit (see the sketch after this list). ☆358 · Updated last year
- Text embedding ☆145 · Updated last year
- Multi-GPU ChatGLM with DeepSpeed and … ☆405 · Updated 8 months ago
- ☆305 · Updated last year
- Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052 ☆469 · Updated 11 months ago
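
The chatGLM QLoRA entry above follows the general peft/bitsandbytes QLoRA recipe: load the base model in 4-bit, attach LoRA adapters, train, and later merge the adapter back into the base weights. Below is a minimal sketch of that flow, assuming the Hugging Face transformers + peft + bitsandbytes stack; the model id, LoRA hyperparameters, and `target_modules` are illustrative assumptions, not taken from that repository.

```python
# Minimal QLoRA setup sketch (assumptions: chatglm2-6b model id, query_key_value LoRA targets)
import torch
from transformers import AutoModel, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_id = "THUDM/chatglm2-6b"  # illustrative model id

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    base_id, quantization_config=bnb_config, trust_remote_code=True
)
model = prepare_model_for_kbit_training(model)

# Attach trainable low-rank adapters; target_modules vary by architecture (assumption here)
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query_key_value"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# After SFT, the adapter can be merged into a full-precision copy of the base model
# via PeftModel.from_pretrained(...).merge_and_unload(), then re-quantized for serving.
```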