Tlntin / Qwen-TensorRT-LLM
☆616Updated last year
Alternatives and similar repositories for Qwen-TensorRT-LLM
Users that are interested in Qwen-TensorRT-LLM are comparing it to the libraries listed below
- ☆27Updated 9 months ago
- Accelerate inference without tears☆322Updated 4 months ago
- ☆172Updated this week
- llm-export can export LLM models to ONNX.☆301Updated 6 months ago
- export llama to onnx☆131Updated 7 months ago
- ☆90Updated 2 years ago
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆263Updated last week
- LLM Tuning with PEFT (SFT+RM+PPO+DPO with LoRA)☆429Updated last year
- Qwen (Tongyi Qianwen) vLLM inference deployment demo☆595Updated last year
- Inference code for LLaMA models☆122Updated last year
- RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.☆830Updated last week
- FlagScale is a large model toolkit based on open-sourced projects.☆336Updated this week
- LLM Inference benchmark☆424Updated last year
- ☆477Updated this week
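- Welcome to the "LLM-travel" repository! Explore the mysteries of large language models (LLMs) 🚀. Dedicated to understanding, discussing, and implementing the technologies, principles, and applications related to large models.☆331Updated last year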
- [EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a V…☆528Updated last week
- LLaMa/RWKV onnx models, quantization and testcase☆363Updated 2 years ago
- ☆50Updated 9 months ago
- This is a repository used by individuals to experiment and reproduce the pre-training process of LLM.☆460Updated 3 months ago
- ☆52Updated this week
- Best practice for training LLaMA models in Megatron-LM☆659Updated last year
- LLM101n: Let's build a Storyteller (Chinese edition)☆132Updated 11 months ago
- LLM inference service performance testing☆44Updated last year
- Community maintained hardware plugin for vLLM on Ascend☆946Updated last week
- a lightweight LLM model inference framework☆734Updated last year
- LLM deployment project based on MNN. This project has been merged into MNN.☆1,601Updated 6 months ago
- Deploy your own OpenAI API 🤩, based on Flask and transformers (uses the Baichuan2-13B-Chat-4bits model and can run on a single Tesla T4 GPU); implements the OpenAI Chat, Models, and Completions endpoints, including streaming resp…☆94Updated last year
- OpenLLMWiki: Docs of OpenLLMAI. Survey, reproduction and domain/task adaptation of open source chatgpt alternatives/implementations. PiXi…☆260Updated 7 months ago
- vLLM documentation in Simplified Chinese☆90Updated 2 months ago
- The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.☆1,269Updated last week