Tlntin / Qwen-TensorRT-LLM
☆625Updated last year
Alternatives and similar repositories for Qwen-TensorRT-LLM
Users that are interested in Qwen-TensorRT-LLM are comparing it to the libraries listed below
- ☆27Updated 11 months ago
- Accelerate inference without tears☆364Updated 2 weeks ago
- ☆175Updated this week
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆267Updated 2 months ago
- llm-export can export LLM models to ONNX.☆317Updated last week
- ☆90Updated 2 years ago
- RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.☆903Updated last week
- ☆508Updated last month
- export llama to onnx☆136Updated 10 months ago
- LLM Tuning with PEFT (SFT+RM+PPO+DPO with LoRA)☆436Updated 2 years ago
- Tongyi Qianwen (Qwen) vLLM inference deployment demo☆614Updated last year
- Inference code for LLaMA models☆127Updated 2 years ago
- LLM Inference benchmark☆428Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆138Updated 10 months ago
- Best practice for training LLaMA models in Megatron-LM☆659Updated last year
- Optimize QWen1.5 models with TensorRT-LLM☆17Updated last year
- ☆50Updated last year
- LLM inference service performance testing☆44Updated last year
- FlagScale is a large model toolkit based on open-sourced projects.☆364Updated last week
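- Welcome to the "LLM-travel" repository! Explore the mysteries of large language models (LLMs) 🚀. Dedicated to understanding, discussing, and implementing the techniques, principles, and applications related to large models.☆347Updated last year
- Train a Chinese vocabulary with BPE in sentencepiece and use it in transformers.☆120Updated 2 years ago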
- Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch☆444Updated last month
- PaddlePaddle custom device implementation. (Custom hardware integration for PaddlePaddle, 『飞桨』)☆97Updated this week
- Yuan 2.0 Large Language Model☆689Updated last year
- Transformer related optimization, including BERT, GPT☆39Updated 2 years ago
- CIKM2023 Best Demo Paper Award. HugNLP is a unified and comprehensive NLP library based on HuggingFace Transformer. Please hugging for NL…☆391Updated 2 years ago
- C++ implementation of Qwen-LM☆605Updated 10 months ago
- ☆129Updated 10 months ago
- optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052☆479Updated last year
- FlagPerf is an open-source software platform for benchmarking AI chips.☆352Updated 2 weeks ago