openvino-dev-samples / Qwen2.openvino
This sample shows how to deploy Qwen2 using OpenVINO
☆38 · Updated 8 months ago
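For orientation, a minimal sketch of what "deploy Qwen2 using OpenVINO" typically looks like via the optimum-intel integration. The repository's own sample may use a different workflow, and the `Qwen/Qwen2-7B-Instruct` checkpoint here is an illustrative assumption, not taken from the repo.

```python
# Minimal sketch: load a Qwen2 checkpoint through OpenVINO with optimum-intel.
# Assumes `pip install optimum[openvino]`; the checkpoint name is illustrative.
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "Qwen/Qwen2-7B-Instruct"  # assumed checkpoint, not stated in the listing
tokenizer = AutoTokenizer.from_pretrained(model_id)

# export=True converts the Hugging Face weights to OpenVINO IR on the fly.
model = OVModelForCausalLM.from_pretrained(model_id, export=True)

# Build a chat prompt and run greedy generation on the OpenVINO runtime (CPU by default).
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "What is OpenVINO?"}],
    add_generation_prompt=True,
    return_tensors="pt",
)
output_ids = model.generate(prompt, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```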
Alternatives and similar repositories for Qwen2.openvino
Users interested in Qwen2.openvino are comparing it to the libraries listed below.
- Run ChatGLM2-6B on the BM1684X ☆49 · Updated last year
- Run ChatGLM3-6B on the BM1684X ☆39 · Updated last year
- Research on accelerating the GOT-OCR project for real-world deployment, not limited to any particular language ☆60 · Updated 7 months ago
- Alpaca Chinese Dataset -- a Chinese instruction fine-tuning dataset ☆205 · Updated 8 months ago
- Qwen2 and Llama 3 C++ implementation ☆44 · Updated last year
- Comparison of LLM API performance metrics - an in-depth analysis of key metrics such as TTFT and TPS ☆17 · Updated 8 months ago
- NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM ☆42 · Updated last year
- Deploy a large language model on Android phones with MNN-llm: Qwen1.5-0.5B-Chat ☆79 · Updated last year
- Pure C++ cross-platform LLM acceleration library, callable from Python, supporting baichuan, glm, llama, and moss base models; runs chatglm-6B-class models smoothly on mobile phones and reaches 10000+ tokens/s on a single GPU ☆45 · Updated last year
- ☆41 · Updated 2 months ago
- LLM deployment in practice: TensorRT-LLM, Triton Inference Server, vLLM ☆26 · Updated last year
- Programming with a local large language model. ☆17 · Updated last month
- ☆90 · Updated last year
- Port of Facebook's LLaMA model in C/C++ ☆52 · Updated last month
- Shared data, prompt data, and pretraining data ☆36 · Updated last year
- Deploy the DeepSeek-R1-distilled 1.5B model on Android phones ☆21 · Updated 4 months ago
- unify-easy-llm (ULM) aims to be a simple, one-click training tool for large models, supporting different hardware such as Nvidia GPUs and Ascend NPUs as well as commonly used large models. ☆55 · Updated 10 months ago
- Finetune Llama 3, Mistral & Gemma LLMs 2-5x faster with 80% less memory ☆27 · Updated last year
- This is Microsoft-Phi-3-NvidiaNIMWorkshop ☆22 · Updated 9 months ago
- GLM Series Edge Models ☆142 · Updated 3 months ago
- Baichuan and Baichuan2 finetuning, plus Alpaca finetuning ☆32 · Updated 2 months ago
- ☢️ TensorRT 2023 second round: Llama model inference acceleration and optimization based on TensorRT-LLM ☆48 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs (see the sketch after this list) ☆136 · Updated 6 months ago
- A simple, lightweight large language model pipeline framework. ☆25 · Updated last month
- LLM101n: Let's build a Storyteller (Chinese edition) ☆131 · Updated 9 months ago
- Python3 package for Chinese/English OCR, with the paddleocr-v4 onnx model (~14MB). Inference with the ppocr-v4-onnx model delivers millisecond-level, accurate OCR on CPU; Chinese/English OCR in general scenarios reaches open-source SO… ☆82 · Updated 4 months ago
- The newest version of Llama 3, with the source code explained line by line in Chinese ☆22 · Updated last year
- A toolkit for running on-device large language models (LLMs) in apps ☆72 · Updated 11 months ago
- Deploy your own OpenAI API 🤩, based on Flask and transformers (using the Baichuan2-13B-Chat-4bits model, which can run on a single Tesla T4 GPU); implements the OpenAI Chat, Models, and Completions endpoints, including streaming respon… ☆93 · Updated last year
- It's an open-source LLM based on an MoE structure. ☆58 · Updated 11 months ago
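The high-throughput inference and serving engine entry above is not named in this listing; assuming it refers to vLLM (an assumption on my part), a minimal offline-inference sketch looks like the following. The model id is again an illustrative choice.

```python
# Hypothetical usage sketch, assuming the serving engine listed above is vLLM.
# The checkpoint name is illustrative, not something stated in the listing.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2-7B-Instruct")            # loads the model into the engine
params = SamplingParams(temperature=0.7, max_tokens=128)

# Batch generation: vLLM schedules all prompts through its paged-attention engine.
outputs = llm.generate(["Explain what OpenVINO is in one sentence."], params)
for out in outputs:
    print(out.outputs[0].text)
```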