owenliang / mnist-onnx-runtime
MoE model with onnx runtime
☆44Updated last year
Alternatives and similar repositories for mnist-onnx-runtime
Users who are interested in mnist-onnx-runtime are comparing it to the repositories listed below
- ☢️ TensorRT 2023 second round: accelerating and optimizing Llama model inference based on TensorRT-LLM☆48Updated last year
- LLM Tokenizer with BPE algorithm☆31Updated last year
- LLM101n: Let's build a Storyteller, Chinese edition☆131Updated 9 months ago
- Hands-on large model deployment: TensorRT-LLM, Triton Inference Server, vLLM☆26Updated last year
- run ChatGLM2-6B in BM1684X☆49Updated last year
- Inference code for LLaMA models☆121Updated last year
- ☆120Updated 2 years ago
- Performance testing for LLM inference services☆40Updated last year
- AI training courseware resources☆97Updated this week
- qwen models finetuning☆98Updated 2 months ago
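- vLLM Documentation in Simplified Chinese☆77Updated 3 weeks ago
- Helps beginners quickly get started with, use, and get used to the official documentation of the OpenMMLab open-source libraries, so that they can run experiments on their own and freely choose to read up on deeper topics.☆63Updated 2 years ago
- Implement a miniLLM from scratch (hands-on LLM learning)☆70Updated last year
- Qwen1.5-SFT (Alibaba, Ali), Qwen_Qwen1.5-2B-Chat/Qwen_Qwen1.5-7B-Chat fine-tuning (transformers) / LORA (peft) / inference☆62Updated last year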
- ☆41Updated 2 months ago
- Companion data and code for the book 《自然语言处理:大模型理论与实践》 (Natural Language Processing: Theory and Practice of Large Models)☆63Updated 5 months ago
- ☆43Updated 9 months ago
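- Personal reading notes on papers about large models, natural language processing (NLP), multimodal learning, computer vision (CV), and related areas; excellent open-source repositories collected or used in NLP, CV, and other fields; plus other resources such as datasets and evaluation leaderboards☆48Updated this week
- unify-easy-llm (ULM) aims to be a simple one-click training tool for large models, supporting different hardware such as Nvidia GPUs and Ascend NPUs as well as commonly used large models.☆55Updated 10 months ago
- yolo master: this course introduces the YOLO family of models, including the architecture of each version and the improvements made, to help learners understand and master how the main YOLO models developed, so that they can innovate further in their own application areas and achieve good results on their own tasks.☆113Updated 2 months ago
- Theory and practice of large model / LLM inference and deployment☆269Updated 2 months ago
- DPO training for Qwen (Tongyi Qianwen)☆48Updated 8 months ago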
- from MHA, MQA, GQA to MLA by 苏剑林, with code☆19Updated 3 months ago
- DeepSpeed Tutorial☆97Updated 9 months ago
- ☆132Updated 3 months ago
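- A beginner's introductory tutorial on model compression☆22Updated 11 months ago
- Tianchi NVIDIA TensorRT Hackathon 2023, Generative AI Model Optimization competition: third-place solution in the preliminary round☆49Updated last year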
- Deploying LLMs offline on the NVIDIA Jetson platform marks the dawn of a new era in embodied intelligence, where devices can function ind…☆96Updated last year
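- ggml study notes; ggml is an inference framework for machine learning☆15Updated last year
- Efficient deployment: YOLO X, V3, V4, V5, V6, V7, V8, EdgeYOLO TRT inference ™️, with pre- and post-processing implemented entirely as CUDA kernels, CPP/CUDA🚀☆49Updated 2 years ago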