Deep-Spark / DeepSparkInference
DeepSparkInference provides 48 selected inference model examples covering fields such as computer vision, natural language processing, and speech recognition. Later phases will gradually expand to more AI fields.
☆17Updated this week
Alternatives and similar repositories for DeepSparkInference:
Users who are interested in DeepSparkInference are comparing it to the libraries listed below:
- The DeepSpark open platform selects hundreds of open source application algorithms and models that are deeply coupled with industrial app…☆41Updated last week
- DeepSparkHub selects hundreds of application algorithms and models, covering various fields of AI and general-purpose computing, to suppo…☆63Updated last week
- ☆58Updated 4 months ago
- simplify >2GB large onnx model☆54Updated 4 months ago
- ☆71Updated 2 years ago
- Large Language Model Onnx Inference Framework☆32Updated 2 months ago
- ☆36Updated 5 months ago
- This repository contains the Open Source Software components of the Iluvatar Corex IxRT. It includes the sources for IxRT plugins and dep…☆14Updated last week
- ☆98Updated 3 years ago
- ☆79Updated 4 years ago
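- NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM☆41Updated last year
- ☢️ TensorRT 2023 second round: inference acceleration and optimization of the Llama model based on TensorRT-LLM☆46Updated last year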
- ☆11Updated last year
- Transformer related optimization, including BERT, GPT☆17Updated last year
- ☆139Updated 11 months ago
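- Triton Inference Server Model Config and Client Scripts☆32Updated 3 years ago
- autoTVM neural network inference code optimization search demo: compiles the open-source model centerface with TVM, uses autoTVM to search for the optimal inference code, and finally deploys it compiled as C++ code. The demo platform is CUDA, but it can be another platform, such as Raspberry Pi, Android phones, or iPhones. This is a demonstration of …☆27Updated 3 years ago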
- caffe model to onnx☆33Updated 2 years ago
- ☆24Updated last year
- ☆124Updated last year
- Whisper in TensorRT-LLM☆15Updated last year
- ☆12Updated 2 years ago
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models☆22Updated last year
- A simplified flash-attention implementation using cutlass, intended for teaching purposes☆38Updated 7 months ago
- ☆26Updated this week
- export llama to onnx☆118Updated 3 months ago
- ☆16Updated last year
- python wrapper of ncnn with pybind11☆72Updated 4 years ago
- Compare multiple optimization methods on triton to improve model service performance☆50Updated last year
- Built upon Megatron-Deepspeed and HuggingFace Trainer, EasyLLM has reorganized the code logic with a focus on usability. While enhancing …☆46Updated 6 months ago