Phoenix8215 / build_neural_network_from_scratch_CPP
Created a simple neural network using the C++17 standard and the Eigen library that supports both forward and backward propagation.
☆9 · Updated 10 months ago
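For orientation, here is a minimal, self-contained sketch of the kind of forward/backward pass such a network computes, assuming one hidden layer, sigmoid activations, and an MSE loss. It is not the repository's actual code; the layer sizes, toy data, and learning rate are all illustrative.

```cpp
// Minimal one-hidden-layer MLP sketch with Eigen (illustrative, not the repo's code):
// one forward pass, one backward pass (MSE loss, sigmoid activations), one SGD step.
#include <Eigen/Dense>
#include <iostream>

using Eigen::MatrixXd;
using Eigen::VectorXd;

// Element-wise sigmoid and its derivative expressed via the activation value a.
static VectorXd sigmoid(const VectorXd& z) {
    return (1.0 / (1.0 + (-z.array()).exp())).matrix();
}
static VectorXd sigmoid_grad(const VectorXd& a) {
    return (a.array() * (1.0 - a.array())).matrix();
}

int main() {
    const int in_dim = 2, hidden = 3, out_dim = 1;  // illustrative sizes
    const double lr = 0.1;                          // illustrative learning rate

    MatrixXd W1 = MatrixXd::Random(hidden, in_dim);
    VectorXd b1 = VectorXd::Zero(hidden);
    MatrixXd W2 = MatrixXd::Random(out_dim, hidden);
    VectorXd b2 = VectorXd::Zero(out_dim);

    // One toy training sample.
    VectorXd x(in_dim);  x << 1.0, 0.0;
    VectorXd y(out_dim); y << 1.0;

    // Forward pass.
    VectorXd a1 = sigmoid(W1 * x + b1);
    VectorXd a2 = sigmoid(W2 * a1 + b2);
    double loss = 0.5 * (a2 - y).squaredNorm();

    // Backward pass (chain rule through MSE and the sigmoids).
    VectorXd delta2 = (a2 - y).cwiseProduct(sigmoid_grad(a2));
    MatrixXd dW2 = delta2 * a1.transpose();
    VectorXd delta1 = (W2.transpose() * delta2).cwiseProduct(sigmoid_grad(a1));
    MatrixXd dW1 = delta1 * x.transpose();

    // Plain gradient-descent update.
    W2 -= lr * dW2;  b2 -= lr * delta2;
    W1 -= lr * dW1;  b1 -= lr * delta1;

    std::cout << "loss before update: " << loss << "\n";
    return 0;
}
```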
Alternatives and similar repositories for build_neural_network_from_scratch_CPP
Users who are interested in build_neural_network_from_scratch_CPP are comparing it to the repositories listed below.
- Learn TensorRT from scratch 🥰 ☆15 · Updated 8 months ago
- A repository for practicing multi-threaded programming in C++ ☆24 · Updated last year
- ☢️ TensorRT 2023 second-round competition: inference acceleration and optimization of the Llama model based on TensorRT-LLM ☆48 · Updated last year
- A unified and extensible pipeline for deep learning model inference with C++. Now supports yolov8, yolov9, clip, and nanosam. More models … ☆12 · Updated last year
- A llama model inference framework implemented in CUDA C++ ☆57 · Updated 6 months ago
- Awesome code, projects, books, etc. related to CUDA ☆17 · Updated last month
- A large collection of CUDA/TensorRT examples to learn from ☆134 · Updated 2 years ago
- Quick and Self-Contained TensorRT Custom Plugin Implementation and Integration ☆60 · Updated last week
- An ONNX-based quantization tool. ☆71 · Updated last year
- Deploys the Nanodet detection algorithm with the OpenVINO inference framework, rewriting the pre- and post-processing for very high performance and fast detection on Intel CPU platforms; the model is also quantized (PTQ) to int8 with the NNCF and PPQ tools for even faster inference. ☆15 · Updated last year
- Flash Attention in ~100 lines of CUDA (forward pass only) ☆10 · Updated 11 months ago
- TensorRT-in-Action is a GitHub repository providing code examples for using TensorRT, with corresponding Jupyter Notebooks. ☆16 · Updated 2 years ago
- A lightweight large-model (LLM) inference framework ☆18 · Updated last week
- AI infra for LLM inference: tensorrt-llm / vllm ☆20 · Updated 5 months ago
- C++ TensorRT Implementation of NanoSAM ☆37 · Updated last year
- Create your own LLM inference server from scratch ☆11 · Updated 6 months ago
- A large set of examples for learning ONNX ☆18 · Updated 8 months ago
- ☆24 · Updated last year
- Deep insight into TensorRT, including but not limited to QAT, PTQ, plugins, triton_inference, and CUDA ☆18 · Updated 3 weeks ago
- Code and notes for the six major CUDA parallel computing patterns ☆61 · Updated 4 years ago
- Tianchi NVIDIA TensorRT Hackathon 2023: third-place solution in the preliminary round of the generative AI model optimization track ☆49 · Updated last year
- For the 2022 Nvidia Hackathon ☆21 · Updated 2 years ago
- TensorRT encapsulation: learn, rewrite, practice. ☆28 · Updated 2 years ago
- Thoroughly understand BP (backpropagation) in 15 lines of code; a simple C++ implementation reaching 98.29% accuracy on MNIST classification ☆36 · Updated 3 years ago
- Llama3 Streaming Chat Sample ☆22 · Updated last year
- A C++ project template based on Visual Studio, OpenCV, and CUDA, with gdb debugging and a makefile ☆26 · Updated 3 years ago
- A collection of saved code snippets ☆13 · Updated 2 years ago
- The simplest online-softmax notebook for explaining Flash Attention (a short sketch of the recurrence follows this list) ☆10 · Updated 5 months ago
- A lightweight llama-like LLM inference framework based on Triton kernels ☆122 · Updated this week
- Decoding Attention is specially optimized for MHA, MQA, GQA, and MLA, using CUDA cores for the decoding stage of LLM inference. ☆36 · Updated 2 months ago
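The online-softmax item above refers to the streaming formulation used to explain Flash Attention: the softmax maximum and normalizer are maintained in a single pass, rescaling the running sum whenever the maximum grows. Below is a hedged C++ sketch of that recurrence, not taken from the linked notebook; a second loop materializes the probabilities only for demonstration.

```cpp
// Online (single-pass) softmax recurrence, as used to explain Flash Attention.
// Keep a running maximum m and a running normalizer d; whenever m grows, the
// accumulated d is rescaled by exp(m_old - m_new). Illustrative sketch only.
#include <algorithm>
#include <cmath>
#include <iostream>
#include <limits>
#include <vector>

std::vector<double> online_softmax(const std::vector<double>& x) {
    double m = -std::numeric_limits<double>::infinity();  // running max
    double d = 0.0;                                        // running sum of exp(x_i - m)
    for (double v : x) {
        double m_new = std::max(m, v);
        d = d * std::exp(m - m_new) + std::exp(v - m_new);  // rescale old sum, add new term
        m = m_new;
    }
    // A second loop only materializes the probabilities; the normalizer itself
    // was obtained in one pass, which is the point of the trick.
    std::vector<double> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) out[i] = std::exp(x[i] - m) / d;
    return out;
}

int main() {
    for (double p : online_softmax({1.0, 2.0, 3.0}))
        std::cout << p << ' ';          // approx. 0.090 0.245 0.665
    std::cout << '\n';
    return 0;
}
```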