dianhsu / transformer-cpp-cpuLinks

用C++实现一个简单的Transformer模型。 Attention Is All You Need。

☆52

Alternatives and similar repositories for transformer-cpp-cpu

Users that are interested in transformer-cpp-cpu are comparing it to the libraries listed below

Sorting:

dianhsu / swin-transformer-cpp
Swin Transformer C++ Implementation
☆64Updated 4 years ago
InfiniTensor / RefactorGraph
分层解耦的深度学习推理引擎
☆76Updated 9 months ago
JieRen98 / SGEMM-SASS-Annotation
☆21Updated 4 years ago
sunkx109 / llama.cpp
llama 2 Inference
☆43Updated 2 years ago
caiwanxianhust / FasterLLaMA
使用 CUDA C++ 实现的 llama 模型推理框架
☆62Updated last year
mrzhuzhe / riven
CPU Memory Compiler and Parallel programing
☆26Updated last year
zjhellofss / KuiperCourse
b站上的课程
☆78Updated 2 years ago
OpenPPL / ppl.kernel.cuda
☆38Updated last year
Syencil / Programming_Massively_Parallel_Processors
CUDA 6大并行计算模式代码与笔记
☆61Updated 5 years ago
CalvinXKY / BasicCUDA
A tutorial for CUDA&PyTorch
☆165Updated 10 months ago
HuPengsheet / EasyNN
EasyNN是一个面向教学而开发的神经网络推理框架，旨在让大家0基础也能自主完成推理框架编写！
☆33Updated last year
Cambricon / mlu-ops
Efficient operation implementation based on the Cambricon Machine Learning Unit (MLU) .
☆140Updated last week
openmlsys / openmlsys-cuda
Tutorials for writing high-performance GPU operators in AI frameworks.
☆134Updated 2 years ago
weishengying / cute_gemm
☆20Updated last year
AyakaGEMM / Hands-on-GEMM
☆143Updated last year
piDack / The-ans-for-Programming-Massively-Parallel-Processor
大规模并行处理器编程实战第二版答案
☆33Updated 3 years ago
OpenPPL / ppl.pmx
☆60Updated last year
weishengying / tiny-flash-attention
使用 cutlass 实现 flash-attention 精简版，具有教学意义
☆50Updated last year
OpenPPL / ppl.llm.kernel.cuda
☆152Updated 10 months ago
zpye / SimpleInfer
A simple neural network inference framework
☆25Updated 2 years ago
li199603 / sgemm_with_cuda
SGEMM optimization with cuda step by step
☆20Updated last year
Bruce-Lee-LY / memory_pool
Simple and efficient memory pool is implemented with C++11.
☆10Updated 3 years ago
njuhope / cuda_sgemm
☆116Updated last year
luchangli03 / export_llama_to_onnx
export llama to onnx
☆137Updated 10 months ago
OpenPPL / ppl.kernel.cpu
☆19Updated last year
prajna-lang / prajna
a simple general program language
☆99Updated 2 months ago
xxxxyu / FlexNN
Code for ACM MobiCom 2024 paper "FlexNN: Efficient and Adaptive DNN Inference on Memory-Constrained Edge Devices"
☆56Updated 10 months ago
StudyingLover / ggml-tutorial
☆34Updated last year
Bruce-Lee-LY / cuda_hgemv
Several optimization methods of half-precision general matrix vector multiplication (HGEMV) using CUDA core.
☆69Updated last year
BBuf / how-to-optimize-gemm
☆98Updated 4 years ago