qsyao / cudaBERTLinks

A Fast Muti-processing BERT-Inference System

☆101

Alternatives and similar repositories for cudaBERT

Users that are interested in cudaBERT are comparing it to the libraries listed below

Sorting:

Oneflow-Inc / OneFlow-Benchmark
OneFlow models for benchmarking.
☆104Updated 11 months ago
chncwang / InsNet
InsNet Runs Instance-dependent Neural Networks with Padding-free Dynamic Batching.
☆66Updated 3 years ago
Oneflow-Inc / DLPerf
DeepLearning Framework Performance Profiling Toolkit
☆285Updated 3 years ago
zhihu / cuBERT
Fast implementation of BERT inference directly on NVIDIA (CUDA, CUBLAS) and Intel MKL
☆544Updated 4 years ago
volcengine / veGiantModel
☆220Updated last year
zakheav / automatic-differentiation-framework
an automatic differentiation framework with dynamic graph/支持动态图的自动求导框架
☆100Updated 5 years ago
YellowOldOdd / SDBI
Simple Dynamic Batching Inference
☆145Updated 3 years ago
bytedance / effective_transformer
Running BERT without Padding
☆472Updated 3 years ago
tvmai / meetup-slides
Place for meetup slides
☆141Updated 4 years ago
Oneflow-Inc / oneflow-documentation
oneflow documentation
☆69Updated last year
alibaba / FastNN
FastNN provides distributed training examples that use EPL.
☆83Updated 3 years ago
keithyin / read-pytorch-source-code
pytorch源码阅读 0.2.0 版本
☆90Updated 5 years ago
dlsys-course / tinyflow
Tutorial code on how to build your own Deep Learning System in 2k Lines
☆125Updated 8 years ago
Tencent / WeChat-TFCC
☆127Updated 4 years ago
LieluoboAi / radish
C++ model train&inference framework
☆223Updated 5 years ago
alibaba / EasyParallelLibrary
Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.
☆267Updated 2 years ago
HadXu / Thunder
A small deep-learning framework with C++/Python/CUDA
☆54Updated 7 years ago
Yunlong-He / TensorFlowInternals
Analyze TensorFlow source code
☆19Updated 8 years ago
Oneflow-Inc / models
Models and examples built with OneFlow
☆98Updated 9 months ago
Angel-ML / PyTorch-On-Angel
PyTorch On Angel, arming PyTorch with a powerful Parameter Server, which enable PyTorch to train very big models.
☆168Updated last month
bojone / keras_recompute
saving memory by recomputing for keras
☆37Updated 5 years ago
Tencent / deepx_core
deepx_core是一个专注于张量计算/深度学习的基础库
☆377Updated 3 months ago
bytedance / LargeBatchCTR
Large batch training of CTR models based on DeepCTR with CowClip.
☆170Updated 2 years ago
Harry-Chen / InfMoE
Inference framework for MoE layers based on TensorRT with Python binding
☆41Updated 4 years ago
DeepRec-AI / HybridBackend
A high-performance framework for training wide-and-deep recommender systems on heterogeneous cluster
☆158Updated last year
triton-inference-server / hugectr_backend
☆55Updated last year
LeeJuly30 / BERTCpp
implement bert in pure c++
☆35Updated 5 years ago
dlsys-course / assignment2-2018
(Spring 2018) Assignment 2: Graph Executor with TVM
☆123Updated 7 years ago
halleywj / dlsys-solution
My solutions to the assignments of dlsys course (CSE599G1: Deep Learning System Spring 2017)
☆10Updated 8 years ago
void-main / FasterTransformer
Transformer related optimization, including BERT, GPT
☆59Updated last year