Code for testing native float16 matrix multiplication performance on Tesla P100 and V100 GPUs using cublasHgemm
☆35Aug 20, 2019Updated 6 years ago
Alternatives and similar repositories for cublasHgemm-P100
Users that are interested in cublasHgemm-P100 are comparing it to the libraries listed below.
- C++ CPU inference library for Tensorflow object detection models based on the lightweight Tensorflow C-API.☆15Jun 26, 2018Updated 7 years ago
- Sparse Boolean linear algebra for Nvidia Cuda, OpenCL and CPU computations☆16Aug 19, 2022Updated 3 years ago
- Inspired by the neural style algorithm in the computer vision field, we propose a high-level language model with the aim of adapting the …☆18Nov 20, 2022Updated 3 years ago
- outline and links for PLDI 2022 tutorial☆17Jun 13, 2022Updated 3 years ago
- TensorRT Int8 Python version sample.☆14Jan 28, 2019Updated 7 years ago
- An example of how to communicate with a service class through a Binder.☆10Aug 12, 2015Updated 10 years ago
- Scrapes novels published on the web and converts them to Aozora Bunko format text☆19Feb 9, 2017Updated 9 years ago
- A Top-Down Profiler for GPU Applications☆22Feb 29, 2024Updated 2 years ago
- TensorFlow implementation of "A Hybrid Convolutional Variational Autoencoder for Text Generation"☆17Sep 16, 2019Updated 6 years ago
- Kaldi Snapshot☆31Mar 13, 2013Updated 13 years ago
- This is an example of a boolean expression editor made in Dear ImGui☆15Dec 3, 2022Updated 3 years ago
- A copy of the DirectX Headers from MinGW-64.☆14Sep 7, 2023Updated 2 years ago
- Attempting to implement VXGI for NCCA Masterclass assignment☆15Mar 21, 2018Updated 8 years ago
- YOLOv3-training-prune☆58Mar 9, 2021Updated 5 years ago
- PlayStation1 MDEC compression tools☆11Dec 31, 2020Updated 5 years ago
- A GPU particle demo with fluid motion simulated through curl noise.☆15Jun 21, 2012Updated 13 years ago
- Provides a vendored libjxl.☆16Oct 13, 2022Updated 3 years ago
- Sparse Matrix-Matrix Multiplication Benchmark on Intel Xeon and Xeon Phi (KNC, KNL) from blog post:☆12Sep 25, 2016Updated 9 years ago
- Luthier, a GPU binary instrumentation tool for AMD GPUs☆28Updated this week
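- Precomputed Radiance Transfer (PRT)☆14Oct 7, 2017Updated 8 years ago
- Uses e-commerce shopping and payment as the business scenario of a distributed project, fully implementing a solution to the distributed transaction problem across distributed services☆17Apr 29, 2018Updated 7 years ago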
- Fast Lossless Color Image Compression Library☆10Jun 21, 2022Updated 3 years ago
- Automatic GPA lookup for University of Shanghai for Science and Technology students, logging in without the web CAPTCHA☆10Apr 21, 2015Updated 10 years ago
- A WebGL 2 waterfall simulation from a simple particles system.☆11Feb 22, 2018Updated 8 years ago
- Implementation of The One Hundred Layers Tiramisu for semantic segmentation in Keras☆10Oct 23, 2018Updated 7 years ago
- implementation of relationNet naive version☆12Dec 4, 2017Updated 8 years ago
- USD build script for aarch64 target☆11Oct 28, 2022Updated 3 years ago
- An implementation of our CVPR 2018 work 'Blazingly Fast Video Object Segmentation with Pixel-Wise Metric Learning'☆43Jul 12, 2019Updated 6 years ago
- RaNNC is an automatic parallelization middleware used to train very large-scale neural networks.☆57Oct 15, 2022Updated 3 years ago
- A C++ library for principal component analysis☆12Feb 23, 2020Updated 6 years ago
- Rust implementation of k-d tree to efficiently perform color quantization to predefined sets☆13Feb 14, 2018Updated 8 years ago
- Sample that converts the MobileSAM encoder/decoder to ONNX and runs inference☆12Apr 11, 2024Updated 2 years ago
- ☆15Aug 14, 2022Updated 3 years ago
- Character Level Generative Adversarial Network☆23Sep 13, 2016Updated 9 years ago
- Investigations into simplified holdem poker☆12Oct 17, 2012Updated 13 years ago
- High-performance cross-platform inference engine; Anakin runs on x86 CPU, ARM, NVIDIA GPU, AMD GPU, Bitmain, and Cambricon devices.☆537Sep 23, 2022Updated 3 years ago
- ☆13Nov 7, 2021Updated 4 years ago
- Artifact for 'Register Optimizations for Stencils on GPUs'☆10Sep 18, 2018Updated 7 years ago
- TensorFlow implementation of various GANs.☆11Mar 14, 2020Updated 6 years ago