This is a C++ implementation of an LSTM neural network parallelized for a GPU using CUDA.
☆25Oct 29, 2017Updated 8 years ago
Alternatives and similar repositories for lstm-cuda
Users who are interested in lstm-cuda are comparing it to the libraries listed below.
- GPU/CPU (CUDA) Implementation of "Recurrent Memory Array Structures", Simple RNN, LSTM, Array LSTM..☆26Feb 28, 2020Updated 6 years ago
- Machine Learning Tool with LSTM and MLP Neural Networks, with CUDA implementation.☆31Oct 12, 2017Updated 8 years ago
- Convolutional Neural Network of vgg19 model using Cuda to accelerate☆12Jun 11, 2018Updated 7 years ago
- Matrix Multiplication on GPU using Shared Memory considering Coalescing and Bank Conflicts☆26Aug 29, 2022Updated 3 years ago
- Drop-in library for tracking the memory allocations of CUDA applications☆14Nov 17, 2017Updated 8 years ago
- LonestarGPU: Irregular algorithms parallelized for GPUs☆38Nov 11, 2019Updated 6 years ago
- Next word prediction based on N-gram language model☆12Jan 11, 2015Updated 11 years ago
- ☆14Sep 2, 2012Updated 13 years ago
- Implementation of parallel Breadth First Algorithm for graph traversal using CUDA and C++ language.☆35Dec 12, 2019Updated 6 years ago
- Benchmarks of Deep Neural Networks☆39May 19, 2021Updated 5 years ago
- CUDA implementation of the fundamental sum reduce operation. Aims to be as optimized as reasonable.☆39Jul 19, 2017Updated 8 years ago
- Final Project for Parallel Computing at CMU (15-618/15-418)☆10May 13, 2016Updated 10 years ago
- LOUDS-trie implementation example (C++)☆15Nov 27, 2019Updated 6 years ago
- Python Tools for the POP Metrics☆13Feb 16, 2022Updated 4 years ago
- Deep neural network framework (C/C++/CUDA).☆32Aug 11, 2015Updated 10 years ago
- HCC Sample Applications☆13Jan 3, 2017Updated 9 years ago
- Perform the forced decoding with target transcription☆11Sep 12, 2018Updated 7 years ago
- Simple Arm assembly kernels for testing the performance and functionality of Arm CPUs.☆16Dec 3, 2023Updated 2 years ago
- Nsight Compute In Docker☆13Dec 21, 2023Updated 2 years ago
- Profiling with NVIDIA Nsight Tools Bootcamp☆23Feb 4, 2026Updated 3 months ago
- Dark channel Haze removal algorithm with CUDA acceleration (typically 10x or more speedup using a Nvidia GPU)☆14Dec 7, 2017Updated 8 years ago
- Materials for ECS 201A☆11Oct 23, 2019Updated 6 years ago
- A library for exporting models including NeMo and Hugging Face to optimized inference backends, and deploying them for efficient querying☆37May 21, 2026Updated last week
- Searching for a Strategy: Modelling Player Trajectories in Soccer Games using Social LSTM☆16Dec 20, 2017Updated 8 years ago
- [SIGGRAPH Asia 2025] The official implementation of the paper "DvD: Unleashing a Generative Paradigm for Document Dewarping via Coordinat…☆33Mar 10, 2026Updated 2 months ago
- GPU-powered stochastic MPC for drinking water networks☆16Sep 12, 2022Updated 3 years ago
- CUDA solutions for the lab assignments in the UIUC-ECE408 Applied Parallel Programming course.☆19Apr 18, 2023Updated 3 years ago
- study of cutlass☆22Nov 10, 2024Updated last year
- this is the release repository of superneurons☆54Feb 13, 2021Updated 5 years ago
- Source code for the software implementation of SeGraM proposed in our ISCA 2022 paper: Senol Cali et. al., "SeGraM: A Universal Hardware …☆12Nov 3, 2022Updated 3 years ago
- ☆74May 29, 2019Updated 7 years ago
- ☆11Nov 13, 2022Updated 3 years ago
- Style-NeRF2NeRF implementation.☆14Dec 26, 2024Updated last year
- cuDNN sample codes provided by Nvidia☆47Feb 18, 2019Updated 7 years ago
- Keras Docset for dash/Zeal☆10Oct 10, 2020Updated 5 years ago
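- Source code analysis of epoll☆16Feb 17, 2016Updated 10 years ago
- Research on the anonymization of medical data☆12Jul 20, 2015Updated 10 years ago
- Reproduction in Paddle of the paper ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information (ACL 2021)☆10Nov 15, 2021Updated 4 years ago
- E-commerce shopping-guide chatbot: natural language understanding (NLU), text error correction, ambiguous word disambiguation☆12May 5, 2020Updated 6 years ago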