pengzhao-intel / oneAPI_course
oneAPI - Data Parallel C++ course for students
☆42Updated 4 months ago
Alternatives and similar repositories for oneAPI_course:
Users who are interested in oneAPI_course are comparing it to the repositories listed below
- ☆25Updated 11 months ago
- ☆226Updated last month
- Summary of some awesome work for optimizing LLM inference☆63Updated this week
- HPC-Lab for the High Performance Computing course, Spring 2023, Tsinghua University. Introduction to High Performance Computing @ THU.☆20Updated last year
- REEF is a GPU-accelerated DNN inference serving system that enables instant kernel preemption and biased concurrent execution in GPU sche…☆92Updated 2 years ago
- Homework solutions for CMU 10-414/714 – Deep Learning Systems: Algorithms and Implementation☆43Updated 2 years ago
- This repository is established to store personal notes and annotated papers during daily research.☆112Updated last week
- performance engineering☆27Updated 8 months ago
- ☆33Updated 2 months ago
- A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores☆50Updated last year
- Stepwise optimizations of DGEMM on CPU, eventually outperforming Intel MKL, even under multithreading.☆133Updated 3 years ago
- ASPLOS'24: Optimal Kernel Orchestration for Tensor Programs with Korch☆32Updated 7 months ago
- some hpc project for learning☆20Updated 6 months ago
- The dataset and baseline code for ASC23 LLM inference optimization challenge.☆34Updated last year
- Performance Prediction Toolkit for GPUs☆35Updated 2 years ago
- ☆100Updated last week
- InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)☆112Updated 8 months ago
- ☆10Updated 2 years ago
- ☆105Updated 3 months ago
- Spack package repository maintained by Student Cluster Competition Team @ Sun Yat-sen University.☆16Updated 3 months ago
- ☆35Updated 4 months ago
- easy cuda code☆66Updated 2 months ago
- ☆34Updated 8 months ago
- Benchmark Framework for Buddy Projects☆53Updated 2 weeks ago
- Codes & examples for "CUDA - From Correctness to Performance"☆86Updated 4 months ago
- My Paper Reading Lists and Notes.☆19Updated 2 months ago
- Documentation for YatCPU☆49Updated last year
- ☆36Updated last year
- DISB is a new DNN inference serving benchmark with diverse workloads and models, as well as real-world traces.☆53Updated 6 months ago
- High performance Transformer implementation in C++.☆105Updated last month