csce585-mlsystems / project-athena
This is the course project for CSCE585: ML Systems. Students will build their machine learning systems on the provided infrastructure, Athena.
☆13 Updated 3 years ago
Related projects
Alternatives and complementary repositories for project-athena
- Artifacts for SOSP'19 paper Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions ☆21 Updated 2 years ago
- An Attention Superoptimizer ☆20 Updated 6 months ago
- Slides from 2021-12-15 talk, "TVM Developer Bootcamp – Writing Hardware Backends" ☆10 Updated 2 years ago
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo ☆19 Updated last year
- Collection of Papers and Trials on Deep Learning to aid EE design ☆38 Updated 4 years ago
- A minimalistic header only C++11 Neural Network library based on Eigen::Tensor ☆20 Updated 6 years ago
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches ☆13 Updated 5 years ago
- An external memory allocator example for PyTorch. ☆13 Updated 3 years ago
- ☆11 Updated 3 years ago
- ☆12 Updated 4 years ago
- An implementation of a BinaryConnect network for cifar10 ☆11 Updated 5 years ago
- An efficient concurrent graph processing system ☆46 Updated 3 years ago
- Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS ☆17 Updated 2 years ago
- AutoCAT: Reinforcement Learning for Automated Exploration of Cache-Timing Attacks ☆43 Updated last year
- ☆17 Updated 3 years ago
- General system research material (not limited to paper) reading notes. ☆20 Updated 3 years ago
- Development repository for integrating FlexFlow (A distributed deep learning framework that supports flexible parallelization strategies)… ☆28 Updated 3 years ago
- Multi-target compiler for Sum-Product Networks, based on MLIR and LLVM. ☆22 Updated this week
- Simulated Annealing to minimize the wirelength ☆8 Updated 7 years ago
- Some microbenchmarks and design docs before commencement ☆12 Updated 3 years ago
- Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation ☆26 Updated 5 years ago
- ☆16 Updated 4 years ago
- Hardware test for CPU, GPU, I/O, and memory bandwidth performance ☆25 Updated 6 years ago
- ☆22 Updated 3 years ago
- ☆29 Updated 2 years ago
- This is a repo which contains some details about how to use OpenCL backend (Xilinx/Intel). ☆24 Updated 5 years ago
- NeuroVectorizer is a framework that uses deep reinforcement learning (RL) to predict optimal vectorization compiler pragmas for for loops… ☆91 Updated last year
- Accelerator simulation framework using nn_dataflow traces and energy, etc. post-processing ☆7 Updated 5 years ago