baidu-research / catamount
Catamount is a compute graph analysis tool to load, construct, and modify deep learning models and to symbolically analyze their compute requirements.
☆13 Updated 3 years ago
Related projects
Alternatives and complementary repositories for catamount
- Generating Families of Practical Fast Matrix Multiplication Algorithms ☆12 Updated 7 years ago
- ☆10 Updated 2 years ago
- An ONNX backend using PlaidML ☆28 Updated 6 years ago
- A CUDA implementation of the Tsetlin Machine based on bitwise operators ☆26 Updated 5 years ago
- Python bindings for libNVVM ☆37 Updated 10 years ago
- ☆14 Updated 5 years ago
- An Architecture-level Fault Injection Tool for GPU Application Resilience Evaluations ☆16 Updated 4 years ago
- Input-aware cuBLAS/clBLAS implementation for better performance ☆17 Updated 2 years ago
- ☆15 Updated 6 years ago
- Automatic Differentiation for Tensor Algebras ☆28 Updated 6 years ago
- Alchemist: an Apache Spark<->MPI interface ☆26 Updated 6 years ago
- GPU Automatically Tuned Linear Algebra Software ☆28 Updated 9 years ago
- A visualization tool that shows a TensorFlow graph, like TensorBoard ☆45 Updated 3 years ago
- A platform for online learning that curtails data latency and saves you cost. ☆47 Updated 2 years ago
- nGraph™ Backend for ONNX ☆42 Updated last year
- Automatic differentiation for NumPy ☆42 Updated 11 years ago
- Code examples for CUDA and OpenACC ☆34 Updated 3 months ago
- Artifacts for SOSP'19 paper Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions ☆21 Updated 2 years ago
- Distributed Learning by Pair-Wise Averaging ☆53 Updated 7 years ago
- A collection of example workloads for Parallel JavaScript ☆26 Updated last year
- TTC: A high-performance Compiler for Tensor Transpositions ☆20 Updated 7 years ago
- Build-to-Order BLAS ☆11 Updated 5 years ago
- Library for exact linear algebra, a C++ template-library based originally on LinBox intended for F4-like implementations ☆16 Updated 11 years ago
- stage the upgrade of hcc-clang to clang ToT ☆11 Updated 4 years ago
- Python Binding to NVRTC ☆79 Updated last month
- Scientific library for high-precision computations and research ☆50 Updated 7 years ago
- NVIDIA Compute Unified Device Architecture Toolkit ☆14 Updated 2 months ago
- Fork of magma to include more BLAS ☆28 Updated 7 years ago
- Enable Polyhedral JIT compilation ☆9 Updated 6 years ago
- npcomp - An aspirational MLIR based numpy compiler ☆51 Updated 4 years ago