A deep learning intermediate representation for multi-platform compilation optimization
☆10 · Oct 28, 2024 · Updated last year
Alternatives and similar repositories for Unified-IR
Users that are interested in Unified-IR are comparing it to the libraries listed below.
- A torch compile backend for multi-targets ☆49 · Apr 2, 2026 · Updated 2 weeks ago
- FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of … ☆31 · Dec 21, 2024 · Updated last year
- STREAMer: Benchmarking remote volatile and non-volatile memory bandwidth ☆18 · Aug 21, 2023 · Updated 2 years ago
- Tilus is a tile-level kernel programming language with explicit control over shared memory and registers. ☆471 · Apr 11, 2026 · Updated last week
- TACOS: [T]opology-[A]ware [Co]llective Algorithm [S]ynthesizer for Distributed Machine Learning ☆33 · Jun 13, 2025 · Updated 10 months ago
- Torch Plasma Simulator ☆10 · Apr 5, 2026 · Updated last week
- libFastMesh - Optimized Finite Volume Computational Aeroacoustics (CAA) Code ☆13 · Mar 28, 2024 · Updated 2 years ago
- Exploring Machine Learning methods and workflows in a simplified weather model ☆19 · Jun 6, 2024 · Updated last year
- Optimize tensor program fast with Felix, a gradient descent autotuner. ☆33 · Mar 5, 2026 · Updated last month
- A simple pseudo-spectral solver for the Direct Numerical Simulation (DNS) of the 3D Taylor-Green Vortex in the Julia programming language ☆10 · Jun 6, 2022 · Updated 3 years ago
- Finding all the circuits of a directed graph with self-arcs and multiple-arcs by K.A. Hawick and H.A. James ☆19 · Apr 11, 2013 · Updated 13 years ago
- Fibertree emulator ☆17 · Nov 4, 2024 · Updated last year
- A GPU-accelerated differentiable fluid simulator written in JAX. ☆11 · Feb 1, 2021 · Updated 5 years ago
- [NeurIPS'25 Spotlight] Adaptive Attention Sparsity with Hierarchical Top-p Pruning ☆91 · Nov 29, 2025 · Updated 4 months ago
- benchmark for linux server ☆13 · Nov 6, 2016 · Updated 9 years ago
- SOTA Learning-augmented Systems ☆37 · May 21, 2022 · Updated 3 years ago
- a tensor computing compiler based tile programming for gpu, cpu or tpu ☆45 · Feb 2, 2026 · Updated 2 months ago
- A repository of Markedapp styles ☆16 · Dec 30, 2015 · Updated 10 years ago
- Python Script to Open SJTU Dormitory Smart Lock ☆10 · Sep 12, 2022 · Updated 3 years ago
- Shared Middle-Layer for Triton Compilation ☆330 · Dec 5, 2025 · Updated 4 months ago
- CXL remote offloading data movement aware compiler ☆73 · Mar 24, 2026 · Updated 3 weeks ago
- Memory footprint reduction for transformer models ☆11 · Jan 24, 2023 · Updated 3 years ago
- A curated list for Efficient Large Language Models ☆11 · Mar 25, 2024 · Updated 2 years ago
- Personal notes about Fortran programming language ☆13 · Jun 3, 2021 · Updated 4 years ago
- [ASP-DAC 2025] "NeuronQuant: Accurate and Efficient Post-Training Quantization for Spiking Neural Networks" Official Implementation ☆19 · Mar 6, 2025 · Updated last year
- CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark ☆34 · Apr 9, 2026 · Updated last week
- Tutorials of Extending and importing TVM with CMAKE Include dependency. ☆16 · Oct 11, 2024 · Updated last year
- A Compiler from "Mx* language" (A C++ & Java like language) to RV32I Assembly, with optimizations on LLVM IR. SJTU CS2966 Project. ☆13 · Feb 12, 2023 · Updated 3 years ago
- Userspace eBPF Runtime Benchmarking Test Suite and Results ☆16 · Updated this week
- Compare intel and kunpeng cpus, helping people know about Kunpeng, the ARM64 chip. ☆14 · Nov 23, 2020 · Updated 5 years ago
- ☆12 · Sep 4, 2021 · Updated 4 years ago
- ☆15 · Jan 7, 2022 · Updated 4 years ago
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling ☆22 · Apr 9, 2026 · Updated last week
- ☆29 · Aug 4, 2025 · Updated 8 months ago
- Development repository for the Triton-Linalg conversion ☆218 · Feb 7, 2025 · Updated last year
- (elastic) cuckoo hashing ☆16 · Jun 20, 2020 · Updated 5 years ago
- A unified programming framework for high and portable performance across FPGAs and GPUs ☆11 · Mar 23, 2025 · Updated last year
- Fast and memory-efficient exact attention ☆20 · Apr 10, 2026 · Updated last week
- ☆10 · Dec 8, 2021 · Updated 4 years ago