FPSG-UIUC / micro24-fusemax-artifact
MICRO 2024 Evaluation Artifact for FuseMax
☆16Aug 26, 2024Updated last year
Alternatives and similar repositories for micro24-fusemax-artifact
Users that are interested in micro24-fusemax-artifact are comparing it to the libraries listed below
- ☆16Mar 8, 2025Updated 11 months ago
- MICRO 2023 Evaluation Artifact for TeAAL☆10Oct 26, 2023Updated 2 years ago
- Open source RTL implementation of Tensor Core, Sparse Tensor Core, BitWave and SparSynergy in the article: "SparSynergy: Unlocking Flexib…☆22Mar 29, 2025Updated 10 months ago
- ☆13May 8, 2025Updated 9 months ago
- Artifact for "DX100: A Programmable Data Access Accelerator for Indirection (ISCA 2025)" paper☆16Nov 6, 2025Updated 3 months ago
- ☆17Oct 7, 2025Updated 4 months ago
- CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark☆34Jun 24, 2025Updated 7 months ago
- ☆17Mar 26, 2025Updated 10 months ago
- Implementation of Pythia: A Customizable Hardware Prefetching Framework Using Online Reinforcement Learning in Chisel HDL. To know more, …☆17Oct 9, 2021Updated 4 years ago
- Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization (ISCA'24)☆25Jul 4, 2024Updated last year
- ☆19Jan 2, 2026Updated last month
- [FPL'24] This repository contains the source code for the paper “Revealing Untapped DSP Optimization Potentials for FPGA-based Systolic M…☆21May 6, 2024Updated last year
- A GPU implementation of bucket-based farthest point sampling; achieves a 3-4x speedup over the conventional implementation☆21Aug 16, 2023Updated 2 years ago
- StateMover is a checkpoint-based debugging framework for FPGAs.☆21Jul 14, 2022Updated 3 years ago
- All the tools you need to reproduce the CellIFT paper experiments☆23Feb 11, 2025Updated last year
- This is the open-source version of TinyTS. The code is dirty so far. We may clean the code in the future.☆19Aug 11, 2025Updated 6 months ago
- FireQ: Fast INT4-FP8 Kernel and RoPE-aware Quantization for LLM Inference Acceleration☆20Jun 27, 2025Updated 7 months ago
- A CPU implementation of bucket-based farthest point sampling; achieves a 7-81x speedup over the conventional implementation☆26Sep 17, 2023Updated 2 years ago
- LLM Inference with Microscaling Format☆34Nov 12, 2024Updated last year
- mNPUsim: A Cycle-accurate Multi-core NPU Simulator (IISWC 2023)☆71Dec 29, 2025Updated last month
- [ICML 2024] Sparse Model Inversion: Efficient Inversion of Vision Transformers with Less Hallucination☆13Apr 29, 2025Updated 9 months ago
- PyTorchSim is a Comprehensive, Fast, and Accurate NPU Simulation Framework☆90Updated this week
- HyFiSS: A Hybrid Fidelity Stall-Aware Simulator for GPGPUs☆39Dec 9, 2024Updated last year
- Luthier, a GPU binary instrumentation tool for AMD GPUs☆26Updated this week
- ☆140Jul 19, 2025Updated 6 months ago
- H2-LLM: Hardware-Dataflow Co-Exploration for Heterogeneous Hybrid-Bonding-based Low-Batch LLM Inference☆87Apr 26, 2025Updated 9 months ago
- [ICLR 2026] FastCar☆16May 22, 2025Updated 8 months ago
- A parser for PTX 6.5☆13Jun 19, 2023Updated 2 years ago
- ☆10Apr 24, 2024Updated last year
- Error-free transformations are used to get results with extra accuracy.☆15Jan 20, 2025Updated last year
- [CVPR 2025 Highlight] FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation☆25Jun 16, 2025Updated 8 months ago
- Accelerator Zoo☆20Oct 14, 2025Updated 4 months ago
- ☆78Aug 29, 2025Updated 5 months ago
- Boosted E-Graph Extraction with Adaptive Heuristics and Exact Solving☆25Jan 7, 2026Updated last month
- Source code of our TNNLS paper "Boosting Convolutional Neural Networks with Middle Spectrum Grouped Convolution"☆12Apr 14, 2023Updated 2 years ago
- fork of file_parda from bitbucket☆11Jun 27, 2015Updated 10 years ago
- ☆18Jan 30, 2026Updated 2 weeks ago
- ☆17Dec 16, 2025Updated last month
- Generative Models for Low Rank Video Representation and Reconstruction☆10May 20, 2019Updated 6 years ago