fuvty / DeSCo
The official implementation of the WSDM'24 paper <DeSCo: Towards Generalizable and Scalable Deep Subgraph Counting>
☆15 Updated 6 months ago
Related projects:
- The official code for the DATE'23 paper <CLAP: Locality Aware and Parallel Triangle Counting with Content Addressable Memory> ☆20 Updated last month
- PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity ☆95 Updated last month
- ☆96 Updated 3 years ago
- Code Repository of Evaluating Quantized Large Language Models ☆89 Updated last week
- SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models ☆16 Updated last month
- Repository for artifact evaluation of ASPLOS 2023 paper "SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning" ☆23 Updated last year
- [MLSys'22] Understanding GNN Computational Graph: A Coordinated Computation, IO, and Memory Perspective ☆17 Updated last year
- Graphiler is a compiler stack built on top of DGL and TorchScript which compiles GNNs defined using user-defined functions (UDFs) into ef… ☆58 Updated last year
- [MLSys 2022] "BNS-GCN: Efficient Full-Graph Training of Graph Convolutional Networks with Partition-Parallelism and Random Boundary Node … ☆51 Updated 11 months ago
- ☆15 Updated last year
- Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models ☆27 Updated last week
- [HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning ☆64 Updated 3 weeks ago
- [ICLR 2022] "PipeGCN: Efficient Full-Graph Training of Graph Convolutional Networks with Pipelined Feature Communication" by Cheng Wan, Y… ☆27 Updated last year
- ☆16 Updated 2 years ago
- The official PyTorch implementation of the NeurIPS2022 (spotlight) paper, Outlier Suppression: Pushing the Limit of Low-bit Transformer L… ☆46 Updated last year
- Official implementation for the paper "Understanding Hyperdimensional Computing for Parallel Single-Pass Learning" ☆14 Updated last year
- ICLR 2021 ☆42 Updated 3 years ago
- Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Mult… ☆34 Updated 6 months ago
- ☆39 Updated last year
- Artifact for USENIX ATC'23: TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs. ☆44 Updated 11 months ago
- 16-fold memory access reduction with nearly no loss ☆35 Updated last month
- PyTorch implementation of our paper accepted by ICML 2024 -- CaM: Cache Merging for Memory-efficient LLMs Inference ☆21 Updated 3 months ago
- Artifact for PPoPP'20 "Understanding and Bridging the Gaps in Current GNN Performance Optimizations" ☆39 Updated 2 years ago
- [HPCA 2022] GCoD: Graph Convolutional Network Acceleration via Dedicated Algorithm and Accelerator Co-Design ☆32 Updated 2 years ago
- ☆127 Updated last month
- ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation ☆23 Updated 3 weeks ago
- ☆71 Updated 3 years ago
- ☆75 Updated 10 months ago
- The official implementation of the DAC 2024 paper GQA-LUT ☆10 Updated last week
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs ☆72 Updated last month