JunHao-Zhu / FusionQuery
[VLDB 2024] Source code for FusionQuery: On-demand Fusion Queries over Multi-source Heterogeneous Data
☆ 10 · Updated 8 months ago
Related projects
Alternatives and complementary repositories for FusionQuery
- PyTorch implementation of our paper "SUBP: Soft Uniform Block Pruning for 1xN Sparse CNNs Multithreading Acceleration", accepted by NeurIPS … ☆ 22 · Updated 8 months ago
- ☆ 18 · Updated 2 years ago
- Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model… ☆ 53 · Updated 8 months ago
- [ICLR 2024] Dynamic Neural Response Tuning ☆ 15 · Updated 2 weeks ago
- Official implementation of our accepted IEEE TPAMI paper "Diverse Sample Generation: Pushing the Limit of Data-free Qu… ☆ 14 · Updated last year
- SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models ☆ 24 · Updated 3 months ago
- ☆ 30 · Updated 9 months ago
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs ☆ 83 · Updated 3 months ago
- ☆ 23 · Updated last year
- A repository of Binary General Matrix Multiply (BGEMM) via customized CUDA kernels. Thanks to FP6-LLM for the groundwork! ☆ 13 · Updated 2 months ago
- ☆ 23 · Updated 4 months ago
- Repository for artifact evaluation of the ASPLOS 2023 paper "SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning" ☆ 23 · Updated last year
- ☆ 42 · Updated 6 months ago
- Stateful LLM Serving ☆ 38 · Updated 3 months ago
- FGNN's artifact evaluation (EuroSys 2022) ☆ 17 · Updated 2 years ago
- [ICLR 2022] "PipeGCN: Efficient Full-Graph Training of Graph Convolutional Networks with Pipelined Feature Communication" by Cheng Wan, Y… ☆ 31 · Updated last year
- ☆ 46 · Updated 5 months ago
- Official repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024) ☆ 38 · Updated this week
- Official implementation of "Genie: Show Me the Data for Quantization" (CVPR 2023) ☆ 17 · Updated last year
- [IJCAI 2023] An automated parallel training system that combines the advantages of both data and model parallelism. If you have any inte… ☆ 51 · Updated last year
- ☆ 12 · Updated 2 years ago
- MagicPIG: LSH Sampling for Efficient LLM Generation ☆ 59 · Updated 3 weeks ago
- Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity ☆ 181 · Updated last year
- Accelerating recommender model training by leveraging popular choices (VLDB 2022) ☆ 29 · Updated 2 months ago
- Code for the ICML 2022 paper "SPDY: Accurate Pruning with Speedup Guarantees" ☆ 18 · Updated last year
- FastFlow: a system that automatically detects CPU bottlenecks in deep learning training pipelines and resolves them with dat… ☆ 25 · Updated last year
- Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction | A tiny model can tell you the verbosity of an LLM (… ☆ 22 · Updated 5 months ago
- ☆ 10 · Updated last year
- A Skew-Resistant Index for Processing-in-Memory ☆ 24 · Updated last month
- LLM Serving Performance Evaluation Harness ☆ 56 · Updated 2 months ago