Manually implemented quantization-aware training
☆23Oct 12, 2022Updated 3 years ago
Alternatives and similar repositories for qat
Users who are interested in qat are comparing it to the libraries listed below
- Solution from team 新手门槛 (third place) in the 2nd Alibaba Cloud Database Competition☆10Apr 19, 2021Updated 4 years ago
- Code for the paper "Faster Neural Network Training with Approximate Tensor Operations"☆10Oct 23, 2021Updated 4 years ago
- GPTQ inference TVM kernel☆40Apr 25, 2024Updated last year
- Slides on how to give a technical talk☆22Feb 8, 2021Updated 5 years ago
- FPGA-based HyperLogLog Accelerator☆12Jul 13, 2020Updated 5 years ago
- ☆11Apr 3, 2023Updated 2 years ago
- ☆11Jun 29, 2021Updated 4 years ago
- A minimal demo of PyTorch distributed extension functionality for collectives.☆15Jul 29, 2024Updated last year
- CUDA project for uni subject☆26Oct 26, 2020Updated 5 years ago
- Audio samples of our paper "PitchNet: Unsupervised Singing Voice Conversion with Pitch Adversarial Network" (accepted by ICASSP2020).☆11Apr 14, 2020Updated 5 years ago
- A Learnable LSH Framework for Efficient NN Training☆34Jul 22, 2021Updated 4 years ago
- Learning Accurate Decision Trees with Bandit Feedback via Quantized Gradient Descent☆17Sep 8, 2022Updated 3 years ago
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches☆15Jun 21, 2019Updated 6 years ago
- TiledLower is a Dataflow Analysis and Codegen Framework written in Rust.☆14Nov 23, 2024Updated last year
- This is an official GitHub repository for the paper "Towards timeout-less transport in commodity datacenter networks."☆16Oct 12, 2021Updated 4 years ago
- ☆17Jul 24, 2023Updated 2 years ago
- SmartNIC☆14Dec 13, 2018Updated 7 years ago
- ☆18Oct 15, 2020Updated 5 years ago
- Johnson-Lindenstrauss transform (JLT), random projections (RP), fast Johnson-Lindenstrauss transform (FJLT), and randomized Hadamard tran…☆23Jul 11, 2023Updated 2 years ago
- PyTorch ImageNet training code with various tricks, LR schedulers, distributed training, mixed-precision training, DALI dataloader, etc.☆18Aug 12, 2020Updated 5 years ago
- Manages vllm-nccl dependency☆17Jun 3, 2024Updated last year
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo☆17Mar 13, 2023Updated 2 years ago
- ☆20Sep 28, 2024Updated last year
- ☆20Jun 3, 2023Updated 2 years ago
- Beyond KV Caching: Shared Attention for Efficient LLMs☆20Jul 19, 2024Updated last year
- Network Traffic Transformer to learn network dynamics from packet traces. Learn fundamental dynamics with pre-training and fine-tune to m…☆23Jan 17, 2024Updated 2 years ago
- Product Quantization k-Nearest Neighbors☆21Jun 24, 2021Updated 4 years ago
- ☆24May 6, 2022Updated 3 years ago
- To deploy Transformer models in CV to mobile devices.☆18Jan 20, 2022Updated 4 years ago
- An FPGA integration and acceleration of the popular FAISS framework for approximate similarity search☆25Jul 20, 2019Updated 6 years ago
- benchmarking some transformer deployments☆26Dec 15, 2025Updated 2 months ago
- An out-of-the-box PyTorch scaffold for neural network quantization-aware training (QAT) research. Website: https://github.com/zhutmost/neuralz…☆25Dec 20, 2022Updated 3 years ago
- ☆27Mar 2, 2023Updated 3 years ago
- A toy Python DL training library with PyTorch like API☆38Sep 23, 2025Updated 5 months ago
- 2021 Guangdong tile detection competition solution code☆57Aug 3, 2021Updated 4 years ago
- Fast matrix multiplication for few-bit integer matrices on CPUs.☆28Mar 19, 2019Updated 6 years ago
- A demo of how to write a high-performance convolution that runs on Apple silicon☆57Feb 8, 2022Updated 4 years ago
- [ICCV 2021] Code release for "Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks"☆32Jul 24, 2022Updated 3 years ago
- Artifact evaluation repo for EuroSys'24.☆29Nov 7, 2023Updated 2 years ago