NVlabs / EfficientDL
☆33 · Updated 3 months ago
Alternatives and similar repositories for EfficientDL
Users interested in EfficientDL are comparing it to the libraries listed below.
- DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos ☆59 · Updated 2 years ago
- Dynamic Neural Architecture Search Toolkit ☆30 · Updated 9 months ago
- IntLLaMA: A fast and light quantization solution for LLaMA ☆18 · Updated 2 years ago
- [NeurIPS 2022 Spotlight] Official PyTorch implementation of "EcoFormer: Energy-Saving Attention with Linear Complexity" ☆74 · Updated 2 years ago
- Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy" ☆101 · Updated last year
- ☆15 · Updated 4 months ago
- A PyTorch implementation of the paper "ViP: A Differentially Private Foundation Model for Computer Vision" ☆36 · Updated 2 years ago
- An object detection codebase based on MegEngine ☆28 · Updated 2 years ago
- ViT trained on the COYO-Labeled-300M dataset ☆32 · Updated 2 years ago
- Patch convolution to avoid large GPU memory usage of Conv2D ☆92 · Updated 7 months ago
- [ICLR 2022] "As-ViT: Auto-scaling Vision Transformers without Training" by Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wa… ☆76 · Updated 3 years ago
- [ICML 2022] "DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks", by Yonggan … ☆72 · Updated 3 years ago
- Page for the CVPR 2023 tutorial "Efficient Neural Networks: From Algorithm Design to Practical Mobile Deployments" ☆12 · Updated 2 years ago
- Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in PyTorch ☆103 · Updated last year
- Context manager to profile the forward and backward times of PyTorch's nn.Module ☆83 · Updated last year
- Implementation of "IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs" (ICLR 2024) ☆25 · Updated 2 months ago
- A block-oriented training approach for inference-time optimization ☆34 · Updated last year
- ViT inference in Triton because, why not? ☆31 · Updated last year
- Timm model explorer ☆41 · Updated last year
- Quantize transformers to any learned arbitrary 4-bit numeric format ☆45 · Updated 2 months ago
- Official PyTorch implementation of "LilNetX: Lightweight Networks with EXtreme Model Compression and Structured Sparsification" ☆47 · Updated 3 years ago
- CUDA and Triton implementations of Flash Attention with SoftmaxN ☆73 · Updated last year
- ☆22 · Updated 2 years ago
- ☆32 · Updated last year
- Demo of the unit_scaling library, showing how a model can easily be adapted to train in FP8 ☆46 · Updated last year
- Training the LLaMA language model with MMEngine, with support for LoRA fine-tuning ☆41 · Updated 2 years ago
- Experimental scripts for researching data-adaptive learning-rate scheduling ☆22 · Updated last year
- CUDA implementation of autoregressive linear attention, incorporating the latest research findings ☆44 · Updated 2 years ago
- A curated list of recent papers on efficient video attention for video diffusion models, including sparsification, quantization, and cach… ☆37 · Updated last week
- [ECCV 2022] "SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning" ☆21 · Updated 3 years ago