subramen / minGPT-ddp
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
☆19 · Updated 2 years ago
Alternatives and similar repositories for minGPT-ddp:
Users who are interested in minGPT-ddp are comparing it to the libraries listed below.
- Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy" ☆100 · Updated 4 months ago
- ImageNet-12k subset of ImageNet-21k (fall11) ☆21 · Updated last year
- ☆52 · Updated last week
- Recent Advances on Efficient Vision Transformers ☆49 · Updated 2 years ago
- [ICLR 2022] "As-ViT: Auto-scaling Vision Transformers without Training" by Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wa… ☆76 · Updated 2 years ago
- ☆31 · Updated 7 months ago
- Megatron's multi-modal data loader ☆157 · Updated this week
- VIT inference in triton because, why not? ☆22 · Updated 7 months ago
- Implementation of Infini-Transformer in Pytorch ☆107 · Updated 2 weeks ago
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod… ☆22 · Updated 10 months ago
- ☆171 · Updated 3 months ago
- Transformers w/o Attention, based fully on MLPs ☆91 · Updated 9 months ago
- Training LLaMA language model with MMEngine! It supports LoRA fine-tuning! ☆40 · Updated last year
- ☆49 · Updated last year
- (CVPR 2022) Automated Progressive Learning for Efficient Training of Vision Transformers ☆25 · Updated 2 years ago
- [Preprint] Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Prunin… ☆40 · Updated last year
- [NeurIPS 2022 Spotlight] This is the official PyTorch implementation of "EcoFormer: Energy-Saving Attention with Linear Complexity" ☆69 · Updated 2 years ago
- [CVPR'23] SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer ☆61 · Updated 8 months ago
- DeltaCNN End-to-End CNN Inference of Sparse Frame Differences in Videos ☆60 · Updated last year
- Supercharge Your PyTorch Image Models: Bag of Tricks to 8x Faster Inference with ONNX Runtime & Optimizations ☆22 · Updated 3 months ago
- FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation ☆46 · Updated 6 months ago
- PyTorch implementation of "From Sparse to Soft Mixtures of Experts" ☆50 · Updated last year
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability. ☆89 · Updated last month
- [ECCV 2022] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning ☆19 · Updated 2 years ago
- ☆25 · Updated last year
- ☆29 · Updated last year
- Implementation of IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs (ICLR 2024). ☆22 · Updated 7 months ago
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod… ☆35 · Updated 10 months ago
- Official implementation of "Active Image Indexing" ☆59 · Updated last year
- Patch convolution to avoid large GPU memory usage of Conv2D ☆82 · Updated 7 months ago