cmeraki / vit.triton
ViT inference in Triton, because why not?
☆16 · Updated 3 months ago
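For flavor, here is a minimal sketch of the kind of kernel a ViT-in-Triton project typically contains: a fused tanh-approximation GELU for the transformer's MLP blocks. This is an illustrative sketch under assumed names (`gelu_kernel`, `gelu`), not code from this repo.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def gelu_kernel(x_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one contiguous BLOCK_SIZE chunk.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    # tanh-approximation GELU: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3))),
    # rewritten via tanh(u) = 2 * sigmoid(2u) - 1 so only tl.sigmoid is needed.
    c = 0.7978845608028654  # sqrt(2 / pi)
    u = c * (x + 0.044715 * x * x * x)
    y = x * tl.sigmoid(2.0 * u)
    tl.store(out_ptr + offsets, y, mask=mask)

def gelu(x: torch.Tensor) -> torch.Tensor:
    # Flatten-and-launch wrapper; a 1D grid covering every element.
    x = x.contiguous()
    out = torch.empty_like(x)
    n = x.numel()
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    gelu_kernel[grid](x, out, n, BLOCK_SIZE=1024)
    return out
```

The sigmoid identity keeps the kernel to an intrinsic (`tl.sigmoid`) that exists across Triton versions, rather than relying on a tanh primitive.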
Related projects:
- Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch · ☆233 · Updated 4 months ago
- [ICLR 2022] "As-ViT: Auto-scaling Vision Transformers without Training" by Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wa… · ☆76 · Updated 2 years ago
- Patch convolution to avoid large GPU memory usage of Conv2D · ☆73 · Updated 3 months ago
- Object Recognition as Next Token Prediction (CVPR 2024) · ☆153 · Updated 2 months ago
- Implementation of Infini-Transformer in PyTorch · ☆100 · Updated last month
- A simple minimal implementation of Reversible Vision Transformers · ☆114 · Updated 6 months ago
- [ECCV 2024] Official PyTorch implementation of RoPE-ViT, "Rotary Position Embedding for Vision Transformer" · ☆157 · Updated last month
- Megatron's multi-modal data loader · ☆42 · Updated this week
- A library for unit scaling in PyTorch · ☆94 · Updated 2 weeks ago
- Official implementation of the Law of Vision Representation in MLLMs · ☆93 · Updated last week
- Code for NOLA, an implementation of "NOLA: Compressing LoRA using Linear Combination of Random Basis" · ☆46 · Updated 3 weeks ago
- Just some miscellaneous utility functions / decorators / modules related to PyTorch and Accelerate to help speed up implementation of new… · ☆115 · Updated last month
- Code and models for the paper "The effectiveness of MAE pre-pretraining for billion-scale pretraining" (https://arxiv.org/abs/2303.13496) · ☆75 · Updated last month
- [NeurIPS 2022 Spotlight] Official PyTorch implementation of "EcoFormer: Energy-Saving Attention with Linear Complexity" · ☆66 · Updated last year
- 94% on CIFAR-10 in 3.09 seconds 💨 96% in 27 seconds · ☆127 · Updated last month
- Code for the paper "CiT: Curation in Training for Effective Vision-Language Data" · ☆78 · Updated last year
- A compilation of network architectures for vision and other tasks that avoid the self-attention mechanism · ☆77 · Updated last year
- An unofficial implementation of "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" · ☆33 · Updated 3 months ago
- Code accompanying the paper "Massive Activations in Large Language Models" · ☆104 · Updated 6 months ago
- Implementation of the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" · ☆56 · Updated this week
- Code release for Deep Incubation (https://arxiv.org/abs/2212.04129) · ☆90 · Updated last year
- Official code for our CVPR'22 paper "Vision Transformer Slimming: Multi-Dimension Searching in Continuous Optimization Space" · ☆243 · Updated 11 months ago
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts · ☆101 · Updated last year
- Timm model explorer · ☆36 · Updated 5 months ago
- VLM Evaluation: benchmark for VLMs, spanning text generation tasks from VQA to captioning · ☆77 · Updated last week
- Explorations into the recently proposed Taylor Series Linear Attention · ☆85 · Updated last month