luogen1996/LWTransformer

Lightweight Transformer for Multi-modal Tasks

☆16

Alternatives and similar repositories for LWTransformer

Users that are interested in LWTransformer are comparing it to the libraries listed below.

yuxiaodongHRI / SOIT
SOIT: Segmenting Objects with Instance-Aware Transformers
☆14 · Jun 6, 2022 · Updated 4 years ago
YouHuang67 / mamba-code-explained
☆19 · Jan 7, 2026 · Updated 6 months ago
UCDvision / sima
Official implementation for "SimA: Simple Softmax-free Attention for Vision Transformers"
☆48 · Apr 18, 2024 · Updated 2 years ago
luo3300612 / Semantics-AssistedVideoCaptioning.pytorch
PyTorch implementation of Semantics-AssistedVideoCaptioning
☆11 · Feb 16, 2023 · Updated 3 years ago
luo3300612 / Transformer-Captioning
Optimized code based on M2 for faster image captioning training
☆21 · Nov 18, 2022 · Updated 3 years ago
lucasjinreal / wnnx_models
Various test models in WNNX format. They can be viewed with `pip install wnetron && wnetron`
☆12 · Jun 22, 2022 · Updated 4 years ago
LgQu / TIGeR
Code for paper: Unified Text-to-Image Generation and Retrieval
☆16 · Updated this week
szx503045266 / ASF-former
Adaptive Split-Fusion Transformer (ICME 2023 Oral)
☆19 · Feb 19, 2024 · Updated 2 years ago
LeeYN-43 / Clover
Official PyTorch implementation of Clover: Towards A Unified Video-Language Alignment and Fusion Model (CVPR2023)
☆40 · Feb 15, 2023 · Updated 3 years ago
yilunliao / vit-search
Code for "Searching for Efficient Multi-Stage Vision Transformers"
☆63 · Sep 1, 2021 · Updated 4 years ago
jxr326 / SwinMCNet
☆16 · Jul 20, 2022 · Updated 4 years ago
qhfan / FAT
[NeurIPS2023] Lightweight Vision Transformer with Bidirectional Interaction
☆27 · Oct 27, 2023 · Updated 2 years ago
lufanma / IFR
Implementation of the paper "Implicit Feature Refinement for Instance Segmentation".
☆20 · Oct 27, 2021 · Updated 4 years ago
liuzywen / HRTransNet
☆21 · Feb 3, 2025 · Updated last year
qumengxue / RIO
☆13 · Oct 30, 2023 · Updated 2 years ago
jin-s13 / MMPD-Dataset
MMPD Dataset from ECCV'2024 "When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset"
☆22 · Jul 15, 2024 · Updated 2 years ago
ubc-vision / RefTR
Official implementation for the paper "Referring Transformer: A One-step Approach to Multi-task Visual Grounding" (NeurIPS 2021)
☆67 · May 26, 2022 · Updated 4 years ago
JackWhite-rwx / SceneGraphGenZeroShotWithGSAM
Scene Graph Generate Zero Shot
☆23 · Apr 16, 2023 · Updated 3 years ago
WeihuangLin / INF-LLaVA
INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model
☆42 · Aug 4, 2024 · Updated last year
duranze / Automatic-spectral-calibration-of-HSI
☆22 · Jun 27, 2025 · Updated last year
YouHuang67 / High-Resolution-Segment-Anything
☆34 · Jul 4, 2024 · Updated 2 years ago
YijinHuang / FPT
[TNNLS'25] [MICCAI'24] A Parameter and Memory Efficient Transfer Learning Method
☆35 · Oct 29, 2025 · Updated 8 months ago
The-AI-Summer / simclr
An educational step-by-step implementation of SimCLR that accompanies the blogpost
☆31 · Mar 31, 2022 · Updated 4 years ago
xmu-xiaoma666 / SDATR
Official Code for "Knowing what it is: Semantic-enhanced Dual Attention Transformer" (TMM2022)
☆19 · Oct 15, 2022 · Updated 3 years ago
cocoshe / I2EBench
[NeurIPS'24] I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing
☆35 · Dec 9, 2025 · Updated 7 months ago
irishev / DSP
PyTorch implementation of "Dynamic Structure Pruning for Compressing CNNs" (AAAI 2023 Oral)
☆28 · Jan 15, 2024 · Updated 2 years ago
lingeringlight / SETA
The official implementation for SETA (TIP 2024).
☆12 · Feb 17, 2025 · Updated last year
xuchenhao001 / HSViT
HSViT: Horizontally Scalable Vision Transformer
☆13 · Nov 6, 2024 · Updated last year
bigD233 / AMFD
☆30 · Mar 27, 2025 · Updated last year
KangOxford / Fourier-Transformer
Transformer and Neural Operator for solving Stochastic PDE
☆12 · May 22, 2022 · Updated 4 years ago
daicoolb / Awesome-Video-Captioning
Video captioning
☆24 · Mar 14, 2019 · Updated 7 years ago
zlccccc / 3DVL_Codebase
[CVPR2022 Oral] 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds
☆57 · Jan 29, 2023 · Updated 3 years ago
evanmiltenburg / MeasureDiversity
Measure the diversity of image descriptions. Repository for our COLING 2018 paper.
☆13 · Dec 29, 2019 · Updated 6 years ago
GeWu-Lab / InfoReg_CVPR2025
This is the repo for "Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition", CVPR2025.
☆24 · Dec 22, 2025 · Updated 6 months ago
dangnh0611 / facial_verification_android
DOneLogin Android: Facial verification for Two-Factor Authentication (2FA) on the Android platform
☆11 · Mar 30, 2021 · Updated 5 years ago
tobna / TaylorShift
This repository contains the code for the paper "TaylorShift: Shifting the Complexity of Self-Attention from Squared to Linear (and Back)…
☆15 · Feb 25, 2026 · Updated 4 months ago
LinZhekai / X-Oscar
Official repository for "X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation"
☆50 · Jun 25, 2024 · Updated 2 years ago
gqa-ood / GQA-OOD
GQA-OOD is a new dataset and benchmark for the evaluation of VQA models in OOD (out of distribution) settings.
☆33 · Mar 1, 2021 · Updated 5 years ago
edhuang1 / busy-quiet-net
Code for the paper "Busy-Quiet Video Disentangling for Video Classification"
☆14 · Jan 17, 2022 · Updated 4 years ago