PyTorch Implementation of Object Recognition as Next Token Prediction [CVPR'24 Highlight]
☆180May 1, 2025Updated 10 months ago
Alternatives and similar repositories for nxtp
Users that are interested in nxtp are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [ECCV 2024] WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation☆112Feb 6, 2025Updated last year
- Code of the paper "Efficient Object Detection in Autonomous Driving using Spiking Neural Networks: Performance, Energy Consumption Analys…☆27Dec 13, 2023Updated 2 years ago
- ☆35Jan 23, 2024Updated 2 years ago
- This repo contains the code for our paper Towards Open-Ended Visual Recognition with Large Language Model☆100Jul 15, 2024Updated last year
- Code for this paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork"☆33Nov 29, 2023Updated 2 years ago
- Proton VPN Special Offer - Get 70% off • AdSpecial partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
- (CVPR2023) CAPE: Camera View Position Embedding for Multi-View 3D Object Detection☆110May 5, 2023Updated 2 years ago
- Official implementation of "Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models"☆37Jan 3, 2024Updated 2 years ago
- Adapting LLaMA Decoder to Vision Transformer☆30May 20, 2024Updated last year
- ☆27Aug 28, 2023Updated 2 years ago
- [ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"☆894Aug 13, 2024Updated last year
- (ICCV 2023) Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation☆48Jul 18, 2024Updated last year
- [CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses tha…☆951Aug 5, 2025Updated 7 months ago
- [ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers"☆245Jan 17, 2024Updated 2 years ago
- [Pattern Recognition 2024] Semantic-Aware Frame-Event Fusion based Pattern Recognition via Large Vision-Language Models, Dong Li, Jiandon…☆18Jan 18, 2025Updated last year
- Wordpress hosting with auto-scaling on Cloudways • AdFully Managed hosting built for WordPress-powered businesses that need reliable, auto-scalable hosting. Cloudways SafeUpdates now available.
- A detection/segmentation dataset with labels characterized by intricate and flexible expressions. "Described Object Detection: Liberating…☆137Mar 20, 2024Updated 2 years ago
- Official Pytorch Implementation of Self-emerging Token Labeling