video description generation vision-language model
☆21 · Jan 21, 2025 · Updated last year
Alternatives and similar repositories for SpaceTimeGPT
Users who are interested in SpaceTimeGPT are comparing it to the libraries listed below.
- Contents for the Winter 2020 SlicerMorph workshop ☆11 · Sep 14, 2020 · Updated 5 years ago
- ☆46 · May 24, 2025 · Updated 9 months ago
- ☆14 · Aug 28, 2024 · Updated last year
- GoTrack: Generic 6DoF Object Pose Refinement and Tracking, CV4MR 2025 ☆73 · Oct 14, 2025 · Updated 4 months ago
- ML from scratch in Jax ☆12 · Aug 20, 2025 · Updated 6 months ago
- ☆12 · Jan 19, 2024 · Updated 2 years ago
- Agentic Keyframe Search for Video Question Answering ☆16 · Apr 7, 2025 · Updated 11 months ago
- Phoshell: a Forth inspired, extremely lightweight, stack machine shell, implementable in _ALL_ known programming languages. ☆10 · Nov 21, 2020 · Updated 5 years ago
- [WACV 2026] PyTorch code for 4D-Animal. ☆28 · Nov 18, 2025 · Updated 3 months ago
- ☆11 · Oct 25, 2020 · Updated 5 years ago
- Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval (ICCV 2025 Highlight) ☆20 · Aug 1, 2025 · Updated 7 months ago
- [EMNLP 2024 Industry track] MERLIN: Multimodal Embedding Refinement via LLM-based Iterative Navigation for Text-Video Retrieval-Rerank P… ☆14 · Mar 4, 2025 · Updated last year
- [ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision ☆12 · Sep 17, 2023 · Updated 2 years ago
- A discord bot to play with the KittyCAD Text to CAD API. ☆15 · Feb 15, 2026 · Updated 3 weeks ago
- ☆13 · Jun 23, 2019 · Updated 6 years ago
- ☆12 · Apr 30, 2023 · Updated 2 years ago
- Code for paper "W-RAG: Weakly Supervised Dense Retrieval in RAG for Open-domain Question Answering" ☆15 · Oct 2, 2025 · Updated 5 months ago
- [ICCV'23] UATVR: Uncertainty-Adaptive Text-Video Retrieval ☆13 · Nov 5, 2023 · Updated 2 years ago
- Code for the paper: "Invertible CNN-Based Super Resolution with Downsampling Awareness" by Andrew Geiss and Joseph C. Hardin, Nov 2020 ☆12 · Nov 11, 2020 · Updated 5 years ago
- Official source code of HELM, a family of fully hyperbolic large language models ☆31 · Feb 24, 2026 · Updated 2 weeks ago
- ☆42 · Sep 15, 2025 · Updated 5 months ago
- coloring terminal text with intensities (used for plotting probability, entropy with tokens) ☆12 · Oct 11, 2024 · Updated last year
- ☆11 · Sep 28, 2023 · Updated 2 years ago
- Official PyTorch Implementation of CVPR2025 GIVEPose: Gradual Intra-class Variation Elimination for RGB-based Category-Level Object Pose E… ☆15 · Aug 9, 2025 · Updated 7 months ago
- [EMNLP 2024] TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering ☆16 · Oct 31, 2024 · Updated last year
- Collaborative inference of latent diffusion via hivemind ☆12 · May 29, 2023 · Updated 2 years ago
- SwayWM dotfiles ☆10 · Jul 21, 2020 · Updated 5 years ago
- Fine-tuning, DPO, RLHF, RLAIF on LLMs - Qwen3, Zephyr 7B GPTQ with 4-Bit Quantization, Mistral-7B-GPTQ ☆15 · Jul 5, 2025 · Updated 8 months ago
- ☆12 · Apr 6, 2021 · Updated 4 years ago
- A Qwen .5B reasoning model trained on OpenR1-Math-220k ☆14 · Oct 11, 2025 · Updated 4 months ago
- calc= is a small concatenative programming language for doing math in chats. ☆13 · Jun 25, 2022 · Updated 3 years ago
- Repository for the paper: "Birds of a Feather: Capturing Avian Shape Models from Images" ☆20 · Dec 2, 2022 · Updated 3 years ago
- A curve-editor for Stable Diffusion prompt interpolation ☆21 · Oct 3, 2022 · Updated 3 years ago
- Official repo for BWLer: Barycentric Weight Layer ☆29 · Sep 26, 2025 · Updated 5 months ago
- A programming language based on bindings. ☆12 · Jul 6, 2025 · Updated 8 months ago
- RefTeacher is a strong baseline method for Semi-Supervised Referring Expression Comprehension. ☆13 · May 26, 2023 · Updated 2 years ago
- ☆14 · Mar 31, 2025 · Updated 11 months ago
- [ICCV 2023 CLVL Workshop] Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts ☆14 · Jan 13, 2025 · Updated last year
- scale a rig from a configuration file and animate it with trajectory IK ☆18 · Feb 10, 2024 · Updated 2 years ago