ANTONIOPSD / CaptionIMG
Simple program to manually caption your images (or any other file types) so you can use them for AI training
☆36Updated 2 years ago
Alternatives and similar repositories for CaptionIMG
Users that are interested in CaptionIMG are comparing it to the libraries listed below
- Cerule - A Tiny Mighty Vision Model☆68Updated 3 months ago
- A tool to assist in the interpretation of learned features in sparse autoencoders (in particular the four SAEs trained by Joseph Bloom o…☆19Updated last year
- ☆63Updated last year
- ☆69Updated last year
- A streamlined implementation of Grounding DINO and SAM for advanced image segmentation. This lightweight solution simplifies the integrat…☆66Updated last year
- The code repository for the CURLoRA research paper. Stable LLM continual fine-tuning and catastrophic forgetting mitigation.☆53Updated last year
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio…☆85Updated last year
- (AAAI'25) Training-and-prompt Free General Painterly Image Harmonization Using image-wise attention sharing☆60Updated last year
- ☆30Updated last year
- Backtracing: Retrieving the Cause of the Query, EACL 2024 Long Paper, Findings.☆92Updated last year
- This repository holds the "Fully automated landmarking and facial segmentation on 3D photographs" files☆30Updated 2 years ago
- ☆52Updated 2 years ago
- Extract information, summarize, ask questions, and search videos using OpenAI's Vision API 🚀🎦☆62Updated 2 years ago
- [NeurIPS XAIA & Springer] Code and notebooks to paper "A Fresh Look at Sanity Checks for Saliency Maps"☆25Updated last year
- Unofficial implementation and experiments related to Set-of-Mark (SoM) 👁️☆88Updated 2 years ago
- ☆27Updated last year
- ☆15Updated 2 years ago
- Simplex Random Feature attention, in PyTorch☆75Updated 2 years ago
- This repository contains a framework for converting monocular videos into side-by-side (SBS) 3D videos. It utilizes a combination of imag…☆90Updated last year
- Gradio UI for a Cog API☆70Updated last year
- Benchmarking Vision-Language Models on OCR tasks in Dynamic Video Environments☆47Updated 11 months ago
- Image Generation API Server - Similar to https://text-generator.io but for images☆51Updated 5 months ago
- Recaption large (Web)Datasets with vllm and save the artifacts.☆52Updated last year
- [NeurIPS VLM workshop 2024] In-Context Ensemble Learning from Pseudo Labels Improves Video-Language Models for Low-Level Workflow Underst…☆23Updated 10 months ago
- Full finetuning of large language models without large memory requirements☆94Updated 4 months ago
- An implementation of Self-Extend, to expand the context window via grouped attention☆119Updated 2 years ago
- Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya☆125Updated 6 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models☆24Updated last year
- ☆137Updated last year
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch☆103Updated last year