Small Multimodal Vision Model "Imp-v1-3b" trained using Phi-2 and Siglip.
☆17Feb 5, 2024Updated 2 years ago
Alternatives and similar repositories for Small-Multimodal-Vision-Model
Users that are interested in Small-Multimodal-Vision-Model are comparing it to the libraries listed below.
- Live audio chats with AI using Groq Llama3-70b and Deepgram Voice☆32Apr 24, 2024Updated 2 years ago
- PaliGemma Inference and Fine Tuning☆13May 15, 2024Updated 2 years ago
- LLM Applications built using Streamlit, LangChain, and OpenAI API☆11Oct 7, 2023Updated 2 years ago
- A simple demo application showcasing the power of Gemini 1.5 Pro's video understanding capabilities.☆31May 24, 2024Updated 2 years ago
- Simple Chainlit UI for running llms from Groq and LangChain☆17Feb 28, 2024Updated 2 years ago
- Agent with vision ability via llava & autogen☆75Oct 16, 2023Updated 2 years ago
- Implementation of "Look, Listen and Recognise: character-aware audio-visual subtitling"☆21Nov 3, 2025Updated 7 months ago
- Make tool-calling schemas for existing tools☆14Mar 8, 2025Updated last year
- ☆17Oct 4, 2025Updated 8 months ago
- ☆18Jun 11, 2024Updated last year
- [ICLR 2025] Large (Vision) Language Models are Unsupervised In-Context Learners☆22Jun 6, 2025Updated last year
- ☆25Aug 10, 2024Updated last year
- 📝The official repository of "Rethinking Cross-Generator Image Forgery Detection through DINOv3"☆25Dec 2, 2025Updated 6 months ago
- Ask anything☆24May 16, 2024Updated 2 years ago
- Democratizing Function Calling Capabilities for Open-Source Language Models☆43May 5, 2024Updated 2 years ago
- ☆14Nov 22, 2024Updated last year
- A proxy for Google Bard LLM☆10Nov 2, 2023Updated 2 years ago
- a family of highly capable yet efficient large multimodal models☆193Aug 23, 2024Updated last year
- ☆15Mar 6, 2026Updated 3 months ago
- Prompt Framework made to optimise conversation with LLMs.☆12May 30, 2023Updated 3 years ago
- ☆18Mar 26, 2022Updated 4 years ago
- Character Grounding and Re-Identification in Story of Videos and Text Descriptions☆10Jan 17, 2021Updated 5 years ago
- A guide to structured generation using constrained decoding☆18Jun 9, 2024Updated 2 years ago
- The browser extension of SydneyQt that enables multiple shortcuts, including resolving CAPTCHAs automatically.☆10Jan 27, 2024Updated 2 years ago
- The official code for "TextRefiner: Internal Visual Feature as Efficient Refiner for Vision-Language Models Prompt Tuning" | [AAAI2025]☆53Mar 13, 2025Updated last year
- A simple dify bot☆34Apr 16, 2025Updated last year
- Retrieval Augmented Generation, but no servers involved. Backed by S3☆12Nov 3, 2023Updated 2 years ago
- ⚡ Building applications with LLMs through composability ⚡☆14Mar 10, 2023Updated 3 years ago
- Local character AI chatbot with chroma vector store memory and some scripts to process documents for Chroma☆35Oct 7, 2024Updated last year
- Gemini Bot is a Telegram chatbot powered by Vertex AI's generative models. This Python implementation utilizes the Telethon library to in…☆12Feb 17, 2024Updated 2 years ago
- A drag-and-drop-enabled, responsive envelope graph for shaping a wave with attack, decay, sustain and release☆11Jan 5, 2023Updated 3 years ago
- rUv-Engineer - lets you describe UI using your imagination, then see it rendered live.☆13Sep 28, 2024Updated last year
- Prompt Engineering for Large Language Models - Notebooks, Demos, Exercises, and Projects☆24Sep 14, 2023Updated 2 years ago
- ☆88Mar 7, 2024Updated 2 years ago
- Various test models in WNNX format. They can be viewed with `pip install wnetron && wnetron`☆12Jun 22, 2022Updated 3 years ago
- A ComfyUI custom node by BillBum for using api gen (VLM LLM T2I API Tools)☆10May 26, 2026Updated 2 weeks ago
- ☆12Jun 11, 2024Updated last year
- Rivet plugin for integration with Ollama, the tool for running LLMs locally easily☆43Jun 5, 2025Updated last year
- Dynamic vision-guided speaker embedding for audio-visual speaker diarization☆12Jul 5, 2022Updated 3 years ago