NSTiwari / PaliGemma-Android-HF

This repository implements inference with the PaliGemma vision-language model on Android, using the Hugging Face Gradio Client API, for tasks such as zero-shot object detection, image captioning, and visual question answering.

☆19

Alternatives and similar repositories for PaliGemma-Android-HF

Users who are interested in PaliGemma-Android-HF compare it to the repositories listed below.

adithya-s-k / YoloGemma
Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio…
☆82 Updated last year
camenduru / MoE-LLaVA-jupyter
☆16 Updated last year
deep-diver / Vid2Persona
This project breathes life into video characters by using AI to describe their personality and then chat with you as them.
☆47 Updated last year
camenduru / ShareGPT4V-colab
☆31 Updated last year
Aesthisia / LLMinator
Gradio-based tool to run open-source LLMs directly from Hugging Face
☆94 Updated last year
camenduru / MiniGPT-v2-colab
☆29 Updated last year
bdambrosio / AllTheWorldAPlay
All the world is a play, we are but actors in it.
☆50 Updated 2 weeks ago
martintomov / comfy-anything
Community ComfyUI workflows running on fal.ai
☆58 Updated 11 months ago
gradio-app / sambanova-gradio
☆21 Updated 9 months ago
JakeFurtaw / Chat-RAG
Advanced coding AI assistant that uses a Gradio interface to stream coding-related responses. ChatRAG supports local and API inference an…
☆22 Updated 3 months ago
mounta11n / plusplus-camall
After my server ui improvements were successfully merged, consider this repo a playground for experimenting, tinkering and hacking around…
☆54 Updated 11 months ago
AK391 / dailypapersHN
☆86 Updated 10 months ago
shubham0204 / Segment-Anything-Android
An Android app running inference on Meta's Segment-Anything (SAM) and SAM v2
☆43 Updated 6 months ago
QuixiAI / generate
☆28 Updated last year
tensoic / Cerule
Cerule - A Tiny Mighty Vision Model
☆66 Updated 11 months ago
camenduru / autocaption-colab
☆19 Updated last year
ritabratamaiti / AnyModal
AnyModal is a Flexible Multimodal Language Model Framework for PyTorch
☆101 Updated 7 months ago
cocktailpeanut / hallucinator
☆51 Updated 9 months ago
camenduru / Multi-LoRA-Composition-jupyter
☆13 Updated last year
diicellman / dynamite-dogs
BH hackathon
☆14 Updated last year
Doriandarko / Moondream2-streamlit
☆80 Updated last year
AI-ANK / c3-python-nostream
Python Server for C3 AI app. A project that brings the power of Large Language Models (LLM) and Retrieval-Augmented Generation (RAG) with…
☆24 Updated last year
AIAnytime / Small-Multimodal-Vision-Model
Small Multimodal Vision Model "Imp-v1-3b" trained using Phi-2 and SigLIP.
☆17 Updated last year
camenduru / Depth-Anything-jupyter
☆11 Updated last year
mzbac / mlx-chat-ui
Hugging Face chat-ui integration with the mlx-lm server
☆60 Updated last year
gkamradt / FineTuningClone
☆37 Updated last year
fofr / animate
☆12 Updated last year
teknium1 / ShareGPT-Builder
☆116 Updated 7 months ago
camenduru / sdxl-turbo-colab
☆79 Updated last year
kyegomez / NeoSapiens
The next evolution of Agents
☆48 Updated 2 weeks ago