roboflow / cog-vlm-client
Simple CogVLM client script
⭐13 Updated last year
Alternatives and similar repositories for cog-vlm-client
Users who are interested in cog-vlm-client are comparing it to the libraries listed below.
- Unofficial implementation and experiments related to Set-of-Mark (SoM) ⭐87 Updated 2 years ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ⭐23 Updated last year
- ⭐14 Updated last year
- ⭐11 Updated 2 years ago
- Evaluate the performance of computer vision models and prompts for zero-shot models (Grounding DINO, CLIP, BLIP, DINOv2, ImageBind, model… ⭐37 Updated 2 years ago
- ⭐29 Updated last year
- BH hackathon ⭐13 Updated last year
- ⭐17 Updated last year
- GPT-4V(ision) module for use with Autodistill. ⭐25 Updated last year
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio… ⭐84 Updated last year
- Cerule - A Tiny Mighty Vision Model ⭐67 Updated last week
- Use Florence 2 to auto-label data for use in training fine-tuned object detection models. ⭐67 Updated last year
- EdgeSAM model for use with Autodistill. ⭐29 Updated last year
- A collection of notebooks for the Hugging Face blog series (https://huggingface.co/blog). ⭐45 Updated last year
- Gradio UI for a Cog API ⭐69 Updated last year
- ⭐20 Updated last year
- GRDN.AI app for garden optimization ⭐70 Updated last year
- Use Grounding DINO, Segment Anything, and GPT-4V to label images with segmentation masks for use in training smaller, fine-tuned models. ⭐65 Updated last year
- Implementation of the premier Text to Video model from OpenAI ⭐55 Updated last year
- This project breathes life into video characters by using AI to describe their personality and then chat with you as them. ⭐48 Updated last year
- Finetune any model on HF in less than 30 seconds ⭐55 Updated 3 weeks ago
- Brainwave is a state-of-the-art neural decoder that transforms electroencephalogram (EEG) and brain signals into multimodal outputs inclu… ⭐12 Updated last month
- AgentParse is a high-performance parsing library designed to map various structured data formats (such as Pydantic models, JSON, YAML, an… ⭐16 Updated last month
- Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta ⭐16 Updated last year
- ⭐47 Updated last year
- Demo python script app to interact with llama.cpp server using whisper API, microphone and webcam devices. ⭐45 Updated 2 years ago
- Cog wrapper for Vchitect/SEINE ⭐37 Updated last year
- Using multiple LLMs for ensemble Forecasting ⭐16 Updated last year
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data ⭐21 Updated last year
- Integrate an LLM copilot within your Keras model development workflow ⭐28 Updated 2 years ago