q-viper / image-baker
Let's bake an image.
☆15 Updated last week
Alternatives and similar repositories for image-baker
Users who are interested in image-baker are comparing it to the libraries listed below.
- Fine tune Gemma 3 on an object detection task ☆89 Updated 4 months ago
- Inference and fine-tuning examples for vision models from 🤗 Transformers ☆162 Updated 3 months ago
- Which model is the best at object detection? Which is best for small or large objects? We compare the results in a handy leaderboard. ☆92 Updated last week
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio… ☆84 Updated last year
- Build Agentic workflows with function calling using open LLMs ☆28 Updated 3 weeks ago
- Notebooks for fine-tuning PaliGemma ☆117 Updated 7 months ago
- Take your LLM to the optometrist. ☆42 Updated this week
- Create topological graph for image segments. ☆22 Updated last year
- Creation of annotated datasets from scratch using Generative AI and Foundation Computer Vision models ☆130 Updated 2 months ago
- code for training and using chess embeddings models ☆13 Updated last year
- Official code for PEEKABOO2: Adapting Peekaboo with Segment Anything Model for Unsupervised Object Localization in Images and Videos. ☆29 Updated 3 weeks ago
- Join 15k builders to the Real-World ML Newsletter ⬇️⬇️⬇️ ☆47 Updated last year
- Use Grounding DINO, Segment Anything, and CLIP to label objects in images. ☆33 Updated last year
- Inference, Fine Tuning and many more recipes with Gemma family of models ☆274 Updated 4 months ago
- Eye exploration ☆29 Updated last week
- Practical Python exercises on classical computer vision and clean engineering practices ☆22 Updated 7 months ago
- Luth is a state-of-the-art series of fine-tuned LLMs for French ☆39 Updated last month
- Ultralytics Notebooks ☆147 Updated last week
- Using the moondream VLM with optical flow for promptable object tracking ☆71 Updated 9 months ago
- Train LLM on Hugging Face infra ☆67 Updated 2 weeks ago
- A complete PyTorch implementation of Google's Gemma3 270M language model, featuring sliding window attention, RoPE positional encoding, a… ☆42 Updated 2 months ago
- Tools for merging pretrained large language models. ☆19 Updated last year
- A collection of hands-on notebooks for LLM practitioners ☆50 Updated 10 months ago
- An integration of Segment Anything Model, Molmo, and Whisper to segment objects using voice and natural language. ☆29 Updated 9 months ago
- From scratch implementation of a vision language model in pure PyTorch ☆251 Updated last year
- Table detection with Florence. ☆15 Updated last year
- Experiment and integrate with different OCR frameworks seamlessly ☆103 Updated last year
- ☆23 Updated 10 months ago
- VLM driven tool that processes surveillance videos, extracts frames, and generates insightful annotations using a fine-tuned Florence-2 V… ☆125 Updated 5 months ago
- Seamless interface for using PyTorch distributed with Jupyter notebooks ☆56 Updated 2 months ago