camenduru / LLaVA-colab
☆223 Updated last year
Alternatives and similar repositories for LLaVA-colab
Users that are interested in LLaVA-colab are comparing it to the libraries listed below
- Fine Tuning Multimodal LLM "Idefics 9B" on Pokemon Go Dataset available on Hugging Face. ☆19 Updated last year
- Maybe the new state of the art vision model? we'll see 🤷‍♂️ ☆166 Updated last year
- InsightSolver: Colab notebooks for exploring and solving operational issues using deep learning, machine learning, and related models. ☆101 Updated last year
- ☆712 Updated last year
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio… ☆83 Updated last year
- From scratch implementation of a vision language model in pure PyTorch ☆235 Updated last year
- ☆82 Updated last year
- A real-time video caption to conversation bot that captures frames generates captions and creates conversational responses using a Large … ☆122 Updated last year
- AI assistant that can query visual datasets, search the FiftyOne docs, and answer general computer vision questions ☆248 Updated 8 months ago
- Large Language Model (LLM) Inference API and Chatbot ☆126 Updated last year
- One click templates for inferencing Language Models ☆211 Updated 3 weeks ago
- Example code for extracting Q&A datasets from LLM's ☆82 Updated 2 years ago
- 👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + Tutorials] ☆630 Updated last year
- Embed arbitrary modalities (images, audio, documents, etc) into large language models. ☆186 Updated last year
- Parameter-efficient finetuning script for Phi-3-vision, the strong multimodal language model by Microsoft. ☆58 Updated last year
- Examples of RAG using Llamaindex with local LLMs - Gemma, Mixtral 8x7B, Llama 2, Mistral 7B, Orca 2, Phi-2, Neural 7B ☆129 Updated last year
- Data extraction with LLM on CPU ☆269 Updated last year
- Banishing LLM Hallucinations Requires Rethinking Generalization ☆276 Updated last year
- webcamGPT - chat with video stream 💬 + 📸 ☆266 Updated last year
- Fine-tune and quantize Llama-2-like models to generate Python code using QLoRA, Axolot,.. ☆64 Updated last year
- Use Grounding DINO, Segment Anything, and GPT-4V to label images with segmentation masks for use in training smaller, fine-tuned models. ☆66 Updated last year
- VisualChatGPT ☆129 Updated 2 years ago
- llama.cpp with BakLLaVA model describes what does it see ☆382 Updated last year
- Building a chatbot powered with a RAG pipeline to read, summarize and quote the most relevant papers related to the user query. ☆168 Updated last year
- LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills ☆758 Updated last year
- Fine-tuning LLMs using QLoRA ☆262 Updated last year
- Quick exploration into fine tuning florence 2 ☆330 Updated 11 months ago
- Unofficial implementation and experiments related to Set-of-Mark (SoM) 👁️ ☆88 Updated last year
- ☆54 Updated last year
- An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm. ☆37 Updated last year