xinyanghuang7/Basic-Visual-Language-Model

xinyanghuang7 / Basic-Visual-Language-Model

Build a simple, basic multimodal large model from scratch. 🤖

☆48

Alternatives and similar repositories for Basic-Visual-Language-Model

Users who are interested in Basic-Visual-Language-Model are comparing it to the repositories listed below.

WatchTower-Liu / VLM-learning
Building a VLM, starting from the basic modules.
☆18 · Apr 7, 2024 · Updated 2 years ago
bupt-ai-cz / ProML
Code for "Semi-supervised Domain Adaptation via Prototype-based Multi-level Learning"
☆15 · Dec 26, 2023 · Updated 2 years ago
honeyandme / knowledge
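Builds a medical-domain knowledge graph and a simple Flask-based web chatbot that uses NER to extract entities from the user's question and retrieves answers from the knowledge graph.
☆12 · Apr 25, 2023 · Updated 3 years ago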
Sanster / VLM-demos
Collect VLM models that can be tried online.
☆15 · Apr 15, 2024 · Updated 2 years ago
tpoisonooo / open-r1
Fully open reproduction of DeepSeek-R1
☆11 · Mar 24, 2025 · Updated last year
shijunbao / prompt-manager
Centrally manage all of your prompts.
☆14 · Nov 27, 2024 · Updated last year
double125 / Graph-Matching-Attention
Bilateral Cross-Modality Graph Matching Attention for Feature Fusion in Visual Question Answering
☆11 · Feb 16, 2023 · Updated 3 years ago
JDing0521 / GraphOTTER
☆21 · Dec 7, 2024 · Updated last year
TAMS-Group / tams_glass_reconstruction
Detection and Reconstruction of Transparent Objects with Infrared Projection-based RGB-D Cameras
☆13 · Jan 17, 2021 · Updated 5 years ago
ClinicalDataScience / autoPETIII
Official repository for the autoPET III challenge.
☆12 · Jan 8, 2026 · Updated 6 months ago
SWHL / TableRecognitionMetric
Compute benchmark of table structure recognition.
☆31 · Dec 2, 2025 · Updated 7 months ago
zxc123cc / BankOCRMoE
☆13 · May 28, 2025 · Updated last year
CanvaChen / chinese-llama-tokenizer
Goal: build a small, elegant, more linguistically sound llama tokenizer that supports Chinese, English, and Japanese.
☆19 · Jun 2, 2024 · Updated 2 years ago
AI-Study-Han / Mini-Llama2-Chinese
Train a mini Chinese large language model from scratch that can hold basic conversations; the model size depends on the hardware at hand.
☆66 · Aug 14, 2024 · Updated last year
poloclub / tsr-convstem
High-Performance Transformers for Table Structure Recognition Need Early Convolutions
☆45 · Apr 21, 2026 · Updated 3 months ago
prnake / kimi-deepresearch
Kimi K2 Thinking Agentic Search Unofficial Implementation
☆15 · Nov 9, 2025 · Updated 8 months ago
Freder-chen / ReasonGenRM
A simple implementation of ReasonGenRM.
☆19 · Apr 21, 2025 · Updated last year
KomeijiForce / Active_Passive_Constraint_Koishiday_2024
Koishi's Day 2024 Paper (NeurIPS 2024): An advanced persona-driven role-playing system with global faithfulness quantification and optimi…
☆13 · Oct 19, 2025 · Updated 9 months ago
CUHKWilliam / GeoManip-release
☆12 · Apr 22, 2025 · Updated last year
kq-chen / VLMEvalKit
Open-source evaluation toolkit of large vision-language models (LVLMs), support ~100 VLMs, 30+ benchmarks
☆15 · Feb 17, 2025 · Updated last year
rugvedmhatre / Multimodal-Sentiment-Analysis
This project aims to develop a robust multi-modal sentiment analysis system that integrates visual cues from images with textual data to …
☆18 · May 14, 2024 · Updated 2 years ago
Paul33333 / SFT-and-DPO
This is a detailed code demo on how to conduct Full-Param Supervised Fine-tuning (SFT) and DPO (Direct Preference Optimization)
☆21 · Jan 9, 2025 · Updated last year
lucaspk512 / vrdone
Official Implementation for ACM MM2024 paper "VrdONE: One-stage Video Visual Relation Detection".
☆12 · Nov 13, 2024 · Updated last year
roudimit / c2kd
Code for the C2KD paper (ICASSP 2023)
☆20 · May 15, 2023 · Updated 3 years ago
mxw13579 / silly-tavern-docker-starts
One-click Docker startup commands for SillyTavern.
☆17 · Jun 10, 2026 · Updated last month
BearCleverProud / MoME
Repository for Mixture of Multimodal Experts
☆52 · Aug 3, 2024 · Updated last year
dteodore / EmotionArcs
☆11 · Mar 26, 2024 · Updated 2 years ago
RobertCsordas / molecule_gen
Implementation of "Learning Deep Generative Models"
☆12 · Jun 4, 2019 · Updated 7 years ago
kq-chen / qwen-vl-utils
Helper functions for processing and integrating visual language information with Qwen-VL Series Model
☆17 · Aug 30, 2024 · Updated last year
davanstrien / huggingface-tldr
Experimental tl;dr summaries for datasets on the Hugging Face Hub!
☆10 · Apr 4, 2024 · Updated 2 years ago
JJXiangJiaoJun / cutlass_gemv
GEMV implementation with CUTLASS
☆21 · Aug 21, 2025 · Updated 11 months ago
HFAiLab / ffrecord_converters
☆12 · Feb 16, 2023 · Updated 3 years ago
DataXujing / DeepSeek-R1-Android
Deploy the DeepSeek-R1 distilled 1.5B model on Android phones.
☆24 · Feb 4, 2025 · Updated last year
safety-research / how-ai-impacts-skill-formation
Repo for measuring whether using AI tools inhibits skill formation and development
☆15 · Jan 3, 2026 · Updated 6 months ago
jshtok / StarNet
PyTorch implementation of the StarNet paper algorithm
☆10 · Jan 25, 2022 · Updated 4 years ago
bertiev / SimpleSafetyTests
☆19 · Mar 25, 2024 · Updated 2 years ago
CelVoxes / thinkR
GPT-o1-like chain of thought with local LLMs in R
☆31 · Oct 15, 2024 · Updated last year
wujinzhong / Wav2Lip_TensorRT
☆29 · Oct 1, 2023 · Updated 2 years ago
joeyc408 / TennisPoint
Tennis hawk-eye system based on monocular vision
☆15 · Oct 30, 2020 · Updated 5 years ago