AviSoori1x/seemore

From scratch implementation of a vision language model in pure PyTorch

☆260

Alternatives and similar repositories for seemore

Users that are interested in seemore are comparing it to the libraries listed below.

adithya-s-k / YoloGemma
Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio…
☆88 · May 29, 2024 · Updated 2 years ago
hkproj / triton-flash-attention
☆257 · Jan 2, 2025 · Updated last year
MaLA-LM / GlotEval
GlotEval: a unified evaluation toolkit designed to benchmark multilingual Large Language Models (LLMs) in a language-specific way
☆18 · Nov 4, 2025 · Updated 8 months ago
hkproj / mistral-src-commented
Reference implementation of Mistral AI 7B v0.1 model.
☆28 · Dec 25, 2023 · Updated 2 years ago
MILVLG / imp
a family of highly capable yet efficient large multimodal models
☆194 · Aug 23, 2024 · Updated last year
UBC-NLP / peacock
This is the official repository for Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks.
☆26 · Dec 9, 2024 · Updated last year
triple-mu / Stable-Diffusion-TensorRT
Stable Diffusion in TensorRT 8.5+
☆15 · Mar 19, 2023 · Updated 3 years ago
AviSoori1x / makeMoE
From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)
☆811 · Oct 30, 2024 · Updated last year
pierrel55 / llama_st
Load and run Llama from safetensors files in C
☆15 · Oct 24, 2024 · Updated last year
hkproj / pytorch-paligemma
Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation: https://www.youtube.com/watch?v=vAmKB7iPkWw
☆625 · Dec 6, 2024 · Updated last year
oKatanaaa / lima-gui
A simple GUI utility for gathering LIMA-like chat data.
☆23 · Oct 6, 2025 · Updated 9 months ago
merveenoyan / smol-vision
Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜
☆1,966 · May 26, 2026 · Updated 2 months ago
JulienGenovese / JulienGenovese
In this repository we have all the codes that we have developed
☆12 · Sep 13, 2023 · Updated 2 years ago
naklecha / llama3-from-scratch
llama3 implementation one matrix multiplication at a time
☆15,224 · May 23, 2024 · Updated 2 years ago
microsoft / Samba
[ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
☆966 · Nov 16, 2025 · Updated 8 months ago
huggingface / nanoVLM
The simplest, fastest repository for training/finetuning small-sized VLMs.
☆4,972 · Oct 27, 2025 · Updated 9 months ago
m87-labs / moondream
tiny vision language model
☆9,891 · Apr 20, 2026 · Updated 3 months ago
kadirnar / diffusersplus
This project is under development.
☆23 · Aug 20, 2023 · Updated 2 years ago
facok / florence2-ft-simple
finetune your florence2 model easy
☆18 · Jul 8, 2024 · Updated 2 years ago
XiaoduoAILab / XmodelVLM
☆68 · Jun 20, 2024 · Updated 2 years ago
Respaired / RiFornet_Vocoder
a Neural Vocoder supporting Ring Attention, Conformer and NSF.
☆25 · Aug 1, 2025 · Updated 11 months ago
gokayfem / awesome-vlm-architectures
Famous Vision Language Models and Their Architectures
☆1,286 · Jan 11, 2026 · Updated 6 months ago
SRSWTI / axis
AI eXplainable Inference & Search. Open Sourcing on-premise, ultra-fast latency intelligence to all.
☆37 · Feb 28, 2025 · Updated last year
nikhilroxtomar / Brain-Tumor-Segmentation-using-UNETR-in-TensorFlow
This repository demonstrates the utilization of UNETR for brain tumor segmentation.
☆11 · Feb 23, 2024 · Updated 2 years ago
WatchTower-Liu / VLM-learning
Building a VLM model starts from the basic module.
☆18 · Apr 7, 2024 · Updated 2 years ago
cg123 / bitnet
Modeling code for a BitNet b1.58 Llama-style model.
☆25 · Apr 30, 2024 · Updated 2 years ago
JINO-ROHIT / ml-math-in-depth
☆15 · Jul 25, 2025 · Updated last year
MekkCyber / TritonAcademy
A repository to unravel the language of GPUs, making their kernel conversations easy to understand
☆208 · Jun 1, 2025 · Updated last year
imagegridworth / IG-VLM
☆138 · Sep 29, 2024 · Updated last year
NielsRogge / Transformers-Tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
☆11,685 · Apr 20, 2026 · Updated 3 months ago
andimarafioti / florence2-finetuning
Quick exploration into fine tuning florence 2
☆340 · Sep 19, 2024 · Updated last year
nivibilla / build-nanogpt
Video+code lecture on building nanoGPT from scratch
☆65 · Jun 14, 2024 · Updated 2 years ago
wenlai-lavine / jola
Code for ICML 2025 paper | Joint Localization and Activation Editing for Low-Resource Fine-Tuning
☆28 · Jun 18, 2025 · Updated last year
TinyLLaVA / TinyLLaVA_Factory
A Framework of Small-scale Large Multimodal Models
☆995 · Updated this week
EswarDivi / NarrateIt
https://narrateit.streamlit.app/
☆39 · Jan 2, 2025 · Updated last year
mi92 / reverse-image-rag
☆15 · Jul 8, 2024 · Updated 2 years ago
mbzuai-oryx / Video-LLaVA
PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models
☆264 · Aug 5, 2025 · Updated 11 months ago
GasolSun36 / SURf
[EMNLP 2024] SURf: Teaching Large Vision-Language Models to Selectively Utilize Retrieved Information
☆11 · Oct 11, 2024 · Updated last year
linkedin / Liger-Kernel
Efficient Triton Kernels for LLM Training
☆6,539 · Updated this week