nssharmaofficial / image-caption-generator
Image captioning model with ResNet50 encoder and LSTM decoder
☆18Sep 6, 2024Updated last year
Alternatives and similar repositories for image-caption-generator
Users that are interested in image-caption-generator are comparing it to the libraries listed below
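- Southeast University Computer Science, Software, and AI (mainly the School of Artificial Intelligence at Southeast University): selected course projects and experiments. Part of the course experiment in SEU AI. Include: Computer Graphics - Knowledge Engineering - Network a…☆16Jul 3, 2024Updated last year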
- ☆115Nov 2, 2023Updated 2 years ago
- Paint by Example: Exemplar-based Image Editing with Diffusion Models☆1,247Nov 28, 2023Updated 2 years ago
- Codes for ID-Specific Video Customized Diffusion☆462Feb 22, 2024Updated last year
- Official Pytorch Implementation of DenseDiffusion (ICCV 2023)☆500Nov 14, 2023Updated 2 years ago
- Vision Transformer (ViT) in PyTorch☆845Mar 2, 2022Updated 3 years ago
- Build your own generative UI chatbot using the Vercel AI SDK and Google Gemini☆1,289Dec 6, 2025Updated 2 months ago
- An AI personal tutor built with Llama 3.1☆1,957Dec 15, 2025Updated last month
- Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence L…☆2,555Apr 24, 2024Updated last year
- [NeurIPS2025] "AI-Researcher: Autonomous Scientific Innovation" -- A production-ready version: https://novix.science/chat☆4,380Oct 16, 2025Updated 3 months ago
- CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image☆32,562Jul 23, 2024Updated last year
- DeepSeek-VL: Towards Real-World Vision-Language Understanding☆4,066Apr 24, 2024Updated last year
- Speech To Speech: an effort for an open-sourced and modular GPT4-o☆4,416Feb 6, 2026Updated last week
- Flickr-Faces-HQ Dataset (FFHQ)☆4,099Nov 18, 2022Updated 3 years ago
- Turn any webpage into structured data using LLMs☆6,183Updated this week
- 🚀🎉📚 SaaS Boilerplate built with Next.js + Tailwind CSS + Shadcn UI + TypeScript. ⚡️ Full-stack React application with Auth, Multi-tena…☆6,810Updated this week
- PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities. The most effective open source solution to turn …☆7,170Mar 3, 2025Updated 11 months ago
- Inpaint anything using Segment Anything and inpainting models.☆7,591Feb 29, 2024Updated last year
- Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion☆7,751Dec 8, 2022Updated 3 years ago
- An AI-powered search engine with a generative UI☆8,558Updated this week
- A faster pytorch implementation of faster r-cnn☆7,867May 20, 2022Updated 3 years ago
- [ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"☆9,694Aug 12, 2024Updated last year
- Running large language models on a single GPU for throughput-oriented scenarios.☆9,384Oct 28, 2024Updated last year
- 🍦 Never use print() to debug again.☆10,013Jan 21, 2026Updated 3 weeks ago
- The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑🔬☆12,048Dec 19, 2025Updated last month
- Replace 'hub' with 'ingest' in any GitHub URL to get a prompt-friendly extract of a codebase☆13,875Feb 7, 2026Updated last week
- 📋 A list of open LLMs available for commercial use.☆12,632Feb 13, 2025Updated last year
- 🍀 Pytorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which is helpful to further understand papers.…☆12,159Dec 6, 2024Updated last year
- Easy-to-use and powerful LLM and SLM library with awesome model zoo.☆12,914Dec 17, 2025Updated last month
- An open source implementation of CLIP.☆13,353Nov 4, 2025Updated 3 months ago
- Implement a ChatGPT-like LLM in PyTorch from scratch, step by step☆85,210Updated this week
- Amplication brings order to the chaos of large-scale software development by creating Golden Paths for developers - streamlined workflows…☆16,007Updated this week
- Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect , Segment and …☆17,397Sep 5, 2024Updated last year
- OCR, layout analysis, reading order, table recognition in 90+ languages☆19,228Feb 4, 2026Updated last week
- Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"☆20,663Oct 17, 2025Updated 3 months ago
- A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone☆23,054Feb 6, 2026Updated last week
- Python logging made (stupidly) simple☆23,584Jan 15, 2026Updated 3 weeks ago
- Graph Neural Network Library for PyTorch☆23,469Updated this week
- Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Py…☆24,993Updated this week