shreydan / VisionGPT2
Combining ViT and GPT-2 for image captioning. Trained on MS-COCO. The model was implemented mostly from scratch.
☆42 Updated last year
Alternatives and similar repositories for VisionGPT2:
Users who are interested in VisionGPT2 are comparing it to the libraries listed below.
- The training notebooks that were similar to the original script used to train TinyMistral. ☆21 Updated last year
- Video+code lecture on building nanoGPT from scratch ☆66 Updated 9 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆54 Updated 11 months ago
- From scratch implementation of a vision language model in pure PyTorch ☆207 Updated 10 months ago
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs. ☆42 Updated 10 months ago
- This project is a collection of fine-tuning scripts to help researchers fine-tune Qwen 2 VL on HuggingFace datasets. ☆64 Updated 6 months ago
- ☆126 Updated 7 months ago
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆230 Updated 4 months ago
- Set of scripts to finetune LLMs ☆37 Updated last year
- Cerule - A Tiny Mighty Vision Model ☆67 Updated 6 months ago
- minimal GRPO implementation from scratch ☆65 Updated 2 weeks ago
- Notebooks for fine tuning pali gemma ☆98 Updated 3 months ago
- Collection of autoregressive model implementation ☆83 Updated last month
- Small and Efficient Mathematical Reasoning LLMs ☆71 Updated last year
- nanogpt turned into a chat model ☆65 Updated last year
- A blueprint for creating Pretraining and Fine-Tuning datasets for Indic languages ☆105 Updated 5 months ago
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio… ☆80 Updated 10 months ago
- Complete implementation of Llama2 with/without KV cache & inference 🚀 ☆47 Updated 10 months ago
- Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs ☆38 Updated 5 months ago
- ☆14 Updated this week
- Prune transformer layers ☆68 Updated 10 months ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆91 Updated 3 weeks ago
- GPTQLoRA: Efficient Finetuning of Quantized LLMs with GPTQ ☆99 Updated last year
- model activation visualiser ☆90 Updated this week
- Easy to use, High Performant Knowledge Distillation for LLMs ☆55 Updated this week
- Maybe the new state of the art vision model? we'll see 🤷‍♂️ ☆161 Updated last year
- ☆63 Updated 6 months ago
- My fork of Allen AI's OLMo for educational purposes. ☆30 Updated 3 months ago
- ☆32 Updated last month
- LoRA and DoRA from Scratch Implementations ☆199 Updated last year