kyegomez / Paper-Implementation-Template
A simple reproducible template to implement AI research papers
☆22Updated 4 months ago
Alternatives and similar repositories for Paper-Implementation-Template:
Users that are interested in Paper-Implementation-Template are comparing it to the libraries listed below
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"☆96Updated 4 months ago
- Pytorch Implementation of the paper: "Learning to (Learn at Test Time): RNNs with Expressive Hidden States"☆24Updated this week
- Simple Implementation of TinyGPTV in super simple Zeta lego blocks☆15Updated 2 months ago
- Pytorch Implementation of the Model from "MIRASOL3B: A MULTIMODAL AUTOREGRESSIVE MODEL FOR TIME-ALIGNED AND CONTEXTUAL MODALITIES"☆26Updated this week
- Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Stron…☆16Updated 2 months ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs☆40Updated 7 months ago
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding"☆19Updated this week
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*☆80Updated 2 weeks ago
- [NeurIPS2024] VideoGUI: A Benchmark for GUI Automation from Instructional Videos☆27Updated last month
- Enable Next-sentence Prediction for Large Language Models with Faster Speed, Higher Accuracy and Longer Context☆25Updated 5 months ago
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)☆149Updated last month
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs☆72Updated 3 months ago
- [Under Review] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with enla…☆48Updated 3 months ago
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"☆79Updated this week
- OLA-VLM: Elevating Perception in Multimodal LLMs with Auxiliary Embedding Distillation, arXiv 2024☆47Updated this week
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture☆188Updated 3 weeks ago
- Code for paper "Patch-Level Training for Large Language Models"☆77Updated 2 months ago
- Democratization of "PaLI: A Jointly-Scaled Multilingual Language-Image Model"☆88Updated 10 months ago
- ✨✨Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models☆147Updated last month
- Language Repository for Long Video Understanding☆31Updated 7 months ago
- ☆31Updated last week
- FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation☆46Updated 6 months ago
- Explorations into improving ViTArc with Slot Attention☆37Updated 3 months ago
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents"☆41Updated this week
- [CVPR'24] Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities☆98Updated 10 months ago
- ☆50Updated 3 weeks ago
- Explore the Limits of Omni-modal Pretraining at Scale☆96Updated 4 months ago
- Code for “Pretrained Language Models as Visual Planners for Human Assistance”☆59Updated last year
- Official PyTorch Implementation for Task Vectors are Cross-Modal☆21Updated last month
- [TMLR] Public code repo for paper "A Single Transformer for Scalable Vision-Language Modeling"☆128Updated 2 months ago