BestAnHongjun / SentenceVAE
Enable Next-sentence Prediction for Large Language Models with Faster Speed, Higher Accuracy and Longer Context
☆23Updated 5 months ago
Alternatives and similar repositories for SentenceVAE:
Users who are interested in SentenceVAE are comparing it to the repositories listed below
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*☆68Updated this week
- A Framework for Decoupling and Assessing the Capabilities of VLMs☆40Updated 6 months ago
- ☆47Updated last year
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623☆75Updated 3 months ago
- MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria☆59Updated 3 months ago
- The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo…☆67Updated last month
- Touchstone: Evaluating Vision-Language Models by Language Models☆80Updated last year
- FocusLLM: Scaling LLM’s Context by Parallel Decoding☆34Updated last month
- ☆15Updated 5 months ago
- The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"☆117Updated last week
- MATH-Vision dataset and code to measure Multimodal Mathematical Reasoning capabilities.☆78Updated 3 months ago
- A huge dataset for Document Visual Question Answering☆15Updated 5 months ago
- ☆37Updated 2 months ago
- [MM2024, oral] "Self-Supervised Visual Preference Alignment" https://arxiv.org/abs/2404.10501☆46Updated 5 months ago
- Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073☆26Updated 6 months ago
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"☆28Updated 6 months ago
- ☆49Updated last week
- Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models☆76Updated 6 months ago
- Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs☆22Updated 3 months ago
- MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. A comprehensive evaluation of multimodal large model multilingua…☆50Updated last month
- [ACL 2023] PuMer: Pruning and Merging Tokens for Efficient Vision Language Models☆28Updated 3 months ago
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"☆33Updated 3 months ago
- ☆26Updated 4 months ago
- [ArXiv] V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding☆17Updated last month
- [TMLR] Public code repo for paper "A Single Transformer for Scalable Vision-Language Modeling"☆127Updated 2 months ago
- Code for paper "Patch-Level Training for Large Language Models"☆75Updated 2 months ago
- ☆28Updated this week
- PyTorch implementation of StableMask (ICML'24)☆12Updated 6 months ago
- Codebase for Instruction Following without Instruction Tuning☆33Updated 3 months ago
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning☆29Updated last year