BestAnHongjun / SentenceVAE
Enable Next-sentence Prediction for Large Language Models with Faster Speed, Higher Accuracy and Longer Context
☆25 · Updated 6 months ago
Alternatives and similar repositories for SentenceVAE:
Users who are interested in SentenceVAE are comparing it to the repositories listed below:
- ☆32 · Updated last month
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆89 · Updated last month
- [NeurIPS-2024] Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆77 · Updated 4 months ago
- ☆61 · Updated last month
- ☆64 · Updated 2 weeks ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs ☆40 · Updated 7 months ago
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation" ☆35 · Updated 4 months ago
- [ICLR 2025] SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights ☆53 · Updated last week
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆81 · Updated last week
- ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration ☆23 · Updated last month
- Code for paper "Patch-Level Training for Large Language Models" ☆80 · Updated 3 months ago
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models ☆43 · Updated 2 weeks ago
- ☆47 · Updated last year
- The official implementation of Self-Exploring Language Models (SELM) ☆61 · Updated 8 months ago
- [ArXiv] V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding ☆29 · Updated 2 months ago
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension. ☆63 · Updated 8 months ago
- [ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model ☆33 · Updated 3 months ago
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context" ☆29 · Updated 7 months ago
- ☆12 · Updated last month
- HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models ☆38 · Updated 2 months ago
- Official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆29 · Updated last month
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper ☆28 · Updated 8 months ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling ☆44 · Updated last month
- [ICLR 2025] Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision ☆59 · Updated 7 months ago
- [ICML'24] The official implementation of "Rethinking Optimization and Architecture for Tiny Language Models" ☆120 · Updated last month
- [ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models ☆76 · Updated 11 months ago