HKUDS / SepLLM
SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator
☆43 Updated last month
Alternatives and similar repositories for SepLLM:
Users who are interested in SepLLM are comparing it to the repositories listed below:
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View ☆46 Updated 3 months ago
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$ ☆39 Updated 3 months ago
- Accepted LLM Papers in NeurIPS 2024 ☆33 Updated 3 months ago
- ☆138 Updated 4 months ago
- Official Implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent." ☆43 Updated 2 months ago
- [NeurIPS 2024] The implementation of paper "On Softmax Direct Preference Optimization for Recommendation" ☆58 Updated 2 months ago
- ☆21 Updated 6 months ago
- ☆26 Updated 3 months ago
- Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark" ☆10 Updated 7 months ago
- State-of-the-art Parameter-Efficient MoE Fine-tuning Method ☆124 Updated 5 months ago
- ☆17 Updated 3 months ago
- Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024 ☆21 Updated 7 months ago
- Codes for Merging Large Language Models ☆28 Updated 5 months ago
- [ICML2024] "LLaGA: Large Language and Graph Assistant", Runjin Chen, Tong Zhao, Ajay Jaiswal, Neil Shah, Zhangyang Wang ☆92 Updated 4 months ago
- ☆122 Updated 6 months ago
- ☆71 Updated last month
- [NeurIPS 2024] GITA: Graph to Image-Text Integration for Vision-Language Graph Reasoning ☆45 Updated 2 months ago
- [KDD'2024] "HiGPT: Heterogenous Graph Language Models" ☆118 Updated 7 months ago
- ☆31 Updated last week
- [NeurIPS 2024] Official implementation for paper "Can Graph Learning Improve Planning in LLM-based Agents?" ☆106 Updated 2 months ago
- Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?" ☆41 Updated 2 months ago
- [ICML 2024 Oral] This project is the official implementation of our Accurate LoRA-Finetuning Quantization of LLMs via Information Retenti… ☆60 Updated 9 months ago
- [EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Infe… ☆89 Updated 2 months ago
- [ICML'2024] "LoCoCo: Dropping In Convolutions for Long Context Compression", Ruisi Cai, Yuandong Tian, Zhangyang Wang, Beidi Chen ☆16 Updated 4 months ago
- ☆55 Updated 2 months ago
- Towards Modality Generalization: A Benchmark and Prospective Analysis ☆18 Updated 2 weeks ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆90 Updated 3 months ago
- [RelKD'24] Mamba4Rec: Towards Efficient Sequential Recommendation with Selective State Space Models ☆88 Updated 3 months ago
- AnchorAttention: Improved attention for LLMs long-context training ☆203 Updated 2 weeks ago
- [NeurIPS2024] Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging ☆48 Updated last month