thunlp / InfLLM
The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory" ☆274
Related projects:
- [ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models" ☆333
- An easy-to-use LLM quantization and inference toolkit based on the GPTQ algorithm (weight-only quantization). ☆90
- Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718 ☆244
- Memory optimization and training recipes for extrapolating language models' context length to 1 million tokens with minimal hardware. ☆608
- Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context" ☆416
- Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality s… ☆398
- OLMoE: Open Mixture-of-Experts Language Models ☆356
- LongAlign: A Recipe for Long Context Alignment Encompassing Data, Training, and Evaluation ☆194
- Implementation of the paper "LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens" ☆119
- [ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning ☆595
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Lengths (ICLR 2024) ☆195
- [ICML 2024] CLLMs: Consistency Large Language Models ☆337
- [ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning ☆539
- [ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement ☆143
- The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction ☆361
- Arena-Hard-Auto: An automatic LLM benchmark. ☆421
- [ACL 2024 Demo] Official GitHub repo for UltraEval: An open-source framework for evaluating foundation models. ☆209
- [ACL 2024] Progressive LLaMA with Block Expansion. ☆464
- Official repository for LongChat and LongEval ☆505
- To speed up long-context LLMs' inference, attention is computed with approximate and dynamic sparsity, reducing inference latency by up t… ☆698
- Code for compression methods for transformers, accompanying our publications ☆356
- REST: Retrieval-Based Speculative Decoding, NAACL 2024 ☆158
- A self-alignment method and benchmark for role-play. Resources for "Large Language Models are Superpositions of All Characters… ☆153
- The homepage of the OneBit model quantization framework. ☆136
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs). ☆232
- [EMNLP 2023] Adapting Language Models to Compress Long Contexts ☆268