An Experiment on Dynamic NTK Scaling RoPE
☆64 · Nov 26, 2023 · Updated 2 years ago
Alternatives and similar repositories for Consistent-DynamicNTKRoPE
Users that are interested in Consistent-DynamicNTKRoPE are comparing it to the libraries listed below
- [NeurIPS 2024] | An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding ☆22 · Oct 10, 2024 · Updated last year
- YaRN: Efficient Context Window Extension of Large Language Models ☆1,673 · Apr 17, 2024 · Updated last year
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models ☆35 · Jun 12, 2024 · Updated last year
- NTK scaled version of ALiBi position encoding in Transformer. ☆69 · Aug 16, 2023 · Updated 2 years ago
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024) ☆209 · May 20, 2024 · Updated last year
- Self-Supervised Alignment with Mutual Information ☆20 · May 24, 2024 · Updated last year
- DocEE: A Large-Scale and Fine-grained Benchmark for Document-level Event Extraction ☆41 · Apr 19, 2023 · Updated 2 years ago
- Long Context Extension and Generalization in LLMs ☆63 · Sep 21, 2024 · Updated last year
- Suri: Multi-constraint instruction following for long-form text generation (EMNLP’24) ☆27 · Oct 3, 2025 · Updated 4 months ago
- This repository contains the code for the paper: SirLLM: Streaming Infinite Retentive LLM ☆60 · May 28, 2024 · Updated last year
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models ☆48 · Mar 7, 2024 · Updated last year
- Rectified Rotary Position Embeddings ☆389 · May 20, 2024 · Updated last year
- A comprehensive benchmark for evaluating deep research agents on academic survey tasks ☆50 · Sep 4, 2025 · Updated 5 months ago
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models ☆17 · Nov 4, 2025 · Updated 3 months ago
- Modified Beam Search with periodical restart ☆12 · Sep 12, 2024 · Updated last year
- Dataset and code to reproduce the results of the paper "Evolving Structures in Complex Systems" ☆11 · Dec 16, 2019 · Updated 6 years ago
- Implement Learning Efficient Convolutional Networks Through Network Slimming on YOLOX ☆25 · Jun 9, 2022 · Updated 3 years ago
- Aligning Agentic World Models via Knowledgeable Experience Learning ☆31 · Jan 25, 2026 · Updated last month
- Official implementation of Self-Taught Agentic Long Context Understanding (ACL 2025). ☆12 · Sep 22, 2025 · Updated 5 months ago
- [ACL 2024 Findings] Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning ☆13 · Sep 2, 2024 · Updated last year
- Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025] ☆32 · Jan 23, 2025 · Updated last year
- [ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models ☆60 · Jul 23, 2024 · Updated last year
- Longitudinal Evaluation of LLMs via Data Compression ☆33 · May 29, 2024 · Updated last year
- A fork of the PEFT library, supporting Robust Adaptation (RoSA) ☆15 · Aug 16, 2024 · Updated last year
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights ☆19 · Oct 9, 2022 · Updated 3 years ago
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment ☆16 · Dec 19, 2024 · Updated last year
- Building language models to predict more than one token ahead to enable further ahead predictions. ☆12 · May 22, 2025 · Updated 9 months ago
- ☆15 · Oct 20, 2023 · Updated 2 years ago
- The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism ☆30 · Jul 17, 2024 · Updated last year
- The Variational Homoencoder: Learning to learn high capacity generative models from few examples ☆34 · Jul 13, 2023 · Updated 2 years ago
- ☆62 · Oct 29, 2024 · Updated last year
- Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation ☆90 · Nov 13, 2024 · Updated last year
- LongQLoRA: Extent Context Length of LLMs Efficiently ☆168 · Nov 12, 2023 · Updated 2 years ago
- ☆24 · May 23, 2025 · Updated 9 months ago
- ☆21 · Jul 18, 2024 · Updated last year
- ☆23 · May 21, 2025 · Updated 9 months ago
- ☆20 · Mar 3, 2025 · Updated 11 months ago
- [ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning". ☆31 · Updated this week
- Vision Large Language Models trained on M3IT instruction tuning dataset ☆17 · Aug 16, 2023 · Updated 2 years ago