OpenLMLab/scaling-rope

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/OpenLMLab/scaling-rope)

OpenLMLab / scaling-rope

code for Scaling Laws of RoPE-based Extrapolation

☆73

Alternatives and similar repositories for scaling-rope

Users that are interested in scaling-rope are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

OpenLMLab / ParallelTokenizer
View on GitHub
Use the tokenizer in parallel to achieve superior acceleration
☆20Mar 21, 2024Updated 2 years ago
OpenMOSS / Thus-Spake-Long-Context-LLM
View on GitHub
a survey of long-context LLMs from four perspectives, architecture, infrastructure, training, and evaluation
☆62Mar 31, 2025Updated last year
OpenLMLab / LongWanjuan
View on GitHub
Towards Systematic Measurement for Long Text Quality
☆39Sep 5, 2024Updated last year
xinghaow99 / prism
View on GitHub
[ICML 2026] Prism: Spectral-Aware Block-Sparse Attention
☆27May 22, 2026Updated 2 months ago
InternLM / InternLM-WQX
View on GitHub
☆19Jul 5, 2024Updated 2 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
xinghaow99 / pbs-attn
View on GitHub
[ICML 2026] Sparser Block-Sparse Attention via Token Permutation
☆31May 22, 2026Updated 2 months ago
jungokasai / T2R
View on GitHub
☆14Nov 20, 2022Updated 3 years ago
LouChao98 / nner_as_parsing
View on GitHub
☆16Mar 22, 2023Updated 3 years ago
KwanWaiChung / M4LE
View on GitHub
Code for M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark for Large Language Models
☆23Jul 27, 2024Updated last year
jenni-ai / T2FW
View on GitHub
Fine-Tuning Pre-trained Transformers into Decaying Fast Weights
☆20Oct 9, 2022Updated 3 years ago
codefuse-ai / Collinear-Constrained-Attention
View on GitHub
☆62Jun 17, 2024Updated 2 years ago
robert-lieck / RBN
View on GitHub
Recursive Bayesian Networks
☆11May 11, 2025Updated last year
tengxiaoliu / LM_skip
View on GitHub
[NeurIPS 2024] Can Language Models Learn to Skip Steps?
☆21Jan 25, 2025Updated last year
keezen / ntk_alibi
View on GitHub
NTK scaled version of ALiBi position encoding in Transformer.
☆69Aug 16, 2023Updated 2 years ago
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
VPeterV / RankSpace-Models
View on GitHub
source code for NAACL2022 main conference "Dynamic Programming in Rank Space: Scaling Structured Inference with Low-Rank HMMs and PCFGs"
☆10Sep 26, 2022Updated 3 years ago
emorynlp / seq2seq-corenlp
View on GitHub
☆13Feb 7, 2023Updated 3 years ago
0nutation / SpeechGPT2.github.io
View on GitHub
☆12Jul 23, 2024Updated 2 years ago
OpenLMLab / ChatZoo
View on GitHub
Light local website for displaying performances from different chat models.
☆86Nov 13, 2023Updated 2 years ago
DAMO-NLP-SG / CLEX
View on GitHub
[ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models
☆78Mar 12, 2024Updated 2 years ago
AntNLP / nope_head_scale
View on GitHub
☆29May 4, 2024Updated 2 years ago
VITA-Group / Data-Efficient-Scaling
View on GitHub
[ICML 2023] "Data Efficient Neural Scaling Law via Model Reusing" by Peihao Wang, Rameswar Panda, Zhangyang Wang
☆14Jan 4, 2024Updated 2 years ago
rycolab / aflt-f2023
View on GitHub
Advanced Formal Language Theory (263-5352-00L; Frühjahr 2023)
☆10Feb 21, 2023Updated 3 years ago
jquesnelle / yarn
View on GitHub
YaRN: Efficient Context Window Extension of Large Language Models
☆1,740Apr 17, 2024Updated 2 years ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
CStanKonrad / long_llama
View on GitHub
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transform…
☆1,465Nov 7, 2023Updated 2 years ago
RakitinDen / pytorch-recursive-gumbel-max-trick
View on GitHub
Leveraging Recursive Gumbel-Max Trick for Approximate Inference in Combinatorial Spaces, NeurIPS 2021
☆14Dec 11, 2021Updated 4 years ago
Twilight92z / Quantize-Watermark
View on GitHub
☆19Nov 6, 2023Updated 2 years ago
yegcjs / mixinglaws
View on GitHub
☆113Jul 15, 2025Updated last year
bigai-nlco / LooGLE
View on GitHub
ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models
☆199Oct 8, 2024Updated last year
Shark-NLP / CAB
View on GitHub
☆31Jul 2, 2023Updated 3 years ago
teffland / ner-expected-entity-ratio
View on GitHub
Implementation and experiments for Partially Supervised NER via Expected Entity Ratio in TACL 2022
☆14Nov 7, 2022Updated 3 years ago
OpenMOSS / LongLLaDA
View on GitHub
[AAAI26] LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
☆55Dec 7, 2025Updated 7 months ago
OpenLMLab / Sniffer
View on GitHub
☆27Jun 5, 2023Updated 3 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
ShannonAI / mrc-for-dependency-parsing
View on GitHub
☆18May 28, 2021Updated 5 years ago
tengxiaoliu / XoT
View on GitHub
[EMNLP 2023] Plan, Verify and Switch: Integrated Reasoning with Diverse X-of-Thoughts
☆27Nov 4, 2023Updated 2 years ago
FranxYao / Long-Context-Data-Engineering
View on GitHub
Implementation of paper Data Engineering for Scaling Language Models to 128K Context
☆502Mar 19, 2024Updated 2 years ago
sii-research / OpenMOSS
View on GitHub
OpenMOSS presents a collection of our research on LLMs, supported by SII, Fudan and Mosi.
☆30Updated this week
zsLin177 / SRL-as-GP
View on GitHub
☆18Mar 10, 2023Updated 3 years ago
bdusell / stack-attention
View on GitHub
Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"
☆18Mar 15, 2024Updated 2 years ago
OpenMOSS / rope_pp
View on GitHub
[ICLR26] Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs
☆33Dec 9, 2025Updated 7 months ago