ant-research/long-context-modeling

Research work aimed at addressing the problem of modeling infinite-length context

☆50

Alternatives and similar repositories for long-context-modeling

Users who are interested in long-context-modeling are comparing it to the repositories listed below.

sustcsonglin / second-order-neural-dmv
Source code for the COLING 2020 paper "Second-Order Unsupervised Neural Dependency Parsing"
☆16 · Oct 24, 2022 · Updated 3 years ago
whyNLP / Probabilistic-Transformer
A probabilistic model for contextual word representation. Accepted to ACL 2023 Findings.
☆26 · Oct 22, 2023 · Updated 2 years ago
shawntan / stickbreaking-attention
Stick-breaking attention
☆63 · Jul 1, 2025 · Updated last year
bdusell / stack-attention
Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"
☆18 · Mar 15, 2024 · Updated 2 years ago
LouChao98 / neural_based_dmv
☆22 · Apr 14, 2020 · Updated 6 years ago
WinnieHAN / mndmv
☆12 · Mar 4, 2022 · Updated 4 years ago
wangxinyu0922 / Second_Order_Parsing
[ACL 2019/AACL 2020] Second-Order Syntactic/Semantic Dependency Parsing With Mean Field Variational Inference (PyTorch)
☆14 · Oct 22, 2020 · Updated 5 years ago
acosharma / elita-transformer
Official Repository for Efficient Linear-Time Attention Transformers.
☆18 · Jun 2, 2024 · Updated 2 years ago
huangyuxiang03 / Locret
☆14 · Oct 3, 2024 · Updated last year
mit-han-lab / flash-moba
☆251 · Nov 19, 2025 · Updated 8 months ago
JRC1995 / Continuous-RvNN
Official Repository for "Modeling Hierarchical Structures with Continuous Recursive Neural Networks" (ICML 2021)
☆12 · Aug 18, 2021 · Updated 4 years ago
sail-sg / SkyLadder
The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling
☆43 · Dec 29, 2025 · Updated 6 months ago
LouChao98 / nner_as_parsing
☆16 · Mar 22, 2023 · Updated 3 years ago
whyNLP / PCCoT
Parallel Continuous Chain-of-Thought with Jacobi Iteration. Accepted to EMNLP 2025.
☆23 · Mar 29, 2026 · Updated 3 months ago
whyNLP / LCKV
Layer-Condensed KV cache with 10 times larger batch size, fewer params and less computation. Dramatic speedup with better task performance…
☆157 · Apr 7, 2025 · Updated last year
zhaoyd1 / Dep_Transformer_Grammars
☆16 · Oct 16, 2024 · Updated last year
Tencent-Hunyuan / HiLS-Attention
Official code for HiLS-Attention
☆124 · Jul 14, 2026 · Updated last week
thunlp / APB
Official Implementation of APB (ACL 2025 main Oral) and Spava (ACL 2026 main).
☆37 · Apr 6, 2026 · Updated 3 months ago
VITA-Group / LoCoCo
[ICML 2024] "LoCoCo: Dropping In Convolutions for Long Context Compression", Ruisi Cai, Yuandong Tian, Zhangyang Wang, Beidi Chen
☆17 · Sep 7, 2024 · Updated last year
sustcsonglin / TN-PCFG
Source code for NAACL 2021 "PCFGs Can Do Better: Inducing Probabilistic Context-Free Grammars with Many Symbols" and ACL 2021 main conferenc…
☆52 · Mar 28, 2025 · Updated last year
thunlp / hybrid-linear-attention
Code and models for the paper: Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long …
☆40 · Apr 9, 2026 · Updated 3 months ago
goombalab / raven
☆78 · May 29, 2026 · Updated last month
Lyun0912-wu / LongAttn
LongAttn: Selecting Long-context Training Data via Token-level Attention
☆15 · Jul 16, 2025 · Updated last year
tilde-research / nsa-release
An efficient implementation of the NSA (Native Sparse Attention) kernel
☆133 · Jun 24, 2025 · Updated last year
Alibaba-NLP / AIN
Code for our EMNLP 2020 Paper "AIN: Fast and Accurate Sequence Labeling with Approximate Inference Network"
☆19 · Nov 14, 2022 · Updated 3 years ago
assafbk / OPRM
Overflow Prevention Enhances Long-Context Recurrent LLMs (COLM 2025)
☆18 · Jul 8, 2025 · Updated last year
BryceZhuo / PolyCom
The official implementation of ICLR 2025 paper "Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models".
☆18 · Apr 25, 2025 · Updated last year
sustcsonglin / span-based-dependency-parsing
Source code for ACL 2022 "Headed-Span-Based Projective Dependency Parsing" and "Combining (second-order) graph-based and headed-span-based …
☆16 · Jan 12, 2023 · Updated 3 years ago
da03 / criticize_text_generation
A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based …
☆12 · Mar 18, 2023 · Updated 3 years ago
liushulinle / UloRL
An Ultra-Long Output Reinforcement Learning Approach
☆23 · Jul 31, 2025 · Updated 11 months ago
SalesforceAIResearch / PretrainRL-pipeline
An automated data pipeline scaling RL to pretraining levels
☆76 · Jun 2, 2026 · Updated last month
howard-hou / RWKV-X
RWKV-X is a Linear Complexity Hybrid Language Model based on the RWKV architecture, integrating Sparse Attention to improve the model's l…
☆59 · Mar 31, 2026 · Updated 3 months ago
RodkinIvan / associative-recurrent-memory-transformer
[ICML 24 NGSM workshop] Associative Recurrent Memory Transformer implementation and scripts for training and evaluation
☆66 · Mar 12, 2026 · Updated 4 months ago
deep-spin / OpenNMT-entmax
☆15 · May 14, 2019 · Updated 7 years ago
sustcsonglin / disco-pointer
Official Implementation of ACL 2023: Don't Parse, Choose Spans! Continuous and Discontinuous Constituency Parsing via Autoregressive Span …
☆14 · Aug 25, 2023 · Updated 2 years ago
MikeWangWZHL / dymu
☆29 · May 13, 2025 · Updated last year
tukw / unsupervised-parsing-tutorial
Unsupervised Natural Language Parsing (Tutorial)
☆22 · Apr 19, 2021 · Updated 5 years ago
PKU-ML / LongPPL
Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?"
☆116 · Oct 11, 2025 · Updated 9 months ago
hamishivi / tess-2
Repository for "TESS-2: A Large-Scale, Generalist Diffusion Language Model"
☆58 · Feb 20, 2025 · Updated last year