chijames/KERPLE

☆20

Alternatives and similar repositories for KERPLE

Users that are interested in KERPLE are comparing it to the libraries listed below.

McGill-NLP / length-generalization
Code for the paper "The Impact of Positional Encoding on Length Generalization in Transformers", NeurIPS 2023
☆139 · Apr 30, 2024 · Updated 2 years ago
kyegomez / LM-Infinite
Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
☆40 · Nov 11, 2024 · Updated last year
dominiksinsaarland / document-level-FEVER
☆13 · May 30, 2022 · Updated 4 years ago
YujieLu10 / Seeker
☆11 · May 24, 2024 · Updated 2 years ago
VITA-Group / TAPE
[ICML'25] "Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding" by Jiajun Zhu, Peihao Wang, Ruisi…
☆15 · Jun 6, 2025 · Updated last year
Liuning-He / Brazil-Conference-Survival-Guide
A practical bilingual guide to staying safe and prepared at conferences in Brazil / 巴西参会实用攻略与自救指南
☆15 · Apr 22, 2026 · Updated 2 months ago
lsj2408 / URPE
[NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation)
☆35 · Aug 6, 2023 · Updated 2 years ago
Pervasive-AI-Lab / LuckyMera
☆16 · Oct 4, 2024 · Updated last year
maximzubkov / fft-scan
Efficient PScan implementation in PyTorch
☆17 · Jan 2, 2024 · Updated 2 years ago
ofirpress / attention_with_linear_biases
Code for the ALiBi method for transformer language models (ICLR 2022)
☆558 · Oct 30, 2023 · Updated 2 years ago
AI4fun / DQ-LoRe
☆13 · Jun 26, 2024 · Updated 2 years ago
bigai-nlco / CREAM
[NeurIPS 2024] An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding
☆22 · Oct 10, 2024 · Updated last year
konstantinosKokos / ape
🧮 Algebraic Positional Encodings.
☆21 · Jun 5, 2026 · Updated last month
hpcgroup / loki
Algorithms for approximate attention in LLMs
☆22 · Apr 14, 2025 · Updated last year
HanseulJo / position-coupling
Position Coupling: Improving Length Generalization of Arithmetic Transformers Using Task Structure (NeurIPS 2024) + Arithmetic Transfor…
☆14 · Oct 26, 2025 · Updated 8 months ago
sunyt32 / torchscale
Transformers at any scale
☆42 · Jan 18, 2024 · Updated 2 years ago
simonepri / fever-transformers
📄 Evidence Retrieval and Claim Verification for the FEVER shared task using Transformer Networks
☆12 · Feb 21, 2020 · Updated 6 years ago
WooooDyy / Self-Polish
Codes for the EMNLP 2023 Findings paper "Self-Polish: Enhance Reasoning in Large Language Models via Problem Refining" by Zhiheng Xi, Sen…
☆33 · May 30, 2023 · Updated 3 years ago
metacarbon / shareAtt
Beyond KV Caching: Shared Attention for Efficient LLMs
☆20 · Jul 19, 2024 · Updated 2 years ago
ZZZhr-1 / Robust_GUI_Grounding
On the Robustness of GUI Grounding Models Against Image Attacks
☆12 · Apr 8, 2025 · Updated last year
BryceZhuo / HybridNorm
The official implementation of HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization
☆19 · Mar 7, 2025 · Updated last year
EleutherAI / rnngineering
Engineering the state of RNN language models (Mamba, RWKV, etc.)
☆33 · May 25, 2024 · Updated 2 years ago
kyegomez / MultiQueryAttention
This is a simple torch implementation of the high performance Multi-Query Attention
☆16 · Aug 23, 2023 · Updated 2 years ago
sjelassi / transformers_ssm_copy
☆40 · Feb 26, 2024 · Updated 2 years ago
BryceZhuo / PolyCom
The official implementation of ICLR 2025 paper "Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models".
☆18 · Apr 25, 2025 · Updated last year
farnooshar / EigenClusterVIS
☆13 · Dec 2, 2024 · Updated last year
mjalali / renyi-kernel-entropy
[NeurIPS 2023] Code base for the Renyi Kernel Entropy (RKE) metric for generative models.
☆14 · Jun 18, 2025 · Updated last year
morganmcg1 / rotobart
Pre-training BART in Flax on The Pile dataset
☆22 · Jul 24, 2021 · Updated 4 years ago
chuanyang-Zheng / DAPE
This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"
☆41 · Oct 11, 2024 · Updated last year
kyegomez / MGQA
The open source implementation of the multi grouped query attention by the paper "GQA: Training Generalized Multi-Query Transformer Model…
☆16 · Dec 11, 2023 · Updated 2 years ago
assafbk / OPRM
Overflow Prevention Enhances Long-Context Recurrent LLMs (COLM 2025)
☆18 · Jul 8, 2025 · Updated last year
citiususc / DepPattern
Dependency syntactic parser and formal grammar for Natural Languages
☆12 · Apr 29, 2024 · Updated 2 years ago
UCDvision / PatchSearch
Code for the CVPR '23 paper, "Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning"
☆10 · Jun 9, 2023 · Updated 3 years ago
ant-8 / persona-menu
Persona 5 Game Menu for Web
☆18 · Jul 14, 2023 · Updated 3 years ago
ybisk / CCG-Induction
Unsupervised Grammar Induction with Combinatory Categorial Grammars
☆10 · Jan 28, 2021 · Updated 5 years ago
CaptainSame / Singular-Value-Decomposition-and-CUR-matrix-approximation
Implementation of data dimensionality reduction algorithms SVD and CUR without using library functions.
☆10 · Jul 24, 2017 · Updated 8 years ago
shawntan / stickbreaking-attention
Stick-breaking attention
☆63 · Jul 1, 2025 · Updated last year
qiancheng0 / EscapeBench
This is the repository for the paper "EscapeBench: Pushing Language Models to Think Outside the Box"
☆18 · Dec 19, 2024 · Updated last year
n3slami / Memento_Filter
The first range filter to simultaneously offer dynamicity, fast operations, and a robust false positive rate for any workload.
☆13 · Jul 15, 2025 · Updated last year