☆20Oct 25, 2022Updated 3 years ago
Alternatives and similar repositories for KERPLE
Users that are interested in KERPLE are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"☆40Nov 11, 2024Updated last year
- Official implementation of the transformer (TF) architecture suggested in a paper entitled "Looped Transformers as Programmable Computers…☆39Apr 8, 2023Updated 3 years ago
- Official implementation of ECCV24 paper: POA☆24Aug 8, 2024Updated last year
- [ICML'25] "Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding" by Jiajun Zhu, Peihao Wang, Ruisi…☆15Jun 6, 2025Updated last year
- ☆11May 24, 2024Updated 2 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- ☆47Dec 12, 2023Updated 2 years ago
- ☆13May 30, 2022Updated 4 years ago
- [NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation)☆35Aug 6, 2023Updated 2 years ago
- ☆16Oct 4, 2024Updated last year
- The is the official implementation of "Lyra: Orchestrating Dual Correction in Automated Theorem Proving"☆15Jul 2, 2024Updated last year
- Efficient PScan implementation in PyTorch☆17Jan 2, 2024Updated 2 years ago
- Code for the ALiBi method for transformer language models (ICLR 2022)☆556Oct 30, 2023Updated 2 years ago
- Position Coupling: Improving Length Generalization of Arithmetic Transformers Using Task Structure (NeurIPS 2024) + Arithmetic Transfor…☆14Oct 26, 2025Updated 8 months ago
- Algorithms for approximate attention in LLMs☆22Apr 14, 2025Updated last year
- Deploy open-source AI quickly and easily - Special Bonus Offer • AdRunpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
- ☆13Jun 26, 2024Updated 2 years ago
- [NeurIPS 2024] | An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding☆22Oct 10, 2024Updated last year
- 🧮 Algebraic Positional Encodings.☆21Jun 5, 2026Updated 3 weeks ago
- Code repository of AI-Endo☆16Jan 16, 2024Updated 2 years ago
- ☆17Oct 31, 2023Updated 2 years ago
- Codes for the EMNLP 2023 Findings paper "Self-Polish: Enhance Reasoning in Large Language Models via Problem Refining" by Zhiheng Xi, Sen…☆33May 30, 2023Updated 3 years ago
- 📄 Evidence Retrieval and Claim Verification for the FEVER shared task using Transformer Networks☆12Feb 21, 2020Updated 6 years ago
- Beyond KV Caching: Shared Attention for Efficient LLMs☆20Jul 19, 2024Updated last year
- This is a simple torch implementation of the high performance Multi-Query Attention