☆29 · May 4, 2024 · Updated 2 years ago
Alternatives and similar repositories for nope_head_scale
Users interested in nope_head_scale are comparing it to the libraries listed below.
- [ACL 24 Findings] Implementation of Resonance RoPE and the PosGen synthetic dataset. ☆24 · Mar 5, 2024 · Updated 2 years ago
- Some preliminary explorations of Mamba's context scaling. ☆219 · Feb 8, 2024 · Updated 2 years ago
- ☆62 · Jun 17, 2024 · Updated last year
- sigma-MoE layer ☆21 · Jan 5, 2024 · Updated 2 years ago
- Code for "Discovering Non-monotonic Autoregressive Orderings with Variational Inference" (paper and code updated from ICLR 2021) ☆12 · Mar 7, 2024 · Updated 2 years ago
- [EMNLP 2022] Official implementation of Transnormer in our EMNLP 2022 paper - The Devil in Linear Transformer ☆64 · Jul 30, 2023 · Updated 2 years ago
- Universal data IO and neural network modules in NLP tasks. ☆18 · Apr 13, 2026 · Updated 3 weeks ago
- ☆15 · Dec 5, 2019 · Updated 6 years ago
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)" ☆250 · Sep 12, 2025 · Updated 7 months ago
- Official PyTorch Implementation of the Longhorn Deep State Space Model ☆57 · Dec 4, 2024 · Updated last year
- ☆26 · Feb 26, 2026 · Updated 2 months ago
- ☆16 · Mar 22, 2023 · Updated 3 years ago
- DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails ☆33 · Feb 26, 2025 · Updated last year
- [EMNLP 2023] Official implementation of the algorithm ETSC: Exact Toeplitz-to-SSM Conversion our EMNLP 2023 paper - Accelerating Toeplitz… ☆14 · Oct 17, 2023 · Updated 2 years ago
- Extending context length of visual language models ☆12 · Dec 18, 2024 · Updated last year
- ☆16 · Dec 9, 2023 · Updated 2 years ago
- ECNU NLP group learns CS224n in the form of seminars in the 2017 summer. ☆10 · Aug 12, 2017 · Updated 8 years ago
- Code for Scaling Laws of RoPE-based Extrapolation ☆73 · Oct 16, 2023 · Updated 2 years ago
- HGRN2: Gated Linear RNNs with State Expansion ☆57 · Aug 20, 2024 · Updated last year
- ☆16 · Mar 13, 2023 · Updated 3 years ago
- ☆13 · Oct 14, 2024 · Updated last year
- ☆107 · Mar 9, 2024 · Updated 2 years ago
- LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models ☆78 · Oct 16, 2024 · Updated last year
- Implementation of GateLoop Transformer in Pytorch and Jax