[NAACL 2025] Official Implementation of "HMT: Hierarchical Memory Transformer for Long Context Language Processing"
☆80Mar 12, 2026Updated 2 months ago
Alternatives and similar repositories for HMT-pytorch
Users that are interested in HMT-pytorch are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Open-source AI acceleration on FPGA: from ONNX to RTL☆54May 8, 2026Updated last week
- The code repository for the CURLoRA research paper. Stable LLM continual fine-tuning and catastrophic forgetting mitigation.☆53Aug 28, 2024Updated last year
- ICLR 2023: Learning to Extrapolate: A Transductive Approach☆11Aug 15, 2023Updated 2 years ago
- HGRN2: Gated Linear RNNs with State Expansion☆57Aug 20, 2024Updated last year
- Hrrformer: A Neuro-symbolic Self-attention Model (ICML23)☆63Oct 8, 2025Updated 7 months ago
- Serverless GPU API endpoints on Runpod - Get Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- Linear Attention Sequence Parallelism (LASP)☆88Jun 4, 2024Updated last year
- A repository for research on medium sized language models.☆78May 23, 2024Updated last year
- [EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization☆39Sep 24, 2024Updated last year
- [EMNLP'24] LongHeads: Multi-Head Attention is Secretly a Long Context Processor☆31Apr 8, 2024Updated 2 years ago
- ☆84Nov 10, 2025Updated 6 months ago
- Differentiable Clustering with Perturbed Random Forests, NeurIPS2023☆13Oct 16, 2023Updated 2 years ago
- Official implementation of Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs (ICLR 2024).☆45Aug 6, 2024Updated last year
- Gradient-based Hyperparameter Optimization Over Long Horizons☆14Sep 29, 2021Updated 4 years ago
- Hybrid Deep Sequential Modeling for Social Text-Driven Stock Prediction-Dataset☆22Aug 19, 2018Updated 7 years ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- Official Code Repository for the paper "Key-value memory in the brain"☆31Feb 25, 2025Updated last year
- Code for the paper "FinRLlama: A Solution to LLM-Engineered Signals Challenge at FinRL Contest 2024"☆13Feb 14, 2025Updated last year
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"☆27Apr 17, 2024Updated 2 years ago
- [AAAI'26] Steering One-Step Diffusion Model with Fidelity-Rich Decoder for Fast Image Compression☆19Dec 21, 2025Updated 4 months ago
- ☆17Aug 1, 2025Updated 9 months ago
- Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024)☆25Jun 6, 2024Updated last year
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity".☆30Nov 12, 2024Updated last year
- Code for ICML 2024 paper☆36Sep 18, 2025Updated 8 months ago
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…☆56Feb 28, 2023Updated 3 years ago
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- ☆16Dec 9, 2023Updated 2 years ago
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"☆19Mar 10, 2025Updated last year
- Adaptation of titans-pytorch to llama models on HF☆25Mar 6, 2025Updated last year
- Engineering the state of RNN language models (Mamba, RWKV, etc.)☆32May 25, 2024Updated last year
- FlexAttention w/ FlashAttention3 Support☆27Oct 5, 2024Updated last year
- Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24]☆62Dec 10, 2024Updated last year
- ☆42Mar 28, 2024Updated 2 years ago
- [EMNLP 2024] Quantize LLM to extremely low-bit, and finetune the quantized LLMs☆15Jul 18, 2024Updated last year
- Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance…☆156Apr 7, 2025Updated last year
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- SSR: Spatial Sequential Hybrid Architecture for Latency Throughput Tradeoff in Transformer Acceleration (Full Paper Accepted in FPGA'24)☆35Mar 12, 2026Updated 2 months ago
- sigma-MoE layer☆21Jan 5, 2024Updated 2 years ago
- Large-scale medical image processing and reconstruction toolbox☆18Feb 13, 2024Updated 2 years ago
- Official repo of dataset-decomposition paper [NeurIPS 2024]☆21Jan 8, 2025Updated last year
- [ICML'24 Oral] APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference☆47Jun 4, 2024Updated last year
- This repository contains the code for the paper: SirLLM: Streaming Infinite Retentive LLM☆60May 28, 2024Updated last year
- ☆23Oct 22, 2025Updated 6 months ago