☆260 · Jun 6, 2025 · Updated 10 months ago
Alternatives and similar repositories for meliad
Users that are interested in meliad are comparing it to the libraries listed below.
- Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memories using approximate … ☆643 · Jul 17, 2023 · Updated 2 years ago
- Implementation of Block Recurrent Transformer - Pytorch ☆224 · Aug 20, 2024 · Updated last year
- The official Languini Kitchen repository ☆14 · May 6, 2024 · Updated last year
- ☆53 · Jan 19, 2023 · Updated 3 years ago
- Source-to-Source Debuggable Derivatives in Pure Python ☆15 · Jan 23, 2024 · Updated 2 years ago
- Sequence modeling with Mega. ☆303 · Jan 28, 2023 · Updated 3 years ago
- Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit ☆63 · Jun 21, 2023 · Updated 2 years ago
- ☆13 · Aug 23, 2024 · Updated last year
- An implementation of local windowed attention for language modeling ☆498 · Jul 16, 2025 · Updated 9 months ago
- FlexAttention w/ FlashAttention3 Support ☆27 · Oct 5, 2024 · Updated last year
- playing with gpt4 ☆14 · Mar 17, 2023 · Updated 3 years ago
- Convenient Text-to-Text Training for Transformers ☆19 · Dec 10, 2021 · Updated 4 years ago
- ☆23 · Oct 15, 2022 · Updated 3 years ago
- HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation [ACL 2023] ☆14 · Jul 11, 2023 · Updated 2 years ago
- Convolutions for Sequence Modeling ☆911 · Jun 13, 2024 · Updated last year
- Understand and test language model architectures on synthetic tasks. ☆265 · Mar 22, 2026 · Updated 3 weeks ago
- Sequence Modeling with Structured State Spaces ☆67 · Aug 2, 2022 · Updated 3 years ago
- Large Context Attention