Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind
☆179Sep 12, 2024Updated last year
Alternatives and similar repositories for CALM-pytorch
Users that are interested in CALM-pytorch are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction"☆59Oct 22, 2023Updated 2 years ago
- ☆56Aug 6, 2025Updated 7 months ago
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" from Pytorch and Zeta☆13Nov 11, 2024Updated last year
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google Deepmind☆74Sep 16, 2024Updated last year
- Explorations into the recently proposed Taylor Series Linear Attention☆100Aug 18, 2024Updated last year
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- Implementation of Discrete Key / Value Bottleneck, in Pytorch☆88Jul 9, 2023Updated 2 years ago
- Exploration into the Scaling Value Iteration Networks paper, from Schmidhuber's group☆37Sep 23, 2024Updated last year
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount…☆53Oct 22, 2023Updated 2 years ago
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…☆56Feb 28, 2023Updated 3 years ago
- ☆16Dec 9, 2023Updated 2 years ago
- Implementation of the proposed Adam-atan2 from Google Deepmind in Pytorch☆135Oct 15, 2025Updated 5 months ago
- [ACL 2024] Progressive LLaMA with Block Expansion.☆514May 20, 2024Updated last year
- Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant …☆15Mar 11, 2024Updated 2 years ago
- Implementation of a multimodal diffusion transformer in Pytorch☆107Jun 24, 2024Updated last year
- End-to-end encrypted cloud storage - Proton Drive • AdSpecial offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
- Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single machine microbatches, in Pytorch☆25Jan 21, 2025Updated last year
- Implementation of fused cosine similarity attention in the same style as Flash Attention☆220Feb 13, 2023Updated 3 years ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]☆149Oct 27, 2024Updated last year
- [ICLR2025] Codebase for "ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing", built on Megatron-LM.☆109Dec 20, 2024Updated last year
- Implementation of ST-Moe, the latest incarnation of MoE after years of research at Brain, in Pytorch☆379Jun 17, 2024Updated last year
- An efficent implementation of the method proposed in "The Era of 1-bit LLMs"☆155Oct 15, 2024Updated last year
- Code for paper Rethinking the Data Annotation Process for Multi-view 3D Pose Estimation with Active Learning and Self-Training☆22Apr 18, 2023Updated 2 years ago
- Implementation of Insertion-deletion Denoising Diffusion Probabilistic Models☆30May 31, 2022Updated 3 years ago
- Implementation of a holodeck, written in Pytorch☆18Nov 1, 2023Updated 2 years ago
- DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- The official implementation of Self-Play Fine-Tuning (SPIN)☆1,234May 8, 2024Updated last year
- [ICML-2025] We introduce Lie group Relative position Encodings (LieRE) that goes beyond RoPE in supporting n-dimensional inputs.☆14Aug 8, 2025Updated 7 months ago
- Implementation of Block Recurrent Transformer - Pytorch☆224Aug 20, 2024Updated last year
- [EMNLP 2023] Official implementation of the algorithm ETSC: Exact Toeplitz-to-SSM Conversion our EMNLP 2023 paper - Accelerating Toeplitz…☆14Oct 17, 2023Updated 2 years ago
- Implementation of the conditionally routed attention in the CoLT5 architecture, in Pytorch☆231Sep 6, 2024Updated last year
- Official implementation for "Mixture of In-Context Experts Enhance LLMs’ Awareness of Long Contexts" (Accepted by Neurips2024)☆13Jan 7, 2025Updated last year
- ☆12Jul 16, 2024Updated last year
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation☆90Oct 11, 2024Updated last year
- ☆22Dec 11, 2024Updated last year
- Open source password manager - Proton Pass • AdSecurely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
- Yet another frontend for LLM, written using .NET and WinUI 3☆10Sep 14, 2025Updated 6 months ago
- Experimental scripts for researching data adaptive learning rate scheduling.☆22Oct 18, 2023Updated 2 years ago
- Implementation of the proposed LVMAE, from the paper, Extending Video Masked Autoencoders to 128 frames, in Pytorch☆55Nov 25, 2024Updated last year
- ☆77Apr 29, 2024Updated last year
- ☆12Apr 17, 2024Updated last year
- JAX Scalify: end-to-end scaled arithmetics☆18Oct 30, 2024Updated last year
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff"☆250Jun 6, 2025Updated 9 months ago