kyegomez / VortexFusion
Transformers + Mambas + LSTMs All in One Model
☆14 Updated last week
Alternatives and similar repositories for VortexFusion
Users who are interested in VortexFusion are comparing it to the libraries listed below.
- Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Mo… ☆14 Updated last year
- Pytorch Implementation of the paper: "Learning to (Learn at Test Time): RNNs with Expressive Hidden States" ☆25 Updated last month
- ☆50 Updated 11 months ago
- A regression-alike loss to improve numerical reasoning in language models - ICML 2025 ☆27 Updated 5 months ago
- ☆18 Updated last year
- This is a simple torch implementation of the high performance Multi-Query Attention ☆16 Updated 2 years ago
- Self contained pytorch implementation of a sinkhorn based router, for mixture of experts or otherwise ☆40 Updated last year
- We study toy models of skill learning. ☆31 Updated last year
- Implementation of Infini-Transformer in Pytorch ☆112 Updated last year
- Implementation of MoE Mamba from the paper: "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in Pytorch and Ze… ☆119 Updated last week
- Multimodal Graph Learning: how to encode multiple multimodal neighbors with their relations into LLMs ☆67 Updated last year
- A repository for DenseSSMs ☆88 Updated last year
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆50 Updated 8 months ago
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from Deepmind ☆60 Updated 7 months ago
- The open source implementation of "Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers" ☆19 Updated last year
- [AAAI26] LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs ☆51 Updated last month
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models" ☆40 Updated last year
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆124 Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite. ☆35 Updated last year
- [Technical Report] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with … ☆63 Updated last year
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts ☆122 Updated last year
- ☆13 Updated 8 months ago
- Implementation of ViTAR: Vision Transformer with Any Resolution in PyTorch ☆38 Updated last year
- Official Pytorch Implementation of Self-emerging Token Labeling ☆35 Updated last year
- A repository for research on medium sized language models. ☆77 Updated last year
- [ICLR 2025] ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning https://arxiv.org/abs/2501.06590 ☆79 Updated 5 months ago
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models" ☆56 Updated 2 months ago
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model" ☆203 Updated 2 weeks ago
- ☆152 Updated last year
- [ICLR 2024] Unveiling the Pitfalls of Knowledge Editing for Large Language Models ☆22 Updated last year