kyegomez / VortexFusion
Transformers + Mambas + LSTMs All in One Model
☆11 Updated last week
Alternatives and similar repositories for VortexFusion
Users who are interested in VortexFusion compare it to the libraries listed below.
- A simple PyTorch implementation of high-performance Multi-Query Attention ☆16 Updated 2 years ago
- ☆18 Updated 11 months ago
- A regression-like loss to improve numerical reasoning in language models - ICML 2025 ☆25 Updated last month
- ☆48 Updated 7 months ago
- We study toy models of skill learning. ☆31 Updated 8 months ago
- PyTorch implementation of the paper: "Learning to (Learn at Test Time): RNNs with Expressive Hidden States" ☆25 Updated last week
- On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability ☆40 Updated 2 months ago
- [Technical Report] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with … ☆61 Updated 11 months ago
- ☆21 Updated 4 months ago
- Implementation of MoE Mamba from the paper: "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in PyTorch and Ze… ☆110 Updated 2 weeks ago
- Self-contained PyTorch implementation of a Sinkhorn-based router, for mixture of experts or otherwise ☆37 Updated last year
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆36 Updated last year
- ☆18 Updated last month
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆48 Updated 4 months ago
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from DeepMind ☆56 Updated 3 months ago
- Lottery Ticket Adaptation ☆39 Updated 10 months ago
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆106 Updated last week
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts" ☆16 Updated 6 months ago
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding" ☆23 Updated 2 weeks ago
- A repository for DenseSSMs ☆88 Updated last year
- Implementation of Infini-Transformer in PyTorch ☆111 Updated 8 months ago
- Multimodal Graph Learning: how to encode multiple multimodal neighbors with their relations into LLMs ☆65 Updated last year
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆119 Updated last year
- The open source implementation of "Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers" ☆19 Updated last year
- Implementation of MambaFormer in PyTorch ++ Zeta from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learnin… ☆20 Updated this week
- [ICLR 2025] ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning https://arxiv.org/abs/2501.06590 ☆68 Updated last month
- [AAAI 2024] SciEval: A Multi-Level Large Language Model Evaluation Benchmark for Scientific Research ☆29 Updated last year
- ☆50 Updated 4 months ago
- Implementation of ViTAR: Vision Transformer with Any Resolution in PyTorch ☆38 Updated 10 months ago
- Control LLM ☆19 Updated 5 months ago