kyegomez / VortexFusion
Transformers + Mambas + LSTMs All in One Model
☆9 Updated 3 weeks ago
Alternatives and similar repositories for VortexFusion
Users who are interested in VortexFusion are comparing it to the libraries listed below.
- ☆18 Updated 8 months ago
- The open source implementation of "Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers" ☆19 Updated last year
- A regression-alike loss to improve numerical reasoning in language models ☆20 Updated this week
- This is a simple torch implementation of the high performance Multi-Query Attention ☆16 Updated last year
- ☆18 Updated 5 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆46 Updated 2 months ago
- ☆43 Updated 5 months ago
- We study toy models of skill learning. ☆29 Updated 6 months ago
- [ICLR 2025] ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning ☆63 Updated 4 months ago
- The open source community's implementation of the all-new Multi-Modal Causal Attention from "DeepSpeed-VisualChat: Multi-Round Multi-Imag… ☆11 Updated last year
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆86 Updated 9 months ago
- A testbed for agents and environments that can automatically improve models through data generation. ☆25 Updated 4 months ago
- Pytorch Implementation of the paper: "Learning to (Learn at Test Time): RNNs with Expressive Hidden States" ☆25 Updated 3 weeks ago
- ☆36 Updated last month
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆35 Updated last year
- On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability ☆39 Updated 2 weeks ago
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025) ☆28 Updated 5 months ago
- [Under Review] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with enla… ☆60 Updated 9 months ago
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding" ☆21 Updated 3 months ago
- ☆47 Updated 5 months ago
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers ☆19 Updated 4 months ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite. ☆33 Updated last year
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large … ☆85 Updated 6 months ago
- Parameter-Efficient Fine-Tuning for Foundation Models ☆75 Updated 3 months ago
- OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation, arXiv 2024 ☆60 Updated 4 months ago
- Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning ☆30 Updated last month
- Unofficial Implementation of Selective Attention Transformer ☆17 Updated 8 months ago
- Tree prompting: easy-to-use scikit-learn interface for improved prompting. ☆38 Updated last year
- ☆46 Updated 2 months ago
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆116 Updated last year