☆50Jan 28, 2025Updated last year
Alternatives and similar repositories for Mixture-of-Mamba
Users that are interested in Mixture-of-Mamba are comparing it to the libraries listed below.
- Code for "Adaptive Self-improvement LLM Agentic System for ML Library Development" (ICML 2025)☆15Jan 6, 2026Updated 2 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains☆50Feb 4, 2026Updated last month
- KV cache compression via sparse coding☆17Oct 26, 2025Updated 5 months ago
- A PyTorch Deep Learning Kit☆12Apr 30, 2023Updated 2 years ago
- Extending context length of visual language models☆12Dec 18, 2024Updated last year
- [ICLR'25] ApolloMoE: Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts☆52Nov 20, 2024Updated last year
- Fast and memory-efficient exact attention☆19Mar 9, 2026Updated 3 weeks ago
- GoldFinch and other hybrid transformer components☆45Jul 20, 2024Updated last year
- Histomic Prognostic Signature (HiPS): A population-level computational histologic signature for invasive breast cancer prognosis☆32Apr 9, 2024Updated last year
- Differential equation neural operator☆22Sep 4, 2023Updated 2 years ago
- Dynamic config system based on python classes☆12Jan 27, 2023Updated 3 years ago
- ☆11Feb 22, 2024Updated 2 years ago
- Simple repository for training small reasoning models☆49Feb 17, 2026Updated last month
- Boosting Multi-view Stereo with Late Cost Aggregation☆13Jan 24, 2024Updated 2 years ago
- A collection of resources and information for concrete skills that are helpful when pursuing a PhD in computer science (specifically in M…☆24Apr 18, 2023Updated 2 years ago
- UCLA CS 188 (Winter 2023) course project.☆12Mar 31, 2023Updated 3 years ago
- Official implementation of Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More☆25Feb 25, 2025Updated last year
- [CVPR 2025] 2DMamba: Efficient State Space Model for Image Representation☆80Jan 29, 2026Updated 2 months ago
- Decoding of the speech envelope from EEG using the VLAAI deep neural network☆15Sep 28, 2022Updated 3 years ago
- Enhancing Multi-Agent System Coordination in Autonomous Electric Vehicles Using Large Language Models☆20Dec 13, 2023Updated 2 years ago
- This is the official repository for "Multi-modal Gated Mixture of Local-to-Global Experts for Dynamic Image Fusion" (ICCV 2023).☆72Apr 30, 2024Updated last year
- The official implementation of ICLR 2025 paper "Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models".☆18Apr 25, 2025Updated 11 months ago
- An open-source implementation for fine-tuning DINOv2 by Meta.☆14Jul 21, 2025Updated 8 months ago
- XmodelLM☆38Nov 19, 2024Updated last year
- A hackable, simple, and research-friendly GRPO Training Framework with high-speed weight synchronization in a multinode environment.☆37Aug 27, 2025Updated 7 months ago
- Principled learning method for Wasserstein distributionally robust optimization with local perturbations (ICML 2020)☆21Mar 24, 2023Updated 3 years ago
- Dream-VL and Dream-VLA, a diffusion VLM and a diffusion VLA.☆113Jan 14, 2026Updated 2 months ago
- ☆46Mar 31, 2025Updated 11 months ago
- Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening☆70May 18, 2025Updated 10 months ago
- ☆15Mar 20, 2025Updated last year
- Official Implementation of FastKV: Decoupling of Context Reduction and KV Cache Compression for Prefill-Decoding Acceleration☆30Nov 22, 2025Updated 4 months ago
- Mamba R1 represents a novel architecture that combines the efficiency of Mamba's state space models with the scalability of Mixture of Ex…☆25Oct 13, 2025Updated 5 months ago
- ☆60May 13, 2025Updated 10 months ago
- Video Diffusion State Space Models☆19Mar 27, 2024Updated 2 years ago
- [CVPR 2025] CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning☆40Apr 21, 2025Updated 11 months ago
- Dynamic Early Exit for Image Captioning☆17Oct 25, 2022Updated 3 years ago
- Efficient Computation and Analysis of Distributional Shapley Values (AISTATS 2021)☆22Oct 19, 2023Updated 2 years ago
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning"☆86Mar 21, 2024Updated 2 years ago
- Code and Dataset release of "Carpe Diem: On the Evaluation of World Knowledge in Lifelong Language Models" (NAACL 2024)☆10Oct 16, 2024Updated last year