☆50Jan 28, 2025Updated last year
Alternatives and similar repositories for Mixture-of-Mamba
Users that are interested in Mixture-of-Mamba are comparing it to the libraries listed below.
- Code for "Adaptive Self-improvement LLM Agentic System for ML Library Development" (ICML 2025)☆15Jan 6, 2026Updated 3 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains☆49Feb 4, 2026Updated 2 months ago
- KV cache compression via sparse coding☆17Oct 26, 2025Updated 6 months ago
- [NeurIPS 2025🔥:] EVODiff is an inference-time refinement method for diffusion models that improves sampling efficiency and generative f…☆31Feb 2, 2026Updated 2 months ago
- ☆19Nov 4, 2025Updated 5 months ago
- Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism"☆15Apr 30, 2025Updated last year
- Histomic Prognostic Signature (HiPS): A population-level computational histologic signature for invasive breast cancer prognosis☆33Apr 9, 2024Updated 2 years ago
- Dynamic config system based on python classes☆12Jan 27, 2023Updated 3 years ago
- Simple repository for training small reasoning models☆50Feb 17, 2026Updated 2 months ago
- A collection of resources and information for concrete skills that are helpful when pursuing a PhD in computer science (specifically in M…☆23Apr 18, 2023Updated 3 years ago
- #UAI2020 Codes for PAC-Bayesian Contrastive Unsupervised Representation Learning☆14May 23, 2022Updated 3 years ago
- Official implementation of Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More☆25Feb 25, 2025Updated last year
- Enhancing Multi-Agent System Coordination in Autonomous Electric Vehicles Using Large Language Models☆20Dec 13, 2023Updated 2 years ago
- This is the official repository for "Multi-modal Gated Mixture of Local-to-Global Experts for Dynamic Image Fusion" (ICCV 2023).☆73Apr 30, 2024Updated last year
- The official implementation of ICLR 2025 paper "Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models".☆18Apr 25, 2025Updated last year
- ☆21Apr 30, 2023Updated 3 years ago
- XmodelLM☆38Nov 19, 2024Updated last year
- A hackable, simple, and research-friendly GRPO Training Framework with high speed weight synchronization in a multinode environment.☆37Aug 27, 2025Updated 8 months ago
- Principled learning method for Wasserstein distributionally robust optimization with local perturbations (ICML 2020)☆21Mar 24, 2023Updated 3 years ago
- Efficient Finetuning for OpenAI GPT-OSS☆24Oct 2, 2025Updated 6 months ago
- Dream-VL and Dream-VLA, a diffusion VLM and a diffusion VLA.☆113Jan 14, 2026Updated 3 months ago
- Segment Anything with Webcam in Real-Time with FastSAM☆10Nov 19, 2023Updated 2 years ago
- ☆15Mar 20, 2025Updated last year
- Code for the paper "Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers" [ICCV 2025]☆101Jul 28, 2025Updated 9 months ago
- An LLM client for use from the command line or IDE.☆15Mar 5, 2026Updated last month
- [ACL Findings 2026] Official Implementation of "FastKV: Decoupling of Context Reduction and KV Cache Compression for Prefill-Decoding Acc…☆31Apr 14, 2026Updated 2 weeks ago
- Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening☆73May 18, 2025Updated 11 months ago
- So, I trained a Llama a 130M architecture I coded from ground up to build a small instruct model from scratch. Trained on FineWeb dataset…☆17Mar 26, 2025Updated last year
- [CVPR 2025] CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning☆42Apr 21, 2025Updated last year
- PyTorch implementation of Retentive Network: A Successor to Transformer for Large Language Models☆14Jul 20, 2023Updated 2 years ago
- ☆60May 13, 2025Updated 11 months ago
- Efficient Computation and Analysis of Distributional Shapley Values (AISTATS 2021)☆22Oct 19, 2023Updated 2 years ago
- ☆14Dec 12, 2024Updated last year
- ☆16Feb 23, 2025Updated last year
- Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels.☆89Mar 27, 2026Updated last month
- [ECCV 2024] Official PyTorch implementation of LUT "Learning with Unmasked Tokens Drives Stronger Vision Learners"☆13Dec 1, 2024Updated last year
- ☆13Oct 30, 2023Updated 2 years ago
- ☆10Apr 8, 2018Updated 8 years ago
- Combining Graph Neural Network and Mamba to Capture Local and Global Tissue Spatial Relationships in Whole Slide Images☆36Jun 3, 2025Updated 10 months ago