Trying out the Mamba architecture on small examples (CIFAR-10, Shakespeare char-level, etc.)
☆47Dec 12, 2023Updated 2 years ago
Alternatives and similar repositories for mamba_small_bench
Users that are interested in mamba_small_bench are comparing it to the libraries listed below.
- ViT architecture with Mamba instead of transformer backbone☆17Dec 8, 2023Updated 2 years ago
- Understand what physics/algorithms do transformers learn internally when trained on planetary motion☆39Feb 9, 2026Updated last month
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025)☆32Apr 9, 2025Updated 11 months ago
- 3rd placed submission to the NeurIPS MineRL competition 2019☆10Mar 24, 2023Updated 3 years ago
- [RAL 2023] transformer + reinforcement learning for navigation + POMDP☆15Jul 19, 2023Updated 2 years ago
- ProToPortal: The Portal to the Magic of PromptingTools and Julia-first LLM Coding☆17Jul 14, 2024Updated last year
- A simpler Pytorch + Zeta Implementation of the paper: "SiMBA: Simplified Mamba-based Architecture for Vision and Multivariate Time series…☆29Nov 11, 2024Updated last year
- Train text generation model with JavaScript.☆15Jul 14, 2024Updated last year
- ☆107Mar 9, 2024Updated 2 years ago
- Griffin MQA + Hawk Linear RNN Hybrid☆89Apr 26, 2024Updated last year
- Implementation of papers in 101 lines of code.☆18Nov 12, 2023Updated 2 years ago
- Implementation of the PPO algorithm on MuJoCo environments such as Ant-v2, Humanoid-v2, Hopper-v2, and HalfCheetah-v2.☆57Jun 30, 2020Updated 5 years ago
- Flux reconstruction fluid flow solver for 1D PDEs written in Julia. Linear advection, Burgers, viscous Burgers, and Euler equations.☆13Apr 28, 2022Updated 3 years ago
- ☆11Jun 5, 2024Updated last year
- ☆28Jun 9, 2024Updated last year
- Rust derive macros for automating the boring stuff.☆14Aug 3, 2025Updated 7 months ago
- Code for the paper "Overconfidence is a Dangerous Thing: Mitigating Membership Inference Attacks by Enforcing Less Confident Prediction" …☆12Sep 6, 2023Updated 2 years ago
- A machine learning library capable of training various deep neural networks (RNNs, LSTMs, DBNs, etc.) on a GPU. It makes use of auto-di…☆10Aug 28, 2018Updated 7 years ago
- Source code for the paper "LongGenBench: Long-context Generation Benchmark"☆23Oct 8, 2024Updated last year
- Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor☆19Jun 5, 2023Updated 2 years ago
- Encryption and signing for a post quantum world☆17Apr 4, 2023Updated 2 years ago
- Bleeding edge low level Rust binding for GGML☆16Jun 26, 2024Updated last year
- Autosuggestions for function keywords☆20Updated this week
- Cross Atlas Remapping via Optimal Transport☆12Dec 14, 2023Updated 2 years ago
- Code for the paper: How Much Context Does My Attention-Based ASR System Need?☆11Mar 8, 2026Updated 2 weeks ago
- Developing, training, and assessing the performance of a Proximal Policy Optimization (PPO) Stock Trading Agent.☆14Aug 20, 2025Updated 7 months ago
- A MATLAB function library containing encoders, decoders and weight enumerators for Reed-Muller codes.☆11Aug 19, 2023Updated 2 years ago
- ☆16Jul 7, 2025Updated 8 months ago
- Code for paper 'Zero-Shot Scene Graph Generation via Triplet Calibration and Reduction' (TOMM 2023)☆10Sep 6, 2025Updated 6 months ago
- Almost SOTA LLM architecture, with O(n) time complexity☆11Jan 19, 2025Updated last year
- Evaluating methods for estimating aperiodic activity in electrophysiological data.☆16Sep 24, 2024Updated last year
- Audio-only Emotion Detection using Federated Learning☆10Dec 8, 2022Updated 3 years ago
- ☆13Mar 9, 2024Updated 2 years ago
- This repository contains source codes for SoftCTC. Original paper can be found here: https://arxiv.org/abs/2212.02135☆19Mar 7, 2023Updated 3 years ago
- ☆18Jul 7, 2020Updated 5 years ago
- [ICLR 2025] On Evaluating the Durability of Safeguards for Open-Weight LLMs☆13Jun 20, 2025Updated 9 months ago
- Repo for MCMC based Dynamic Topic Model☆16Sep 2, 2017Updated 8 years ago
- This repository is the implementation of the paper Training Free Pretrained Model Merging (CVPR2024).☆34Mar 5, 2024Updated 2 years ago
- Life before `main()`☆19Feb 2, 2021Updated 5 years ago