jwzhanggy / tinyBIG
tinybig for deep function learning
☆60 · Updated 4 months ago
Alternatives and similar repositories for tinyBIG:
Users who are interested in tinyBIG are comparing it to the libraries listed below.
- ☆144 · Updated 7 months ago
- ☆56 · Updated 2 months ago
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models" ☆52 · Updated 2 weeks ago
- ☆161 · Updated this week
- Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models ☆278 · Updated last month
- Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficien… ☆97 · Updated 2 weeks ago
- My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing o… ☆43 · Updated 4 months ago
- Parameter-Efficient Fine-Tuning for Foundation Models ☆55 · Updated 3 weeks ago
- State Space Models ☆68 · Updated 11 months ago
- ☆128 · Updated 11 months ago
- ☆85 · Updated 6 months ago
- A repository for DenseSSMs ☆87 · Updated last year
- ☆189 · Updated last year
- DeepSeek Native Sparse Attention PyTorch implementation ☆61 · Updated last month
- The official repository for the Scientific Paper Idea Proposer (SciPIP) ☆63 · Updated last month
- Multi-Agent System for Science of Science ☆72 · Updated 3 weeks ago
- 🕹️ The toy examples of Kolmogorov-Arnold Network (Get Started Quickly) ☆75 · Updated 11 months ago
- ☆46 · Updated 6 months ago
- A PyTorch implementation of Fourier Analysis Networks (FAN) ☆34 · Updated 6 months ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆43 · Updated 5 months ago
- Awesome list of papers that extend Mamba to various applications. ☆132 · Updated 2 weeks ago
- Decomposing and Editing Predictions by Modeling Model Computation ☆138 · Updated 10 months ago
- ☆228 · Updated last month
- MNIST example using Kolmogorov-Arnold Networks ☆27 · Updated 11 months ago
- Implementation of MoE Mamba from the paper: "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in PyTorch and Ze… ☆102 · Updated 2 weeks ago
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model" ☆166 · Updated 2 weeks ago
- Inference Speed Benchmark for Learning to (Learn at Test Time): RNNs with Expressive Hidden States ☆66 · Updated 9 months ago
- [AAAI 2025] Official Implementation of "Auto-Regressive Moving Diffusion Models for Time Series Forecasting" ☆69 · Updated 2 months ago
- A generalized framework for subspace tuning methods in parameter efficient fine-tuning. ☆138 · Updated 2 months ago
- [NeurIPS 2024] Physics-Informed Regularization for Domain-Agnostic Dynamical System Modeling ☆22 · Updated 6 months ago