jwzhanggy / tinyBIG
tinybig for deep function learning
☆60 Updated 3 months ago
Alternatives and similar repositories for tinyBIG:
Users that are interested in tinyBIG are comparing it to the libraries listed below
- ☆142 Updated 6 months ago
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models" ☆51 Updated last month
- 🕹️ The toy examples of Kolmogorov-Arnold Network (Get Started Quickly) ☆75 Updated 10 months ago
- State Space Models ☆66 Updated 10 months ago
- [AAAI 2025] Official Implementation of "Auto-Regressive Moving Diffusion Models for Time Series Forecasting" ☆60 Updated last month
- The official implementation for ICLR23 spotlight paper "DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion" ☆295 Updated this week
- Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficien… ☆87 Updated last month
- ☆43 Updated 5 months ago
- Awesome list of papers that extend Mamba to various applications. ☆132 Updated 2 months ago
- Multi-Agent System for Science of Science ☆69 Updated 3 weeks ago
- A repository for DenseSSMs ☆87 Updated 11 months ago
- ☆209 Updated last week
- Benchmark for efficiency in memory and time of different KAN implementations. ☆119 Updated 6 months ago
- A pytorch implementation of Fourier Analysis Networks (FAN) ☆31 Updated 5 months ago
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model" ☆159 Updated last month
- My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing o… ☆43 Updated 3 months ago
- ☆161 Updated this week
- Decomposing and Editing Predictions by Modeling Model Computation ☆138 Updated 9 months ago
- Simba ☆202 Updated 11 months ago
- ☆188 Updated last year
- ☆127 Updated 10 months ago
- This is the official code repository for "Graph Neural Networks are Inherently Good Generalizers: Insights by Bridging GNNs and MLPs", wh… ☆87 Updated 9 months ago
- The official repository for the Scientific Paper Idea Proposer (SciPIP) ☆61 Updated 2 weeks ago
- ☆41 Updated 2 weeks ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆41 Updated 3 months ago
- An open source community implementation of the model from "DIFFERENTIAL TRANSFORMER" paper by Microsoft. ☆23 Updated last month
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling ☆188 Updated last month
- Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models ☆265 Updated 2 weeks ago
- MNIST example using Kolmogorov-Arnold Networks ☆27 Updated 10 months ago
- PyTorch implementation of the Differential-Transformer architecture for sequence modeling, specifically tailored as a decoder-only model … ☆55 Updated 4 months ago