ddidacus / llama-titansLinks

Adaptation of titans-pytorch to llama models on HF

☆16

Alternatives and similar repositories for llama-titans

Users that are interested in llama-titans are comparing it to the libraries listed below

Sorting:

ml-research / self-expanding-neural-networks
Self-Expanding Neural Networks
☆39Updated last year
meulemansalex / theoretical_framework_for_target_propagation
Python implementation of the methods in Meulemans et al. 2020 - A Theoretical Framework For Target Propagation
☆32Updated 7 months ago
Cranial-XIX / longhorn
Official PyTorch Implementation of the Longhorn Deep State Space Model
☆50Updated 6 months ago
machine-discovery / deer
Parallelizing non-linear sequential models over the sequence length
☆51Updated 4 months ago
goombalab / phi-mamba
Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode…
☆108Updated 8 months ago
hyperevolnet / Terminator
The official repository for HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction.
☆36Updated 2 months ago
shikaiqiu / compute-better-spent
☆53Updated 8 months ago
OpenNLPLab / HGRN2
HGRN2: Gated Linear RNNs with State Expansion
☆54Updated 9 months ago
Benjamin-Walker / selective-ssms-and-linear-cdes
Code for "Theoretical Foundations of Deep Selective State-Space Models" (NeurIPS 2024)
☆12Updated 5 months ago
Leiay / looped_transformer
☆26Updated last year
ExplainableML / fomo_in_flux
Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24]
☆56Updated 5 months ago
ejmichaud / grokking-squared
☆26Updated 2 years ago
GFNOrg / GFlowNet-EM
Code for GFlowNet-EM, a novel algorithm for fitting latent variable models with compositional latents and an intractable true posterior.
☆40Updated last year
andyjm3 / SLTrain
SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining (NeurIPS 2024)
☆31Updated 7 months ago
Weixin-Liang / Mixture-of-Mamba
☆43Updated 4 months ago
facebookresearch / MemoryMosaics
Memory Mosaics are networks of associative memories working in concert to achieve a prediction task.
☆44Updated 4 months ago
Gothos / LRU-pytorch
Unofficial implementation of Linear Recurrent Units, by Deepmind, in Pytorch
☆69Updated last month
FarnoushRJ / MambaLRP
[NeurIPS 2024] Official implementation of the paper "MambaLRP: Explaining Selective State Space Sequence Models".
☆38Updated 7 months ago
fjzzq2002 / pizza
Code repository for "The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks"
☆16Updated last year
convergence-ai / lm2
Official repo of paper LM2
☆40Updated 3 months ago
cjyaras / deep-lora-transformers
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation (ICML'24 Oral)
☆14Updated 10 months ago
AmeenAli / HiddenMambaAttn
Official PyTorch Implementation of "The Hidden Attention of Mamba Models"
☆222Updated last year
TrentBrick / SDMContinualLearner
☆17Updated 2 years ago
qu-gg / torch-hypernetwork-tutorials
Hypernetwork training considerations and implementation types in PyTorch. Includes classification and time-series examples alongside 1D G…
☆17Updated 2 years ago
ThomasYerxa / mmcr
☆20Updated 8 months ago
probabilistic-inference-scaling / probabilistic-inference-scaling
☆50Updated 2 months ago
AvivNavon / DWSNets
Official implementation for Equivariant Architectures for Learning in Deep Weight Spaces [ICML 2023]
☆89Updated last year
YefanZhou / TempBalance
[NeurIPS 2023 Spotlight] Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training
☆35Updated last month
ryoungj / BoLT
Code for "Reasoning to Learn from Latent Thoughts"
☆104Updated 2 months ago
alxndrTL / othello_mamba
Evaluating the Mamba architecture on the Othello game
☆47Updated last year