togethercomputer / stripedhyena
Repository for StripedHyena, a state-of-the-art beyond Transformer architecture
☆331 Updated 10 months ago
Alternatives and similar repositories for stripedhyena:
Users who are interested in stripedhyena are comparing it to the repositories listed below.
- Official implementation for HyenaDNA, a long-range genomic foundation model built with Hyena ☆629 Updated 7 months ago
- Bi-Directional Equivariant Long-Range DNA Sequence Modeling ☆171 Updated 2 weeks ago
- Discovering Interpretable Features in Protein Language Models via Sparse Autoencoders ☆142 Updated 2 months ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆102 Updated last month
- 🧬 Generative modeling of regulatory DNA sequences with diffusion probabilistic models 💨 ☆375 Updated this week
- My own attempt at a long context genomics model, leveraging recent advances in long context attention modeling (Flash Attention + other h… ☆52 Updated last year
- (Unofficial) Implementation of dilated attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens" (https://arxiv.org/abs/2307… ☆50 Updated last year
- Implementation of the Llama architecture with RLHF + Q-learning ☆157 Updated last year
- Implementation of Enformer, DeepMind's attention network for predicting gene expression, in PyTorch ☆452 Updated 3 months ago
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff" ☆219 Updated last month
- ☆60 Updated last year
- Biological foundation modeling from molecular to genome scale ☆1,256 Updated last month
- 🧬 Nucleotide Transformer: Building and Evaluating Robust Foundation Models for Human Genomics ☆553 Updated 3 months ago
- ☆180 Updated this week
- Understand and test language model architectures on synthetic tasks. ☆177 Updated 2 weeks ago
- Benchmarking DNA Language Models on Biologically Meaningful Tasks ☆102 Updated 2 months ago
- Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture" ☆545 Updated last month
- Benchmarks for classification of genomic sequences ☆126 Updated 11 months ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆291 Updated last month
- Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in PyTorch ☆634 Updated last month
- Orthrus is a mature RNA model for RNA property prediction. It uses a Mamba encoder backbone, a variant of state-space models specifical… ☆52 Updated last week
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆205 Updated last month
- Repository for code used in the xVal paper ☆128 Updated 9 months ago
- Simplified Masked Diffusion Language Model ☆262 Updated 2 months ago
- Gymnasium framework for training language model agents on constructive tasks ☆127 Updated this week
- An easy, reliable, fluid template for Python packages complete with docs, testing suites, READMEs, GitHub workflows, linting and much muc… ☆156 Updated this week
- Beyond Language Models: Byte Models are Digital World Simulators ☆318 Updated 7 months ago
- Muon optimizer for neural networks: >30% extra sample efficiency, <3% wallclock overhead ☆220 Updated 3 weeks ago
- GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics ☆123 Updated 5 months ago
- ChatCell: Facilitating Single-Cell Analysis with Natural Language ☆48 Updated 11 months ago