kuleshov-group / caduceus
Bi-Directional Equivariant Long-Range DNA Sequence Modeling
☆176 Updated 2 months ago
Alternatives and similar repositories for caduceus:
Users who are interested in caduceus are comparing it to the repositories listed below:
- Discovering Interpretable Features in Protein Language Models via Sparse Autoencoders ☆159 Updated last month
- Benchmarking DNA Language Models on Biologically Meaningful Tasks ☆108 Updated 4 months ago
- Repository for StripedHyena, a state-of-the-art beyond Transformer architecture ☆353 Updated last year
- Orthrus is a mature RNA model for RNA property prediction. It uses a mamba encoder backbone, a variant of state-space models specifical… ☆55 Updated last month
- [NeurIPS 2024] BEACON: Benchmark for Comprehensive RNA Tasks and Language Models ☆32 Updated 7 months ago
- My own attempt at a long context genomics model, leveraging recent advances in long context attention modeling (Flash Attention + other h… ☆52 Updated last year
- ☆20 Updated 3 weeks ago
- AI-Driven Digital Organism (AIDO) is a system of multiscale foundation models for predicting, simulating and programming biology at all l… ☆67 Updated 2 months ago
- Official implementation for HyenaDNA, a long-range genomic foundation model built with Hyena ☆646 Updated 8 months ago
- Primary RNA sequence model ☆35 Updated 9 months ago
- A repository with exploration into using transformers to predict DNA ↔ transcription factor binding ☆84 Updated 2 years ago
- Simple and Effective Masked Diffusion Language Model ☆327 Updated last week
- A Protein Large Language Model for Multi-Task Protein Language Processing ☆166 Updated 2 weeks ago
- Implementation and replication of ProGen, Language Modeling for Protein Generation, in Jax ☆112 Updated 3 years ago
- ☆39 Updated last year
- GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics ☆124 Updated 6 months ago
- A collection of awesome bio-foundation models, including protein, RNA, DNA, gene, single-cell, and so on. ☆202 Updated 2 weeks ago
- Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers" ☆124 Updated last month
- Benchmarks for classification of genomic sequences ☆134 Updated last year
- Repository for mRNA Paper and CodonBERT publication. ☆124 Updated 9 months ago
- Official Implementation of DPLM (ICML'24) - Diffusion Language Models Are Versatile Protein Learners ☆125 Updated last week
- Official repository for the paper "Tranception: Protein Fitness Prediction with Autoregressive Transformers and Inference-time Retrieval" ☆148 Updated last year
- ☆251 Updated 11 months ago
- ProtMamba: a homology-aware but alignment-free protein state space model ☆56 Updated 4 months ago
- (Unofficial) Implementation of dilated attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens" (https://arxiv.org/abs/2307… ☆50 Updated last year
- [ICML-23 ORAL] ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts ☆93 Updated last year
- [NeurIPS 2023] Official codes of "MuSe-GNN: Learning Unified Gene Representation From Multimodal Biological Graph Data" ☆28 Updated 8 months ago
- Cell2Sentence: Teaching Large Language Models the Language of Biology ☆46 Updated 3 months ago