Multi-task and masked language model-based protein sequence embedding models.
☆106 · Jun 16, 2021 · Updated 4 years ago
Alternatives and similar repositories for prose
Users who are interested in prose are comparing it to the libraries listed below.
- Source code for "Learning protein sequence embeddings using information from structure" - ICLR 2019 · ☆262 · Jun 16, 2021 · Updated 4 years ago
- Get protein embeddings from protein sequences · ☆506 · Apr 28, 2023 · Updated 2 years ago
- Repository for publicly available deep learning models developed in Rosetta community · ☆123 · Sep 18, 2021 · Updated 4 years ago
- ProtTrans is providing state of the art pretrained language models for proteins. ProtTrans was trained on thousands of GPUs from Summit a… · ☆1,291 · May 22, 2025 · Updated 9 months ago
- Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different dom… · ☆732 · Dec 11, 2022 · Updated 3 years ago
- Embedding-based annotation transfer (EAT) uses Euclidean distance between vector representations (embeddings) of proteins to transfer ann… · ☆41 · Aug 29, 2025 · Updated 6 months ago
- A collection of tasks to probe the effectiveness of protein sequence representations in modeling aspects of protein design · ☆111 · Sep 30, 2024 · Updated last year
- DistilProtBert implementation, a distilled version of ProtBert model. · ☆16 · Sep 21, 2022 · Updated 3 years ago
- Homology reduced UniProt, train-/valid-/testsets for language modeling · ☆16 · Apr 20, 2022 · Updated 3 years ago
- pretrained LookingGlass language model for biological read-length DNA sequences, and related models derived from transfer learning · ☆15 · Feb 19, 2026 · Updated last week
- Evolutionary velocity with protein language models · ☆97 · Dec 9, 2025 · Updated 2 months ago
- Codebase for our preprint using trRosetta to design proteins with discontinuous functional sites, found here: https://www.biorxiv.org/con… · ☆16 · Oct 27, 2021 · Updated 4 years ago
- Unsupervised neural network for learning embeddings of GO terms. · ☆20 · Feb 19, 2022 · Updated 4 years ago
- Official code repository of "BERTology Meets Biology: Interpreting Attention in Protein Language Models." · ☆305 · May 1, 2025 · Updated 10 months ago
- Implementation of Protein Classification based on subcellular localization using ProtBert(Rostlab/prot_bert_bfd_localization) model from … · ☆42 · May 3, 2024 · Updated last year
- Prediction of binding residues for metal ions, nucleic acids, and small molecules. · ☆34 · Sep 2, 2025 · Updated 5 months ago
- ☆110 · Mar 7, 2022 · Updated 3 years ago
- Graph-based community clustering approach to extract protein domains from a predicted aligned error matrix · ☆35 · Jul 28, 2022 · Updated 3 years ago
- An all-atom protein structure dataset for machine learning. · ☆359 · Mar 16, 2024 · Updated last year
- ProtTrans is providing state of the art pretrained language models for proteins. ProtTrans was trained on thousands of GPUs from Summit a… · ☆11 · Jun 2, 2022 · Updated 3 years ago
- My work on building a deep neural network for fast and accurate protein protein interaction prediction · ☆11 · Mar 13, 2024 · Updated last year
- repDNA is a Python package to generate various features of DNA sequences incorporating physicochemical properties and sequence-order effe… · ☆13 · Jul 16, 2022 · Updated 3 years ago
- Interpretation by Deep Generative Masking for Biological Sequences · ☆37 · Dec 9, 2021 · Updated 4 years ago
- ☆255 · Jul 31, 2024 · Updated last year
- Primary RNA sequence model · ☆42 · May 20, 2024 · Updated last year
- Code for reproducing results of "Unsupervised embeddings is all you need for protein function prediction" · ☆41 · Jul 6, 2023 · Updated 2 years ago
- Official repository for the paper "Tranception: Protein Fitness Prediction with Autoregressive Transformers and Inference-time Retrieval" · ☆162 · Aug 24, 2023 · Updated 2 years ago
- ☆30 · May 15, 2025 · Updated 9 months ago
- open source repository · ☆146 · Nov 30, 2023 · Updated 2 years ago
- Language modeling of viral evolution · ☆149 · Mar 24, 2023 · Updated 2 years ago
- Official Pytorch implementation of PLUS (Protein sequence representations Learned Using Structural information), IEEE Access 2021 · ☆39 · Sep 5, 2023 · Updated 2 years ago
- Contrastive fitness learning: Reprogramming protein language models for low-N learning of protein fitness landscape · ☆35 · Dec 8, 2025 · Updated 2 months ago
- PaccMann models for protein language modeling · ☆43 · Nov 10, 2021 · Updated 4 years ago
- RITA is a family of autoregressive protein models, developed by LightOn in collaboration with the OATML group at Oxford and the Debora Ma… · ☆98 · Jan 24, 2023 · Updated 3 years ago
- A compilation of deep learning methods for protein design · ☆97 · Nov 5, 2022 · Updated 3 years ago
- ☆17 · Dec 10, 2022 · Updated 3 years ago
- Deep learning based alignment-free method for protein family modeling and prediction · ☆16 · Jul 31, 2018 · Updated 7 years ago
- High-throughput framework for running molecular simulations for METL · ☆23 · Sep 18, 2025 · Updated 5 months ago
- iFeature is a comprehensive Python-based toolkit for generating various numerical feature representation schemes from protein or peptide … · ☆195 · May 23, 2022 · Updated 3 years ago