baaihealth / opi
This repo hosts the Open Protein Instructions (OPI) project, which aims to build and release a high-quality, comprehensive protein instruction dataset. With this dataset, LLMs can be adapted to protein-related tasks via instruction tuning and then evaluated on those tasks.
☆3 · Updated last month
Alternatives and similar repositories for opi:
Users who are interested in opi are comparing it to the repositories listed below.
- A comprehensive repository dedicated to the collection and exploration of studies utilizing Large Language Models for molecular design, p… ☆40 · Updated last year
- A Text-guided Protein Design Framework, Nat Mach Intell 2025 ☆51 · Updated last week
- BioT5 (EMNLP 2023) and BioT5+ (ACL 2024 Findings) ☆101 · Updated 4 months ago
- Exploring Evolution-aware & free protein language models as protein function predictors ☆61 · Updated 3 months ago
- [ACL 2024] ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training ☆39 · Updated 10 months ago
- MSAGPT ☆25 · Updated last month
- [ICML-23 ORAL] ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts ☆90 · Updated last year
- Diffusion model for protein sequence generation ☆47 · Updated last year
- ☆33 · Updated 10 months ago
- [ICLR 2024] Domain-Agnostic Molecular Generation with Chemical Feedback ☆144 · Updated last month
- ☆49 · Updated 7 months ago
- The first large protein language model trained to follow structure instructions. ☆73 · Updated 7 months ago
- Code for "Unifying Molecular and Textual Representations via Multi-task Language Modelling" @ ICML 2023 ☆36 · Updated 4 months ago
- Code for 'On Pre-trained Language Models For Antibody' ☆32 · Updated last year
- Protein-Nucleic Acid Complex Modeling with Frame Averaging Transformer, NeurIPS 2024 ☆22 · Updated 2 months ago
- Code for ProSST: A Pre-trained Protein Sequence and Structure Transformer with Disentangled Attention. ☆72 · Updated last week
- PEER Benchmark, which appeared in the NeurIPS 2022 Datasets and Benchmarks Track (https://arxiv.org/abs/2206.02096) ☆84 · Updated last year
- Must-read papers on NLP for science. ☆58 · Updated last year
- PyTorch code for KDD 2023 paper "Pre-training Antibody Language Models for Antigen-Specific Computational Antibody Design" ☆50 · Updated last year
- ☆39 · Updated 2 years ago
- [ICLR 2022] OntoProtein: Protein Pretraining With Gene Ontology Embedding ☆146 · Updated last year
- Implementation for ICML 2024 paper "MolCRAFT: Structure-Based Drug Design in Continuous Parameter Space" ☆82 · Updated 3 months ago
- Bib'23: Improved the Heterodimer Protein Complex Prediction with Protein Language Models ☆14 · Updated last year
- ☆62 · Updated 2 years ago
- [RECOMB 2023] Official implementation of "Pisces: A combo-wise contrastive learning approach to synergistic drug combination prediction". ☆14 · Updated last year
- Source code of PETA: Evaluating the Impact of Protein Transfer Learning with Sub-word Tokenization on Downstream Applications. ☆31 · Updated last week
- ESM-GearNet for Protein Structure Representation Learning (https://arxiv.org/abs/2303.06275) ☆81 · Updated last year
- ☆11 · Updated 3 years ago
- A Biological Foundation Model Bridging the Gap between Molecular Sequences Through Central Dogma ☆21 · Updated 2 months ago