baaihealth / opi
This repo is for the Open Protein Instructions (OPI) project, which aims to build and release a high-quality, comprehensive protein instruction dataset. With this dataset, LLMs can be adapted to protein-related tasks via instruction tuning and evaluated on those tasks.
☆4 Updated 2 weeks ago
Alternatives and similar repositories for opi
Users who are interested in opi are comparing it to the repositories listed below:
- [ACL 2024] ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training ☆45 Updated last year
- BioT5 (EMNLP 2023) and BioT5+ (ACL 2024 Findings) ☆111 Updated 7 months ago
- Protein-Nucleic Acid Complex Modeling with Frame Averaging Transformer, NeurIPS2024 ☆26 Updated 5 months ago
- A comprehensive repository dedicated to the collection and exploration of studies utilizing Large Language Models for molecular design, p… ☆39 Updated last year
- A Text-guided Protein Design Framework, Nat Mach Intell 2025 (https://www.nature.com/articles/s42256-025-01011-z) ☆74 Updated 3 months ago
- ☆50 Updated 10 months ago
- The first large protein language model trained follows structure instructions. ☆78 Updated 10 months ago
- [RECOMB 2023] Official implementation of "Pisces: A combo-wise contrastive learning approach to synergistic drug combination prediction". ☆14 Updated last year
- Must-read papers on NLP for science. ☆57 Updated last year
- Exploring Evolution-aware & free protein language models as protein function predictors ☆64 Updated 6 months ago
- Retrieved Sequence Augmentation for Protein Representation Learning ☆51 Updated last year
- [ICLR 2022] OntoProtein: Protein Pretraining With Gene Ontology Embedding ☆146 Updated last month
- ☆38 Updated last year
- PEER Benchmark, appear at NeurIPS 2022 Dataset and Benchmark Track (https://arxiv.org/abs/2206.02096) ☆89 Updated 2 years ago
- Code for 'On Pre-trained Language Models For Antibody' ☆33 Updated 2 years ago
- SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction (Briefings in Bioinformatics 2023) ☆52 Updated 10 months ago
- Code for "Unifying Molecular and Textual Representations via Multi-task Language Modelling" @ ICML 2023 ☆37 Updated 7 months ago
- ☆29 Updated 6 months ago
- Code and Data for the paper: Multi-level Protein Structure Pre-training with Prompt Learning [ICLR 2023] ☆34 Updated last year
- Associated Repository for "Translation between Molecules and Natural Language" ☆174 Updated last year
- MSAGPT ☆30 Updated 4 months ago
- PyTorch code for KDD 2023 paper "Pre-training Antibody Language Models for Antigen-Specific Computational Antibody Design" ☆50 Updated last year
- Code for the paper Enhancing Activity Prediction Models in Drug Discovery with the Ability to Understand Human Language ☆96 Updated 7 months ago
- ☆43 Updated 11 months ago
- A Biological Foundation Model Bridging the Gap between Molecular Sequences Through Central Dogma ☆27 Updated last month
- [ICML-23 ORAL] ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts ☆95 Updated last year
- Bioinformatics'2022 PerceiverCPI: A nested cross-attention network for compound-protein interaction prediction ☆35 Updated last year
- [ICML 2024] Interaction-based Retrieval-augmented Diffusion Models for Protein-specific 3D Molecule Generation ☆23 Updated 7 months ago
- Source code for "A Deep-learning System Bridging Molecule Structure and Biomedical Text with Comprehension Comparable to Human Profession… ☆87 Updated last year
- 🎈 Structure-aware adapter fine-tuning PLMs, with high training speed and impressive performance (Journal of Chemical Information and Mod… ☆21 Updated 3 months ago