soedinglab / kClust
kClust is a fast and sensitive method for clustering protein sequences. It can cluster large protein databases down to 20-30% sequence identity. kClust produces a clustering in which each cluster is represented by its longest sequence (the representative sequence).
☆18 Updated 6 years ago
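The idea of "each cluster is represented by its longest sequence" in the description above can be illustrated with a minimal sketch. This is not kClust's implementation: the function names (`kmers`, `kmer_similarity`, `greedy_cluster`), the shared k-mer score, and the threshold value are all hypothetical stand-ins chosen for illustration. The sketch only shows the general pattern of processing sequences from longest to shortest, so that the first member of every cluster, its representative, is also its longest sequence.

```python
def kmers(seq: str, k: int = 3) -> set:
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_similarity(a: str, b: str, k: int = 3) -> float:
    """Crude similarity: shared k-mers divided by the smaller k-mer set.

    Assumption: this is a toy stand-in, not kClust's scoring and not a
    true sequence identity computed from an alignment.
    """
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / min(len(ka), len(kb))


def greedy_cluster(seqs: dict, threshold: float = 0.5, k: int = 3) -> dict:
    """Greedy clustering with the longest sequence as representative.

    Sequences are visited from longest to shortest. Each one joins the
    first existing cluster whose representative is similar enough;
    otherwise it starts a new cluster and becomes its representative.
    Returns a mapping: representative name -> list of member names.
    """
    # Sort by decreasing length; break ties by name for determinism.
    order = sorted(seqs, key=lambda name: (-len(seqs[name]), name))
    clusters = {}
    for name in order:
        for rep in clusters:
            if kmer_similarity(seqs[name], seqs[rep], k) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No representative was similar enough: open a new cluster.
            clusters[name] = [name]
    return clusters


if __name__ == "__main__":
    example = {
        "a": "MKTAYIAKQRQISFVKSHFSRQ",
        "b": "MKTAYIAKQRQISFVK",
        "c": "GGGGSGGGGSGGGGS",
    }
    for rep, members in greedy_cluster(example).items():
        print(rep, members)
```

Because sequences are visited in order of decreasing length, a representative is never shorter than any sequence later added to its cluster, which is the property the description states.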
Alternatives and similar repositories for kClust:
Users who are interested in kClust are comparing it to the repositories listed below.
- ☆14 Updated 8 years ago
- Protein structure alignment and search algorithm ☆59 Updated last month
- ☆11 Updated 4 months ago
- Conservation analysis of homologous proteins with Python ☆10 Updated 3 years ago
- Discovery of conserved gene clusters in multiple genomes ☆79 Updated last month
- Python framework for doing ancestral sequence reconstruction ☆38 Updated 9 months ago
- Protein Sequence Annotation with Language Models ☆20 Updated 6 months ago
- Visualise RNA secondary structure in consistent, reproducible and recognisable layouts ☆72 Updated this week
- Python Implementation of Codon Adaptation Index ☆37 Updated 2 years ago
- A python framework for microbial natural products data mining by integrating genomics and metabolomics data ☆18 Updated 3 weeks ago
- A quick and easy way to download the genomes/predicted proteins of taxa available in JGI's Genome Portal. ☆35 Updated 8 months ago
- Software for predicting translation initiation rates in bacteria ☆22 Updated 5 months ago
- Clustering the NCBI nr database with MMseqs2 (90% length, 90% identity). Inspired by the NCBI's experimental ClusteredNR database. ☆23 Updated last year
- The 3DFI pipeline predicts the 3D structure of proteins and searches for structural homology in the 3D space. ☆19 Updated last year
- Calculates pairwise sequence identity, similarity and normalized similarity score of proteins in a multiple sequence alignment. ☆14 Updated last year
- Transmembrane proteins predicted through Language Model embeddings ☆35 Updated 3 months ago
- Cython bindings and Python interface to FAMSA, an algorithm for ultra-scale multiple sequence alignments. ☆31 Updated 2 months ago
- scripts for predicting natural product activity from biosynthetic gene cluster sequences ☆23 Updated last week
- UniProt Id Mapping through API ☆31 Updated 7 months ago
- A flexible and modular software suite for domain-based gene neighborhood and protein search, extraction, and clustering. ☆20 Updated last month
- DeepSig - Predictor of signal peptides in proteins based on deep learning ☆26 Updated 2 years ago
- Nanopore UMI-linked consensus sequencing ☆15 Updated 4 years ago
- ☆24 Updated last year
- Automatic oligonucleotide design for PCR-based gene synthesis ☆44 Updated 5 years ago
- DeepECtransformer ☆24 Updated last year
- A domain parser for Alphafold models ☆35 Updated last year
- Fast protein domain structure embedding+search tool ☆18 Updated last month
- ☆17 Updated 4 years ago
- SMBGC Annotation using Neural Networks Trained on Interpro Signatures ☆27 Updated 3 weeks ago
- Code for LazyAF pipeline ☆20 Updated last year