soedinglab / kClust
kClust is a fast and sensitive method for clustering protein sequences. It can cluster large protein databases down to 20-30% sequence identity. Each cluster in the resulting clustering is represented by its longest sequence (the representative sequence).
☆18Updated 6 years ago
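To illustrate what "each cluster is represented by its longest sequence" means in practice, here is a minimal Python sketch. It is not kClust's algorithm or interface: the function names `identity` and `greedy_cluster`, the position-wise identity measure (no alignment), and the `min_identity` threshold are all assumptions made for illustration. Sequences are visited longest first, so whichever sequence founds a cluster is also the longest one in it.

```python
def identity(a: str, b: str) -> float:
    """Toy identity: fraction of matching positions over the shorter length.

    Hypothetical stand-in for a real alignment-based identity; no gaps are handled.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


def greedy_cluster(seqs: dict[str, str], min_identity: float = 0.3) -> dict[str, list[str]]:
    """Map each representative ID to the IDs in its cluster (representative included)."""
    clusters: dict[str, list[str]] = {}
    # Visit sequences longest first, so every representative is the
    # longest sequence of the cluster it founds.
    for sid in sorted(seqs, key=lambda s: len(seqs[s]), reverse=True):
        for rep in clusters:
            if identity(seqs[rep], seqs[sid]) >= min_identity:
                clusters[rep].append(sid)  # join the first matching cluster
                break
        else:
            clusters[sid] = [sid]  # no match: this sequence founds a new cluster
    return clusters


if __name__ == "__main__":
    example = {"a": "MKTAYIAKQR", "b": "MKTAYI", "c": "GGGGGGGGGGGG"}
    print(greedy_cluster(example, min_identity=0.9))
```

In this toy input, `b` is a prefix of `a`, so it joins the cluster founded by the longer `a`, while `c` shares no positions with either and stays alone.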
Alternatives and similar repositories for kClust
Users who are interested in kClust are comparing it to the repositories listed below
- ☆14Updated 8 years ago
- Protein structure alignment and search algorithm☆62Updated last week
- Software for predicting translation initiation rates in bacteria☆26Updated 7 months ago
- Protein Sequence Annotation with Language Models☆22Updated last month
- Cython bindings and Python interface to FAMSA, an algorithm for ultra-scale multiple sequence alignments.☆32Updated 2 weeks ago
- Python Implementation of Codon Adaption Index☆37Updated 2 years ago
- ☆11Updated 6 months ago
- DeepSig - Predictor of signal peptides in proteins based on deep learning☆26Updated 2 years ago
- A quick and easy way to download the genomes/predicted proteins of taxa available in JGI's Genome Portal.☆37Updated 3 weeks ago
- Fast protein domain structure embedding+search tool☆21Updated last month
- Calculates pairwise sequence identity, similarity and normalized similarity score of proteins in a multiple sequence alignment.☆17Updated last year
- A domain parser for Alphafold models☆38Updated last year
- Python framework for doing ancestral sequence reconstruction☆38Updated 11 months ago
- Evolutionary conservation estimation of residues or nucleotides☆44Updated 3 years ago
- Conservation analysis of homologous proteins with Python☆12Updated 3 years ago
- Discovery of conserved gene clusters in multiple genomes☆81Updated last month
- Automatic oligonucleotide design for PCR-based gene synthesis☆46Updated 5 years ago
- UniProt Id Mapping through API☆34Updated 8 months ago
- Detection of remote homology by comparison of protein language model representations☆54Updated 6 months ago
- Template-based RNA secondary structure visualization☆26Updated 7 months ago
- The 3DFI pipeline predicts the 3D structure of proteins and searches for structural homology in the 3D space.☆19Updated last year
- Code for LazyAF pipeline☆20Updated last year
- Clustering the NCBI nr database with MMseqs2 (90% length, 90% identity). Inspired by the NCBI's experimental ClusteredNR database.☆23Updated 2 years ago
- Universal and efficient core gene phylogeny with Foldseek and ProstT5☆68Updated last week
- Visualise RNA secondary structure in consistent, reproducible and recognisable layouts☆73Updated 2 weeks ago
- Transmembrane proteins predicted through Language Model embeddings☆37Updated last month
- Deep learning embedding for nucleotide sequences☆18Updated 3 months ago
- A machine learning model for the prediction of optimal growth temperature of microorganisms and enzyme catalytic optima☆57Updated 4 years ago
- Untargeted metabolomics workflow for large-scale data processing and analysis implemented in Snakemake☆26Updated 5 months ago
- CLANS_2 is a Python-based program for clustering sequences in the 2D or 3D space, based on their sequence similarities. CLANS visualizes …☆19Updated 6 months ago