soedinglab / kClust
kClust is a fast and sensitive method for clustering protein sequences. It can cluster large protein databases down to 20-30% sequence identity. kClust generates a clustering in which each cluster is represented by its longest sequence (the representative sequence).
☆17Updated 6 years ago
Alternatives and similar repositories for kClust:
Users who are interested in kClust are comparing it to the libraries listed below
- Protein structure alignment and search algorithm☆51Updated this week
- Cython bindings and Python interface to FAMSA, an algorithm for ultra-scale multiple sequence alignments.☆30Updated this week
- Python framework for doing ancestral sequence reconstruction☆37Updated 7 months ago
- The 3DFI pipeline predicts the 3D structure of proteins and searches for structural homology in the 3D space.☆19Updated 11 months ago
- Automatic oligonucleotide design for PCR-based gene synthesis☆39Updated 5 years ago
- Template-based RNA secondary structure visualization☆25Updated 2 months ago
- Centroid RNA package☆19Updated 4 years ago
- Transmembrane proteins predicted through Language Model embeddings☆33Updated 3 weeks ago
- Protein Sequence Annotation with Language Models☆19Updated 3 months ago
- Conservation analysis of homologous proteins with Python☆10Updated 3 years ago
- Software for predicting translation initiation rates in bacteria☆20Updated 2 months ago
- DeepSig - Predictor of signal peptides in proteins based on deep learning☆26Updated last year
- Code for LazyAF pipeline☆21Updated last year
- Fast protein domain structure embedding+search tool☆12Updated last month
- Calculates pairwise sequence identity, similarity and normalized similarity score of proteins in a multiple sequence alignment.☆14Updated last year
- Clustering the NCBI nr database with MMseqs2 (90% length, 90% identity). Inspired by NCBI's experimental ClusteredNR database.☆23Updated last year
- RNA structure probing and post-transcriptional modifications mapping high-throughput data analysis☆36Updated this week
- Visualise RNA secondary structure in consistent, reproducible and recognisable layouts☆68Updated this week
- MSA (Multiple Sequence Alignment) visualization Python package for sequence analysis☆115Updated 2 months ago
- UniProt Id Mapping through API☆31Updated 4 months ago
- CLANS_2 is a Python-based program for clustering sequences in the 2D or 3D space, based on their sequence similarities. CLANS visualizes …☆18Updated 2 months ago
- Discovery of conserved gene clusters in multiple genomes☆59Updated last week
- A Python framework for microbial natural products data mining by integrating genomics and metabolomics data☆18Updated last week
- Evolutionary conservation estimation of residues or nucleotides☆34Updated 2 years ago
- Protein structure comparison tools such as SSAP and SNAP☆63Updated last year
- A PCR primer tool for DNA assembly flows☆30Updated 9 months ago
- Python implementation of the Codon Adaptation Index☆36Updated 2 years ago
- A quick and easy way to download the genomes/predicted proteins of taxa available in JGI's Genome Portal.☆32Updated 5 months ago