victorskl / genomic-bigdata-spark
Genomic BigData Warehousing with Apache Spark and LakeHouse Architecture
☆11 · Updated last year
Related projects:
- Accelerated genomics workflows in the Workflow Description Language ☆27 · Updated 5 months ago
- Genome-wide association studies identify genetic variations associated with a target disease or trait. Researchers and clinicians can use… ☆11 · Updated 5 months ago
- Very large scale k-mer counting and analysis on Apache Spark. ☆17 · Updated 7 months ago
- Secondary analysis pipelines parallelized with Apache Spark ☆15 · Updated 2 years ago
- Tool for finding matches to degenerate sequence motifs in FASTA files. ☆12 · Updated 6 months ago
- Implementation of LSTM for detecting regions of Neanderthal introgression in modern human genomes ☆9 · Updated 4 years ago
- Deploy a Snakemake pipeline directly from version control (under development) ☆19 · Updated 4 months ago
- Viral Identification and Discovery - A viral characterization pipeline built in Nextflow. ☆11 · Updated 4 years ago
- DeepNovo workflow of neoantigen discovery by personalized de novo sequencing. ☆10 · Updated 3 years ago
- Map your disease and phenotype terms to the Open Targets platform ontology ☆20 · Updated last month
- Pandas ExtensionDtypes for dealing with genomics data ☆47 · Updated last year
- ☆13 · Updated 3 years ago
- CNN-based classifier for detecting viral sequences among metagenomic contigs ☆30 · Updated 4 years ago
- Repository for development of the genomic module of the CDM. ☆19 · Updated 5 years ago
- Unified repository for the GA4GH Beacon v2 API standard ☆23 · Updated last week
- This repository contains all the source files required to run DeLUCS, a deep learning clustering algorithm for DNA sequences. ☆24 · Updated 2 years ago
- Applied Statistics for High-Throughput Biology ☆16 · Updated last month
- Linter rules for Nextflow DSL scripts ☆28 · Updated last week
- A very simple BLAST filtering pipeline ☆18 · Updated 10 years ago
- Namespace encoding hierarchical relationships between proteins, protein families, and protein complexes. ☆12 · Updated 3 years ago
- Listing of GPU-based bioinformatics software, sites, and publications ☆10 · Updated 2 years ago
- Intel lab's open-source data science framework for accelerating digital biology ☆36 · Updated 2 weeks ago
- Deep learning library for biological sequences. Extension of fastai and PyTorch. ☆40 · Updated last month
- ☆32 · Updated this week
- WDL tools for parsing, type-checking, and more ☆23 · Updated last month
- An option to spin up cost-effective EMR clusters in AWS with Hail and Jupyter Notebook installed ☆16 · Updated 4 years ago
- A provenance library for bioinformatics workflows 🧬 🔀 📝 ☆13 · Updated 2 years ago
- A repository for the GenGraph toolkit for the creation and manipulation of graph genomes ☆51 · Updated 3 years ago
- Multifactorial modeling of response to checkpoint inhibitor immunotherapy from tumor, immune, and clinical features ☆14 · Updated 6 years ago
- Parallel Genomic Analysis Toolkit ☆14 · Updated 5 years ago