victorskl / genomic-bigdata-spark
Genomic BigData Warehousing with Apache Spark and LakeHouse Architecture
☆11Updated 2 years ago
Alternatives and similar repositories for genomic-bigdata-spark:
Users who are interested in genomic-bigdata-spark are comparing it to the repositories listed below.
- VCF Observer is a VCF file analysis, comparison, and visualization tool.☆15Updated 2 weeks ago
- Tool for finding matches to degenerate sequence motifs in FASTA files.☆12Updated 10 months ago
- Scanomatic☆10Updated last year
- Accelerated genomics workflows in the Workflow Description Language☆30Updated 9 months ago
- Pipeline for the identification of (coding) gene structures in draft genomes.☆25Updated 8 months ago
- A provenance library for bioinformatics workflows 🧬 🔀 📝☆14Updated 3 years ago
- Galaxy on AWS Guidance provides all the infrastructure components required to run Galaxy in the cloud and are preconfigured with industry…☆15Updated 3 months ago
- Forensic analysis tool useful in backwards computing information from next-generation sequencing data.☆11Updated this week
- Intel Labs' open-source data science framework for accelerating digital biology☆42Updated this week
- Functional enrichment terms aggregator☆18Updated 9 months ago
- Namespace encoding hierarchical relationships between proteins, protein families, and protein complexes.☆12Updated 3 years ago
- Viral Identification and Discovery - A viral characterization pipeline built in Nextflow.☆11Updated 4 years ago
- Job Manager API and UI for interacting with asynchronous batch jobs and workflows.☆26Updated last week
- Experimental plugin to integrate GPT-like prompts into Nextflow☆15Updated 9 months ago
- 3D Genome Browser☆31Updated 2 years ago
- An option to spin up cost-effective EMR clusters in AWS with Hail and Jupyter Notebook installed☆16Updated 4 years ago
- jinja2-enabled jupyter notebooks☆35Updated 5 months ago
- Repository for development of the genomic module of the CDM.☆19Updated 5 years ago
- UNDER CONSTRUCTION: A pipeline for Genome Wide Association Studies☆24Updated 3 weeks ago
- toolkit for file system virtualisation of random access compressed FASTA, FAI, DICT & TWOBIT files☆22Updated 5 months ago
- deploy a snakemake pipeline directly from version control (under development)☆21Updated last month
- Standard for describing and searching biomedical data developed by the Global Alliance for Genomics & Health.☆24Updated last year
- Library for visualising genomic features in Python.☆15Updated 7 years ago
- Tools for developing and running pipelines with the Genomics API☆24Updated 5 years ago
- Semantic Search☆32Updated this week
- CPU and GPU deterministic and therefore fully reproducible machine learning pipelines using MLflow.☆46Updated last year
- Tokenizers and Machine Learning Models for biological sequence data☆25Updated 3 months ago