lifeomic / spark-vcf
Spark VCF data source implementation for Dataframes
☆14 Updated 3 years ago
Alternatives and similar repositories for spark-vcf
Users who are interested in spark-vcf are comparing it to the libraries listed below.
- Easily run WDL workflows on GCP ☆13 Updated 3 years ago
- qtools has helper functions to submit jobs to compute clusters (PBS on TSCC, SGE on oolite) from within Python ☆21 Updated last year
- A genomics pipeline built on top of the GATK Queue framework. Main repository: https://github.com/NationalGenomicsInfrastructure/piper (m… ☆21 Updated 8 years ago
- Integrative visualization of multiple omic datasets onto KEGG pathways. ☆11 Updated 3 years ago
- Simple and efficient access to genomic data for deep learning models. ☆42 Updated 5 years ago
- Utilities for analyzing mutations and neoepitopes in patient cohorts ☆20 Updated 7 years ago
- ☆11 Updated 2 years ago
- example singularity definition files and demos ☆27 Updated 7 years ago
- A web app for exploratory data analysis of high-throughput screens (HTS) ☆17 Updated 6 years ago
- A library for manipulating bioinformatics sequencing formats in Apache Spark ☆32 Updated 5 months ago
- R package with netDx software and data for examples ☆12 Updated 2 years ago
- Python script to programmatically enrich your data using Enrichr API ☆12 Updated 8 years ago
- R function to plot high quality, elegant heatmap using 'ggplot2' graphics. Some of the important features of this package are, colorin… ☆11 Updated 9 years ago
- Finding a scalable alternative to the VCF File for genomics analysis ☆14 Updated 8 years ago
- MuSiCa - Mutational Signatures in Cancer ☆23 Updated last year
- A Python library to work with high-throughput sequencing data in the context of data integration ☆14 Updated 8 years ago
- ALPACA is a caller for genomic variants (single nucleotide and small indels) from next-generation sequencing data that uses a novel algeb… ☆23 Updated 8 months ago
- Reproducible Workflows, curated at the Fred Hutch ☆12 Updated 6 years ago
- Python client for GA4GH htsget protocol ☆15 Updated 2 years ago
- Examples using R and 1000 genomes data ☆28 Updated 4 years ago
- Terraform template to create AWS resources to execute jobs using nextflow ☆22 Updated 2 years ago
- [DEPRECATED] An R package for Google Genomics API queries. ☆45 Updated 2 years ago
- ☆11 Updated 2 years ago
- Interactive demonstration of how to use PCA, t-SNE, and UMAP on genotype data from the Thousand Genome Project. ☆20 Updated 4 years ago
- Scripts for generating integrated dataset from publicly available data sources to evaluate known genetic associations evidence of varyi… ☆11 Updated 5 years ago
- Class materials for the NIH HPC snakemake class ☆15 Updated 10 months ago
- ☆21 Updated 2 years ago
- Core consonance utilities for scheduling, reporting on, and provisioning VMs for workflows ☆14 Updated 7 years ago
- jinja2-enabled jupyter notebooks ☆37 Updated last week
- TheSparkBox is an all-in-one Spark deployment that you can use to fire up a local cluster. ☆12 Updated 7 years ago