aehrc / VariantSpark
machine learning for genomic variants
☆140 Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for VariantSpark
- High performance data storage for importing, querying and transforming variants. ☆94 Updated last week
- A scalable genome browser. Apache 2 licensed. ☆125 Updated last year
- Distributed execution of bioinformatics tools on Apache Spark. Apache 2 licensed. ☆39 Updated 6 months ago
- Hadoop-BAM is a Java library for the manipulation of files in common bioinformatics formats using the Hadoop MapReduce framework ☆69 Updated last year
- GenomicsDB ☆111 Updated last year
- Extensible specification for representing and uniquely identifying biological sequence variation ☆80 Updated this week
- GA4GH Variation Representation Python Implementation ☆51 Updated this week
- This repo provides tools to convert ClinVar data into a tab-delimited flat file, and also provides that resulting tab-delimited flat file… ☆122 Updated 4 years ago
- Tibanna helps you run your genomic pipelines on Amazon cloud (AWS). It is used by the 4DN DCIC (4D Nucleome Data Coordination and Integr… ☆70 Updated 3 months ago
- Reference implementation of the APIs defined in ga4gh-schemas. RETIRED 2018-01-24 ☆96 Updated 6 years ago
- ☆174 Updated last year
- Efficient variant-call data storage and retrieval library using the TileDB storage library. ☆88 Updated this week
- SparkBWA is a new tool that exploits the capabilities of a Big Data technology such as Apache Spark to boost the performance of one of the mos… ☆69 Updated 5 years ago
- The Pharmacogenomic Clinical Annotation Tool ☆120 Updated this week
- Source code and related materials for the O'Reilly book ☆92 Updated 2 years ago
- Scripts for working with Google Cloud Dataproc service ☆37 Updated 5 years ago
- An opinionated Cromwell orchestration manager. ☆40 Updated last year
- Browser for ExAC consortium data ☆106 Updated 2 years ago
- Workflows used for WGS data processing -- replaced by https://github.com/gatk-workflows/gatk4-genome-processing-pipeline ☆57 Updated 4 years ago
- High-performance NoSQL database and RESTful web services for accessing the most relevant biological data ☆89 Updated 2 weeks ago
- MuTect -- Accurate and sensitive cancer mutation detection ☆94 Updated last year
- Various algorithms for analysing genomics data ☆194 Updated this week
- An option to spin cost effective EMR clusters in AWS with Hail and JupyterNotebook installed ☆16 Updated 4 years ago
- Repository for the GA4GH Benchmarking Team work developing standardized benchmarking methods for germline small variant calls ☆188 Updated 3 years ago
- An Open Computational Genomics Analysis platform for big data genomics analysis. OpenCGA is maintained and developed by its parent company … ☆166 Updated this week
- De novo assembly based variant calling pipeline for Illumina short reads ☆107 Updated 3 years ago
- Workflow Description Language compiler for the DNAnexus platform ☆40 Updated last year
- ☆81 Updated 5 years ago
- Workflows for processing high-throughput sequencing data for variant discovery with GATK4 and related tools ☆148 Updated 2 years ago
- Workflows for germline short variant discovery with GATK4 ☆133 Updated 3 years ago