victorskl / genomic-bigdata-spark
Genomic BigData Warehousing with Apache Spark and LakeHouse Architecture
☆11 Updated 2 years ago
Alternatives and similar repositories for genomic-bigdata-spark
Users who are interested in genomic-bigdata-spark are comparing it to the repositories listed below.
- Namespace encoding hierarchical relationships between proteins, protein families, and protein complexes. ☆12 Updated 4 years ago
- An option to spin cost effective EMR clusters in AWS with Hail and JupyterNotebook installed ☆16 Updated 5 years ago
- Scanomatic ☆10 Updated last year
- This is a repo for migration of CROssBAR data to the Neo4j database via BioCypher ☆9 Updated 3 months ago
- Tool for finding matches to degenerate sequence motifs in FASTA files. ☆13 Updated last year
- VCF Observer is a VCF file analysis, comparison, and visualization tool. ☆17 Updated 6 months ago
- Standard for describing and searching biomedical data developed by the Global Alliance for Genomics & Health. ☆24 Updated last year
- Accelerated genomics workflows in the Workflow Description Language ☆33 Updated last year
- An app and library for building, conversion, and validation of GA4GH Phenopackets. ☆16 Updated 2 weeks ago
- Semantic Search ☆33 Updated this week
- The project proposal template for OpenBioML community projects. ☆18 Updated 2 years ago
- Effect of tokenization on transformers for biological sequence ☆18 Updated last year
- A parallel API crawler for the retrieval of Kyoto Encyclopedia of Genes and Genomes metabolic and genomics data. ☆21 Updated last year
- NEAT (NExt-generation Analysis Toolkit) simulates next-gen sequencing reads and can learn simulation parameters from real data. ☆59 Updated this week
- MOVIS: A Multi-Omics Software Solution for Multi-modal Time-Series Clustering, Embedding, and Visualizing Tasks, by Aleksandar Anžel, Dom… ☆10 Updated 3 years ago
- Tokenizers and Machine Learning Models for biological sequence data ☆25 Updated 9 months ago
- Proof of concept code from Gretel.ai and Illumina using generative neural networks to create synthetic versions of mouse genotype and phe… ☆35 Updated 3 years ago
- Feature Annotation Location Description Ontology ☆34 Updated 5 years ago
- The advanced implementation for BioChatter, using Next.js ☆14 Updated 5 months ago
- GECO (Gene Expression Clustering Optimization; theGECOapp.com) is a minimalistic GUI app that utilizes non-linear reduction techniques to… ☆9 Updated 2 years ago
- ☆12 Updated last year
- A bioinformatics API to interface with public multi-omics bio databases for wicked fast data integration. ☆33 Updated last year
- jinja2-enabled jupyter notebooks ☆37 Updated 2 weeks ago
- A provenance library for bioinformatics workflows 🧬 🔀 📝 ☆14 Updated 3 years ago
- This repository contains all the source files required to run DeLUCS, a deep learning clustering algorithm for DNA sequences. ☆25 Updated 2 years ago
- This guidance creates a scalable environment in AWS to prepare genomic, clinical, mutation, expression and imaging data for large-scale a… ☆24 Updated 2 weeks ago
- The Complete Python Antibody Library ☆23 Updated 2 months ago
- WebApp for DNA variants interpretation ☆13 Updated 3 weeks ago
- Forensic analysis tool useful in backwards computing information from next-generation sequencing data. ☆11 Updated 2 weeks ago
- Accelerated genomics workflows in NextFlow ☆36 Updated 10 months ago