molpopgen / BigDataFormats
Tutorial on programming for "big data"/bioinformatics
☆33Updated 6 years ago
Related projects
Alternatives and complementary repositories for BigDataFormats
- Minhash and maxhash library in Python, combining flexibility, expressivity, and performance.☆21Updated 2 months ago
- A fast and space-efficient pre-filter for estimating the quantification of very large collections of nucleotide sequences☆14Updated this week
- A Python library to work with high-throughput sequencing data in the context of data integration☆14Updated 7 years ago
- qtools has helper functions to submit jobs to compute clusters (PBS on TSCC, SGE on oolite) from within Python☆21Updated last year
- Set of tools for viral metagenomics.☆14Updated 11 months ago
- Online material and code base for the article Coordinates and Intervals in Graph Based Reference Genomes☆11Updated 7 years ago
- Write-once-read-many table for large datasets.☆28Updated last year
- This is a short response to the 2018 RFI on NIH Strategic Plan for Data Science☆16Updated 6 years ago
- Visualise interstrain recombination from environmental samples.☆26Updated 5 years ago
- Find nodes in hierarchical clustering that are statistically significant☆28Updated 7 years ago
- Inferring spatiotemporal dynamics of the H1N1 influenza pandemic from sequence data☆32Updated 11 years ago
- Qtip: a tandem simulation approach for accurately predicting read alignment mapping qualities☆25Updated 5 years ago
- Generate kmers/minimizers/hashes/MinHash signatures, including with multiple kmer sizes.☆24Updated 3 years ago
- Flexible omics pipeline☆18Updated 4 months ago
- Class materials for the NIH HPC snakemake class☆16Updated last month
- python stuff I use☆19Updated 4 years ago
- Hail: extract lines from a file, a la `head -n x | tail -n y`☆8Updated 4 years ago
- Normalization and difference calling for Next Generation Sequencing (NGS) data via joint multinomial modeling.☆11Updated 3 years ago
- A library for GNU make to schedule rules as jobs with qsub or sbatch☆11Updated 11 years ago
- A Python library for fast, thread-safe computations on phylogenetic trees☆27Updated this week
- logistic factor analysis☆16Updated 8 months ago
- Discoverability for gene search☆12Updated last year
- Onboarding materials for the Greene Lab☆31Updated last month
- Parallel Recipes: parallel workflow execution made easy☆13Updated 9 years ago
- A pipeline for making SWIft Genomes in a Graph (SWIGG) using k-mers☆21Updated 4 years ago
- Streaming algorithm for computing kmer statistics for massive genomics datasets☆53Updated 4 years ago