alexpreynolds / sample
Performs memory-efficient reservoir sampling on very large input files delimited by newlines
☆69 Updated 4 years ago
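As an illustration of the technique named in the description, here is a minimal Python sketch of reservoir sampling (the classic "Algorithm R") over a newline-delimited stream. It is not taken from the sample repository and may differ from how that tool works internally; the function name `reservoir_sample_lines` and its parameters are hypothetical, chosen only for this example.

```python
import io
import random
from typing import IO, List, Optional


def reservoir_sample_lines(
    stream: IO[str], k: int, rng: Optional[random.Random] = None
) -> List[str]:
    """Return up to k lines chosen uniformly at random from a text stream.

    Hypothetical helper for illustration only. Memory use is proportional
    to k, not to the size of the input, because each line is read once and
    either kept in the reservoir or discarded.
    """
    rng = rng or random.Random()
    reservoir: List[str] = []
    for i, line in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k lines.
            reservoir.append(line)
        else:
            # Pick a slot in [0, i]; the new line replaces a reservoir
            # entry only when the slot falls inside the reservoir.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = line
    return reservoir


if __name__ == "__main__":
    # Assumed toy input: 1000 numbered lines held in memory.
    data = io.StringIO("".join(f"line {n}\n" for n in range(1000)))
    picked = reservoir_sample_lines(data, 5, random.Random(42))
    print("".join(picked), end="")
```

Because the stream is consumed in a single pass, the same function would work on a file object opened over a very large file without loading it into memory. If the input has fewer than k lines, all of them are returned.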
Alternatives and similar repositories for sample:
Users who are interested in sample are comparing it to the libraries listed below.
- utilities for indexing and sequence extraction from FASTA files ☆59 Updated 3 years ago
- Fast and memory-efficient sequencing error corrector ☆92 Updated 9 months ago
- Streaming algorithm for computing kmer statistics for massive genomics datasets ☆53 Updated 4 years ago
- Fast calculations of linkage-disequilibrium in large-scale human cohorts ☆42 Updated 5 years ago
- normalize, left-align, trim, validate and clean VCF files ☆20 Updated 9 years ago
- Efficient handling of FASTQ files from Python ☆50 Updated 4 months ago
- Code accompanying the publication for compressed graph annotation ☆13 Updated 5 years ago
- Squeakr: An Exact and Approximate k-mer Counting System ☆85 Updated 11 months ago
- An alignment-free, reference-free and incremental data structure for colored de Bruijn graph with application to pan-genome indexing. ☆43 Updated 3 years ago
- a wee tool for random access into BGZF files. ☆84 Updated 6 years ago
- Implicit Interval Tree with Interpolation Index ☆41 Updated 2 years ago
- Load numpy arrays and HDF5 files from VCF (variant call format) ☆31 Updated 7 years ago
- Minhash and maxhash library in Python, combining flexibility, expressivity, and performance. ☆21 Updated last month
- Enhanced Artificial Genome Engine: next generation sequencing reads simulator ☆32 Updated 4 years ago
- Isaac Genome Alignment Software ☆37 Updated 9 years ago
- GenomicsDB ☆111 Updated 2 years ago
- Cosmo is a fast, low-memory DNA assembler using a Succinct (variable order) de Bruijn Graph. ☆51 Updated 10 months ago
- Standalone C library for assembling Illumina short reads in small regions ☆72 Updated 2 years ago
- A software for the multispecies design of CRISPR/Cas9 libraries ☆34 Updated 2 years ago
- Flexible genotype query among 30,000+ samples whole-genome ☆96 Updated 5 years ago
- Streaming relation (overlap, distance, KNN) of (any number of) sorted genomic interval sets. #golang ☆47 Updated 4 years ago
- efficient alignment of strings to partially ordered string graphs ☆33 Updated 3 years ago
- Fast spliced aligner with low memory requirements ☆41 Updated 9 years ago
- A fast Python library for VCF files leveraging Cython for speed. ☆52 Updated 6 years ago
- Genotype and phase short tandem repeats using Illumina whole-genome sequencing data ☆95 Updated last year
- SVG based genome viewer written in javascript using D3 ☆33 Updated 9 years ago
- MinHash Alignment Process (MHAP, pronounced MAP): locality-sensitive hashing to detect long-read overlaps and utilities ☆96 Updated 2 years ago
- Software for exploration of gene expression data from single-cell RNA sequencing. ☆28 Updated 5 years ago
- Sparse Project VCF: evolution of VCF to encode population genotype matrices efficiently ☆58 Updated last year
- Implementation of Positional Burrows-Wheeler Transform for genetic data ☆102 Updated this week