Randomly sample lines from massive text files efficiently
☆17 · Apr 1, 2015 · Updated 10 years ago
Alternatives and similar repositories for subsample
Users who are interested in subsample are comparing it to the libraries listed below.
- Statistical mixed effects models in Ruby ☆21 · Jul 8, 2016 · Updated 9 years ago
- a simple read-only sequence database, designed for short reads ☆20 · Dec 19, 2016 · Updated 9 years ago
- A plotting library in Ruby built on top of Vega and D3. ☆43 · Jun 22, 2025 · Updated 8 months ago
- Graph Theory library for Ruby ☆48 · Oct 23, 2019 · Updated 6 years ago
- A CUDA Library for Parallel n-body Integrations with focus on Simulations ☆17 · Jul 2, 2014 · Updated 11 years ago
- Software for the 2015 Waterloo iGEM Team ☆11 · Oct 19, 2015 · Updated 10 years ago
- Tool for finding matches to degenerate sequence motifs in FASTA files. ☆13 · Mar 11, 2024 · Updated last year
- Multiple Bacteria Genome Compressor (MBGC) ☆11 · Feb 20, 2026 · Updated last week
- Provide access to PostgreSQL's sequences ☆10 · Oct 25, 2024 · Updated last year
- Course materials, including syllabus, datasets, scripts, lectures, etc ☆13 · Apr 21, 2017 · Updated 8 years ago
- ☆11 · Aug 17, 2014 · Updated 11 years ago
- This project contains simple methods to measure sample relatedness and identify potential swaps and contamination ☆10 · Jan 8, 2016 · Updated 10 years ago
- Spanish text summarization demo using CoreNLP ☆10 · Sep 13, 2014 · Updated 11 years ago
- Compute strain abundance in a defined microbial community ☆10 · Jul 27, 2023 · Updated 2 years ago
- Interactive Python development in Neovim with cell-based execution, IPython kernels, and rich media output in your browser ☆22 · Dec 9, 2025 · Updated 2 months ago
- Introduction to Nanopore Sequencing - Prac material and presentations from The Omics Australia Tutorials Sydney (TOAST) workshop 19th - 2… ☆10 · Jan 12, 2019 · Updated 7 years ago
- Parses Facebook chat messages into Python objects to enable convenient analysis. ☆11 · Jan 3, 2018 · Updated 8 years ago
- A dotplot application for DNA/RNA sequence ☆11 · Nov 28, 2022 · Updated 3 years ago
- key value database with transactional capabilities. Created for a Distributed Systems class, not suitable for production ☆15 · Dec 5, 2018 · Updated 7 years ago
- ☆11 · Sep 16, 2016 · Updated 9 years ago
- Fast Fuzzy String matching dictionary for Scala ☆10 · Mar 20, 2015 · Updated 10 years ago
- Predict prokaryotic hosts for phage (meta) genomic sequences ☆11 · Apr 4, 2022 · Updated 3 years ago
- Documentation for NCBI BLAST AMI ☆11 · Feb 28, 2022 · Updated 4 years ago
- Size-Wise Analysis of Transfer Learning of pLM Embeddings ☆12 · Jun 10, 2025 · Updated 8 months ago
- ☆11 · Mar 10, 2024 · Updated last year
- vaginal microbiota ☆12 · Jan 21, 2025 · Updated last year
- Namespace encoding hierarchical relationships between proteins, protein families, and protein complexes. ☆12 · Mar 9, 2021 · Updated 4 years ago
- Jasmine "lnishan" Chen's Curriculum Vitae (CV) in Markdown ☆10 · May 23, 2018 · Updated 7 years ago
- Movie recommendation using apache spark ☆10 · Apr 23, 2017 · Updated 8 years ago
- Isomorphic flux and react blog ☆10 · May 20, 2016 · Updated 9 years ago
- file command's magic pattern file for bioinformatics ☆21 · Nov 25, 2015 · Updated 10 years ago
- Rapidly extract reads from a FASTQ file based on taxonomic classification via Kraken2. ☆13 · Jan 26, 2026 · Updated last month
- Various Algorithms written in ES6 ☆11 · May 30, 2015 · Updated 10 years ago
- the config server and agent for configuration items. ☆12 · Nov 2, 2025 · Updated 3 months ago
- Threadpool library ☆23 · May 31, 2015 · Updated 10 years ago
- taxonomic classes for Python ☆11 · Aug 30, 2021 · Updated 4 years ago
- A flexible python program for generating figures from regions of the genome. ☆13 · Apr 6, 2019 · Updated 6 years ago
- Map query sequences to the assemblies of all pre-June 2023 bacteria (https://ftp.ebi.ac.uk/pub/databases/AllTheBacteria/Releases/0.2/) on… ☆12 · May 22, 2024 · Updated last year
- Trying out PyTorch because the hype is real ☆10 · Oct 18, 2017 · Updated 8 years ago