Big Data project using Hadoop (MapReduce, Spark, Hive)
☆32 · Dec 10, 2019 · Updated 6 years ago
Alternatives and similar repositories for NYYellowTaxiProject
Users that are interested in NYYellowTaxiProject are comparing it to the libraries listed below.
- Project from the CTU Big Data course whose purpose was to compute tf-idf values for the Czech Wikipedia ☆10 · Jul 8, 2014 · Updated 11 years ago
- Code repository for Python for Beginners: Learn Python from Scratch, published by Packt ☆16 · Oct 16, 2023 · Updated 2 years ago
- Big data projects implemented by Maniram yadav ☆50 · May 5, 2018 · Updated 8 years ago
- Final Project for Data Engineering Zoomcamp Course 2024 🧙🔥 ☆11 · Apr 17, 2024 · Updated 2 years ago
- ☆11 · May 27, 2024 · Updated 2 years ago
- Starter project with full stack BigQuery. Allows you to overcome customisation restrictions imposed by pre-built dashboards and control data … ☆49 · Jan 24, 2023 · Updated 3 years ago
- Geared towards life scientists wanting to be able to understand and use basic statistical and machine learning methods ☆18 · Updated this week
- Scripts for reproducing analyses of large RNA-seq datasets ☆15 · May 22, 2019 · Updated 7 years ago
- Workflow4Metabolomics meta repository ☆11 · May 23, 2025 · Updated last year
- ☆13 · Sep 8, 2020 · Updated 5 years ago
- MEXPRESS is a data visualization tool designed for the visualization of TCGA expression, DNA methylation and clinical data. ☆15 · Jul 23, 2020 · Updated 5 years ago
- Reproducible reanalysis of a combined ChIP-Seq & RNA-Seq data set ☆17 · Aug 9, 2019 · Updated 6 years ago
- ☆11 · Jul 13, 2020 · Updated 5 years ago
- AlvinToh Learning Repository for The Ultimate Hands-On Hadoop - Tame your Big Data! ☆10 · May 23, 2018 · Updated 8 years ago
- Classify images of different kitchenware items ☆11 · Apr 17, 2023 · Updated 3 years ago
- Preprocessing pipeline for (sc)ATAC data ☆10 · Feb 22, 2022 · Updated 4 years ago
- Class materials for the NIH HPC snakemake class ☆15 · Sep 27, 2024 · Updated last year
- BED QC tool (in the making) ☆18 · Aug 19, 2022 · Updated 3 years ago
- Docker image of claat tool used to generate beautiful codelabs from markdown or Google doc ☆18 · Nov 3, 2021 · Updated 4 years ago
- ☆14 · Jan 22, 2019 · Updated 7 years ago
- An example project that implements a data pipeline using Scala, Akka, and Spark and works with document-oriented and graph databases to l… ☆11 · Aug 9, 2019 · Updated 6 years ago
- Version 2.1.0 released ☆23 · Sep 16, 2019 · Updated 6 years ago
- Construction of small-world, scale-free networks ☆16 · Mar 16, 2017 · Updated 9 years ago
- exRNA Biomarker Discovery for Liquid Biopsy ☆11 · Oct 29, 2020 · Updated 5 years ago
- Machine Learning DevOps Engineer Nanodegree ☆11 · Jan 27, 2022 · Updated 4 years ago
- How to grow engineering career ☆22 · Jun 27, 2023 · Updated 2 years ago
- Data Streaming Nanodegree (from Udacity) exercises, projects and their solutions ☆17 · Aug 14, 2023 · Updated 2 years ago
- Real World Project on Formula1 Racing using Azure Databricks, Delta Lake and Azure Data Factory ☆13 · Jul 24, 2023 · Updated 2 years ago
- ☆40 · Mar 13, 2026 · Updated 3 months ago
- Manual deployment of JupyterHub on Kubernetes for a single machine ☆15 · Mar 1, 2023 · Updated 3 years ago
- Formatted Spreadsheets to gts ☆25 · Feb 4, 2025 · Updated last year
- ☆19 · Nov 10, 2025 · Updated 7 months ago
- ASIO plugin for OBS-Studio ☆18 · Nov 1, 2025 · Updated 7 months ago
- Code for multi-sample variant calling from sequence data of pooled or unpooled DNA samples ☆20 · Apr 21, 2026 · Updated last month
- MSKCC Reis-Filho Lab pipeline thingy ☆18 · Mar 31, 2026 · Updated 2 months ago
- Processing workflow for COVID-19 single cell data ☆17 · Dec 23, 2021 · Updated 4 years ago
- Build data models, data warehouses, and data lakes, automate data pipelines, and work with massive datasets. ☆12 · Jul 16, 2019 · Updated 6 years ago
- satuRn is a highly performant and scalable method for performing differential transcript usage analyses. ☆24 · Feb 28, 2023 · Updated 3 years ago
- Single Cell Analysis course at Cold Spring Harbor Laboratory 2017 ☆23 · Oct 19, 2017 · Updated 8 years ago