☆26Mar 18, 2016Updated 10 years ago
Alternatives and similar repositories for hadoopUtils
Users that are interested in hadoopUtils are comparing it to the libraries listed below.
- Jupyter Notebooks for Data Science☆12Jan 12, 2017Updated 9 years ago
- Few scripts to automate daily data loads from RDBMS to Partitioned Avro Hive table☆30Sep 25, 2014Updated 11 years ago
- Collection of Pig scripts that I use for my talks and workshops☆39Apr 30, 2013Updated 12 years ago
- ☆44Jul 24, 2017Updated 8 years ago
- The http://analyticsdojo.com open source codebase and curriculum. Learn to data science today.☆38Dec 13, 2016Updated 9 years ago
- MySQL to NoSQL real time dataflow☆19Oct 14, 2017Updated 8 years ago
- ☆195Jun 21, 2022Updated 3 years ago
- QA dashboard for DV360 advertisers☆13Jan 20, 2021Updated 5 years ago
- ☆62Oct 17, 2025Updated 6 months ago
- Source code for 'PySpark Recipes' by Raju Kumar Mishra☆26Nov 30, 2019Updated 6 years ago
- Implementation of Tyler Neylon's Locality-Specific Hash based on simplex tesselations☆28Oct 15, 2011Updated 14 years ago
- ☆23Nov 17, 2022Updated 3 years ago
- sample oozie workflows☆17Jun 13, 2017Updated 8 years ago
- SmartTune is a black-box optimization that can automatically find good performance settings for a complex system's configuration knobs.☆11Nov 23, 2022Updated 3 years ago
- A reusable workflow to show how to orchestrate many iterations of an action concurrently, in a single pane of glass. See medium write-up …☆12Nov 8, 2024Updated last year
- Code to support Databases blog post - How to offload data from your transactional NoSQL database to Amazon S3, perform advanced analytics…☆15Mar 26, 2020Updated 6 years ago
- A set of widgets for Python's Orange Machine Learning to work with Apache Spark ML☆15Dec 24, 2016Updated 9 years ago
- Apache Beam example project☆13Oct 16, 2019Updated 6 years ago
- Swimlane graphs for Hive, SparkSQL, and Presto based on Ganglia resource graphs☆13Feb 13, 2017Updated 9 years ago
- Airflow script for incremental data import from Mysql to Hive using Sqoop.☆18Jun 6, 2018Updated 7 years ago
- This repository has the code from the text and the videos for "Introduction to Programming and Problem Solving using Scala".☆30Feb 11, 2018Updated 8 years ago
- Glue Python Shell Job that adds AWS Organizations account tags to Cost and Usage Reports. You can submit feedback & requests for changes…☆16Mar 14, 2021Updated 5 years ago
- This is the collection of some handy tips running Nexus Repository Manager OSS☆14Aug 20, 2016Updated 9 years ago
- ☆21Jun 23, 2019Updated 6 years ago
- Example script to deploy DAGs to Google Cloud Composer.☆15Jun 30, 2022Updated 3 years ago
- Data and example code for Programming Pig, by Alan F. Gates☆186Oct 15, 2016Updated 9 years ago
- ☆14Aug 10, 2021Updated 4 years ago
- Reference Architectures for Apache Spark☆38Jan 23, 2017Updated 9 years ago
- unopinionated framework for React based admin applications☆10May 4, 2021Updated 4 years ago
- Minimal app for demonstrating use of flask-security☆18Jul 6, 2018Updated 7 years ago
- Unsupported - Event-driven cross-site app promotion utility using the notification endpoint of the QRS API and Python.☆14Feb 1, 2021Updated 5 years ago
- Code for my videos on big data analytics with Apache Spark using Scala.☆62Feb 11, 2018Updated 8 years ago
- scala and spark examples project☆14Feb 19, 2018Updated 8 years ago
- Generate a Redshift .manifest file for a given S3 bucket☆21Nov 16, 2017Updated 8 years ago
- This repository is created for TechCommanders and O'Reilly Students who have taken the Google Cloud Professional Security Engineer Crash …☆16Jul 27, 2021Updated 4 years ago
- Simple repo to demonstrate how to submit a spark job to EMR from Airflow☆34Oct 18, 2020Updated 5 years ago
- Tutorial for Cloud Dataflow☆17Mar 12, 2019Updated 7 years ago
- Code for my presentation: Using PySpark to Process Boat Loads of Data☆20Oct 20, 2017Updated 8 years ago
- Given a file and a chunk size in megabytes, calculates what the Amazon S3 etag will be.☆16Aug 7, 2020Updated 5 years ago