This repository curates a list of free resources for learning Hadoop.
☆57 · Jun 10, 2018 · Updated 7 years ago
Alternatives and similar repositories for Learn-Hadoop-and-Spark
Users that are interested in Learn-Hadoop-and-Spark are comparing it to the libraries listed below.
- A bridge to Apache Atlas for provenance metadata created in course of using Apache NiFi ☆15 · Jan 2, 2023 · Updated 3 years ago
- POC for all the stack of big data (kafka, spark, cassandra, hdfs, docker, springboot) ☆12 · Dec 16, 2022 · Updated 3 years ago
- Extract, transform, and load data for analytic processing using AWS Glue ☆17 · May 2, 2021 · Updated 4 years ago
- A consumer of a Kafka topic based on Flink ☆12 · Oct 5, 2022 · Updated 3 years ago
- Full-Text Search System to stream, collect, clean, store and filter data collected from different sources using Docker, Kafka, Elasticsea… ☆21 · Jan 7, 2023 · Updated 3 years ago
- Apache-kafka-spark-streaming-poc ☆10 · Mar 19, 2017 · Updated 9 years ago
- 【易车】 Demos for Spark, Flink, HBase, Hive, and Flume that integrate some of Hadoop's native APIs (currently only HDFS and MapReduce); also tests some exception-handling behavior ☆16 · Apr 4, 2019 · Updated 6 years ago
- Collection of Pig scripts that I use for my talks and workshops ☆39 · Apr 30, 2013 · Updated 12 years ago
- Educational notes, Hands on problems w/ solutions for hadoop ecosystem ☆87 · Jan 22, 2019 · Updated 7 years ago
- Apache Airflow advanced functionalities examples ☆21 · Mar 22, 2024 · Updated 2 years ago
- Learning PySpark video series ☆11 · Mar 5, 2018 · Updated 8 years ago
- Apache Spark using SQL ☆14 · Aug 18, 2021 · Updated 4 years ago
- Listing my favorite research papers 📝 from different fields as I read them. ☆10 · Oct 17, 2019 · Updated 6 years ago
- An example CI/CD pipeline using GitHub Actions for doing continuous deployment of AWS Glue jobs built on PySpark and Jupyter Notebooks. ☆13 · Oct 15, 2020 · Updated 5 years ago
- ☆10 · May 18, 2019 · Updated 6 years ago
- Self-contained examples using Apache Spark with the functional features of Java 8 ☆66 · Apr 8, 2018 · Updated 7 years ago
- This repository is to help with the Partner Demonstration of the Apache Atlas project. ☆30 · Oct 29, 2015 · Updated 10 years ago
- Apache Spark Guide ☆35 · Feb 1, 2022 · Updated 4 years ago
- Gobblin is a distributed big data integration framework (ingestion, replication, compliance, retention) for batch and streaming systems.… ☆11 · Jul 29, 2017 · Updated 8 years ago
- ☆13 · Feb 18, 2022 · Updated 4 years ago
- Terraform Module to create a Apache Zookeeper cluster on AWS ☆13 · Jan 3, 2022 · Updated 4 years ago
- Elastic 2022 ☆11 · Feb 14, 2022 · Updated 4 years ago
- Social Media Analysis, scalable solution, flexible deployment that analyses social media contents ☆10 · Jul 20, 2023 · Updated 2 years ago
- Ambari service for RedHat FreeIPA ☆11 · Sep 30, 2016 · Updated 9 years ago
- python script to repair the primary range of a node in N discrete steps ☆12 · Aug 3, 2018 · Updated 7 years ago
- Resilient Automation Functions and Scripts ☆15 · Jan 5, 2022 · Updated 4 years ago
- Botoflow is an asynchronous framework for Amazon SWF that helps you build SWF applications using Python ☆13 · Dec 26, 2022 · Updated 3 years ago
- Collection of best practices for Java persistence performance in Spring Boot applications ☆10 · Nov 27, 2019 · Updated 6 years ago
- List of playbooks to manage Ambari ☆13 · Oct 3, 2018 · Updated 7 years ago
- A Python script to swoop and decrypt passwords from Chrome's local storage. ☆11 · Dec 10, 2018 · Updated 7 years ago
- Grafana Prometheus exporter ☆10 · Oct 17, 2017 · Updated 8 years ago
- ☆18 · Nov 16, 2018 · Updated 7 years ago
- Code used in the experiments of the paper COEGAN: Evaluating the Coevolution Effect in Generative Adversarial Networks http://gecco-2019.… ☆14 · Jul 6, 2023 · Updated 2 years ago
- Implement common statistical machine learning algorithms with raw Numpy. ☆16 · Jun 30, 2020 · Updated 5 years ago
- SQL on HBase with Apache Phoenix in Docker ☆29 · Mar 21, 2016 · Updated 10 years ago
- This project is to integration HP ALM and other test automation frameworks. ☆10 · May 25, 2020 · Updated 5 years ago
- Public GitHub repo for SciPy 2022 tutorial (Introduction to Numerical Computing With NumPy) ☆14 · Aug 24, 2022 · Updated 3 years ago
- Docker Image packaging for Pentaho BI Server ☆10 · Jul 6, 2015 · Updated 10 years ago
- Python scripts for Agisoft Photoscan ☆12 · Jun 18, 2015 · Updated 10 years ago