Data set and queries that I use in my Hive and Impala presentations. Slides are usually posted at slideshare.net/markgrover
☆20May 19, 2014Updated 11 years ago
Alternatives and similar repositories for cloudcon-hive
Users who are interested in cloudcon-hive are comparing it to the repositories listed below
- Apache Spark programming exercises with Python☆13Apr 18, 2021Updated 4 years ago
- In-kernel RDMA library☆12Nov 7, 2023Updated 2 years ago
- Stacking a block device over another block device☆17Oct 29, 2014Updated 11 years ago
- Analysis of SAT/ACT data per state in 2017 and 2018☆21Jun 13, 2025Updated 9 months ago
- Labs and data files for a full-day Spark workshop☆25May 24, 2025Updated 9 months ago
- On Stacking a Persistent Memory File System on Legacy File Systems [FAST '23]☆18May 18, 2023Updated 2 years ago
- Jupyter workflow example☆28Oct 1, 2020Updated 5 years ago
- Patches submitted to the open-source community, preserved in this project☆16Oct 22, 2017Updated 8 years ago
- DNS lookup cache for Python using dnspython☆21Oct 26, 2021Updated 4 years ago
- This repo contains commands that data engineers use in day to day work.☆61Feb 4, 2023Updated 3 years ago
- Bring your own data Labs: Build a serverless data pipeline based on your own data☆44May 22, 2023Updated 2 years ago
- Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for A…☆41Jul 6, 2022Updated 3 years ago
- React rendering for Meteor apps☆12Mar 4, 2015Updated 11 years ago
- Find out what great meetups people are going to!☆13Jul 20, 2017Updated 8 years ago
- A Vue.js project starter template w/ Tachyons, Webpack, and ESLint☆12May 9, 2017Updated 8 years ago
- All Data Engineering notebooks from Datacamp course☆115Dec 11, 2019Updated 6 years ago
- Serverless Architecture with AWS☆10Apr 15, 2016Updated 9 years ago
- Fundamentals of Spark with Python (using PySpark), code examples☆362Oct 29, 2022Updated 3 years ago
- Explore external scalers built by the community.☆12Feb 19, 2026Updated last month
- Merkle tree and other data structures.☆17Dec 30, 2022Updated 3 years ago
- Code review checklist with examples (still WIP).☆15Jul 30, 2022Updated 3 years ago
- A logging handler for Splunk. Lets you send information to Splunk directly from your Python code.☆23Jul 28, 2015Updated 10 years ago
- Problems from algo expert solved in Java☆12Jan 16, 2020Updated 6 years ago
- Automation of desktop, web, mainframe and Citrix-based processes using RPA tools such as BluePrism, PegaRobotics, Automation Anywhere and …☆11Dec 9, 2017Updated 8 years ago
- Dev Ops Dashboard for Petabyte Scale AI Data Lake☆12Mar 28, 2023Updated 2 years ago
- A fast union-find data structure for Python.☆10Mar 10, 2015Updated 11 years ago
- 🔑 A service which provides continuous user authentication to web applications, using keystroke dynamics.☆12Oct 24, 2018Updated 7 years ago
- Apache RocketMQ lite cpp client☆11Jul 22, 2023Updated 2 years ago
- A personal homepage to start coding☆14Aug 1, 2019Updated 6 years ago
- Data Analysis with IBM SPSS Statistics, published by Packt☆17Jan 30, 2023Updated 3 years ago
- SoftUni course CSharp OOP Advanced: All tasks with their solutions.☆10Aug 14, 2020Updated 5 years ago
- Python design patterns (https://app.pluralsight.com/library/courses/python-design-patterns)☆13Mar 11, 2018Updated 8 years ago
- [Archived] A Fast Multi-tiered Distributed Storage System based on User-Level I/O☆74Mar 2, 2018Updated 8 years ago
- Ion Path Extraction API aims to combine the convenience of a DOM API with the speed of a streaming API.☆16Jan 9, 2025Updated last year
- Interactive HTML canvas based implementation of k-means.☆16Mar 24, 2018Updated 7 years ago