msukmanowsky / omniture-data-tools
A set of tools for working with Omniture daily data files (hit_data.tsv) in tools big or small, such as Spark, Hadoop, or plain Python.
☆38 Updated 5 years ago
Related projects:
- Simple Spark example of generating table stats for use of data quality checks ☆28 Updated 7 years ago
- HDP Data Science/Machine Learning demo ☆37 Updated 9 years ago
- Training materials for Strata, AMP Camp, etc ☆150 Updated 8 years ago
- Some notebook examples related to Apache Spark, IPython / Jupyter, Zeppelin ☆52 Updated 8 years ago
- ☆136 Updated this week
- Coding exercises for Apache Spark ☆103 Updated 9 years ago
- Kite SDK Examples ☆99 Updated 3 years ago
- ☆33 Updated this week
- ☆38 Updated 6 years ago
- Mastering Spark for Data Science, published by Packt ☆46 Updated last year
- Oracle Data Science Bootcamp 2014 ☆25 Updated 9 years ago
- Monitor Twitter stream for S&P 500 companies to identify & act on unexpected increases in tweet volume ☆39 Updated 8 years ago
- An Apache Spark standalone application using the Spark API in Scala. The application uses Simple Build Tool (SBT) for building the project… ☆29 Updated 8 years ago
- Interactive Audience Analytics with Spark and HyperLogLog ☆55 Updated 8 years ago
- Supporting content (slides and exercises) for the Addison-Wesley (Pearson) video series covering best practices for developing scalable S… ☆66 Updated 8 years ago
- MLeap allows for easily putting Spark ML pipelines into production ☆78 Updated 7 years ago
- Code for Tutorial on designing clickstream analytics application using Hadoop ☆54 Updated 9 years ago
- An Ambari Stack service package for VNC Server with the ability to install developer tools like Eclipse/IntelliJ/Maven as well to 'remote… ☆28 Updated 8 years ago
- XML Serializer/Deserializer for Apache Hive ☆41 Updated 4 years ago
- Templates for projects based on top of H2O. ☆37 Updated last year
- Apache Zeppelin on Kubernetes. ☆28 Updated 5 years ago
- ☆41 Updated 7 years ago
- This repository contains the Pig Latin scripts, UDFs and datasets used in the book Pig Design Patterns by Pradeep Pasupuleti, published b… ☆23 Updated 10 years ago
- A simple introduction to using spark ml pipelines ☆26 Updated 6 years ago
- ☆14 Updated this week
- A single docker image that combines Neo4j Mazerunner and Apache Spark GraphX into a powerful all-in-one graph processing engine ☆46 Updated 5 years ago
- Vagrant, Apache Spark and Apache Zeppelin VM for teaching ☆45 Updated 6 years ago
- ☆63 Updated this week
- This is the example code repository for Getting Started with Impala by John Russell (O'Reilly Media) ☆22 Updated 7 years ago
- Tutorials for Cascading, Lingual, Pattern and other projects ☆18 Updated 8 years ago