rueedlinger / hive-udf
Hive User-Defined Functions (UDFs) for Text Mining
☆14Updated 10 years ago
Related projects
Alternatives and complementary repositories for hive-udf
- A collection of Hive UDFs☆75Updated 4 years ago
- Use cases built on SnappyData. Use cases contained here: 1. Ad Analytics 2. Streaming data ingestion from RabbitMQ.☆32Updated 2 years ago
- ☆39Updated 5 years ago
- A light Kafka to HDFS/S3 ETL library based on Apache Spark☆41Updated 7 years ago
- Helpful user defined functions / table generating functions for Hive☆101Updated 8 years ago
- NexR Hive UDFs☆111Updated 9 years ago
- Notes about Spark Streaming in Apache Spark☆58Updated 7 years ago
- A collection of datasets and databases☆24Updated 6 years ago
- UberScriptQuery, a SQL-like DSL to make writing Spark jobs super easy☆60Updated 11 months ago
- This is an example of real time stream processing using Spark Streaming, Kafka & Elasticsearch.☆41Updated 8 years ago
- InputFormat that can split multi-line JSON☆49Updated 9 years ago
- Shunting Yard is a real-time data replication tool that copies data between Hive Metastores.☆20Updated 3 years ago
- Templates for projects based on top of H2O.☆37Updated 3 weeks ago
- A spark package for loading Spark ML models to Redis-ML☆63Updated 5 years ago
- Ambari View for the Ambari Store☆15Updated 9 years ago
- A Real-Time Analytical Processing (RTAP) example using Spark/Shark☆51Updated 10 years ago
- PMML evaluator library for the Apache Hive data warehouse software (legacy codebase)☆13Updated 9 years ago
- Code to index Hive tables to Solr and Solr indexes to Hive☆47Updated 5 years ago
- Flink Examples☆39Updated 8 years ago
- Cask Hydrator Plugins Repository☆67Updated this week
- Apache Sqoop Cookbook☆36Updated 10 years ago
- An Ambari Stack service package for VNC Server with the ability to install developer tools like Eclipse/IntelliJ/Maven as well to 'remote…☆28Updated 8 years ago
- Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http:…☆71Updated last year
- Kylin running in a Docker cluster☆46Updated 8 years ago
- My Personal Collection of Hive UDFs☆34Updated 11 years ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.☆86Updated 8 months ago
- Demos around Ambari Views, Services, Blueprints☆63Updated 8 years ago
- Big Data ETL and Utilities for Hadoop Map Reduce, Spark and Storm☆105Updated 10 months ago