Approximate cardinality estimation with HyperLogLog, as a Hive function
☆42 · Dec 17, 2012 · Updated 13 years ago
Alternatives and similar repositories for hive-udf
Users that are interested in hive-udf are comparing it to the libraries listed below.
- An application to monitor and drive the Spark JobServer ☆12 · Dec 12, 2014 · Updated 11 years ago
- Implementation of 'Recordinality' cardinality estimation sketch with distinct value sampling ☆55 · Aug 20, 2013 · Updated 12 years ago
- Some Spark implementations of clustering algorithms. ☆19 · Nov 13, 2018 · Updated 7 years ago
- workflow support for reproducible deduplication and merging ☆16 · Jun 29, 2023 · Updated 2 years ago
- Parallel Weighted Random Sampling ☆21 · Dec 9, 2020 · Updated 5 years ago
- ☆11 · Nov 29, 2020 · Updated 5 years ago
- Code for Springer Book: High Performance Distributed Computing: Case Studies with Hadoop, Scalding and Spark ☆15 · Oct 6, 2017 · Updated 8 years ago
- An Akka Extension for easy integration of spark and cassandra in Akka micro services. ☆25 · Sep 25, 2014 · Updated 11 years ago
- Random Sampling over Joins Revisited Source Code ☆23 · Mar 21, 2023 · Updated 3 years ago
- native Rust implementation of Kafka protocol and api ☆14 · Jun 13, 2023 · Updated 2 years ago
- A super simple utility for testing Apache Hive scripts locally for non-Java developers. ☆73 · Feb 11, 2017 · Updated 9 years ago
- A GameBoy Emulator written in Rust, written as a learning project for both ☆10 · Jun 6, 2023 · Updated 2 years ago
- Arithmetic coding library ☆17 · Apr 15, 2026 · Updated 3 weeks ago
- Unix tee, but for Kinesis streams ☆12 · Oct 19, 2021 · Updated 4 years ago
- A research and review of techniques to provide a natural language interface to RDMS. ☆10 · Dec 8, 2017 · Updated 8 years ago
- hdfs client impl with pure rust ☆19 · Jan 9, 2024 · Updated 2 years ago
- Read SparkSQL parquet file as RDD[Protobuf] ☆93 · Oct 12, 2018 · Updated 7 years ago
- A new cardinality estimation scheme for join query estimation ☆42 · Dec 6, 2024 · Updated last year
- SBT plugin for creating and managing AWS CloudFormation stacks ☆11 · Jan 8, 2018 · Updated 8 years ago
- Adobe Experience Platform API for humans ☆36 · Updated this week
- Boilerplate project for MOTW Workshop 2015 ☆10 · Mar 3, 2016 · Updated 10 years ago
- Neural Arithmetic Logic Units (arXiv:1808.00508) ☆11 · Aug 6, 2018 · Updated 7 years ago
- Embed any webapp/website as Ambari view! ☆25 · Feb 26, 2016 · Updated 10 years ago
- demo clients ☆20 · Jul 31, 2017 · Updated 8 years ago
- 👑 Fully on-chain auto-battler game owned by the community ☆18 · Jun 28, 2024 · Updated last year
- Algorithmic Trading Pipeline for Online Betting Markets ☆19 · Dec 7, 2022 · Updated 3 years ago
- An Apache Mesos Framework that allows for replaying load over and over and over (and over) again ☆10 · Aug 10, 2015 · Updated 10 years ago
- A sample implementation of the Spark Datasource API ☆24 · Apr 15, 2017 · Updated 9 years ago
- P2P Sports Betting on the Ethereum blockchain ☆14 · Mar 14, 2025 · Updated last year
- Building blocks and patterns for building data prep transformations and feature engineering in Spark. ☆16 · Mar 16, 2016 · Updated 10 years ago
- This convolutional neural network can say if an input image is a chat screen-shot or a normal image ☆22 · Mar 20, 2021 · Updated 5 years ago
- Interactive Audience Analytics with Spark and HyperLogLog ☆55 · Oct 14, 2015 · Updated 10 years ago
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python) ☆16 · Oct 14, 2019 · Updated 6 years ago
- Using JRecord to build a mapred and mapreduce inputformat for HDFS, MAPREDUCE, PIG, HIVE, Spark, ... ☆19 · Dec 7, 2017 · Updated 8 years ago
- Rust implementation of Apache ORC ☆29 · Apr 29, 2026 · Updated last week
- Algorithmic Trading with Machine Learning ☆15 · Sep 26, 2015 · Updated 10 years ago
- ☆11 · Aug 14, 2014 · Updated 11 years ago
- datascience oriented utilities: histograms, aggregations, plots, data manipulation, and other common tasks. ☆40 · Apr 19, 2023 · Updated 3 years ago
- Scala library for Reactive streaming Microservices, CQRS, Event Sourcing, Event Logging, & message-driven apps. ☆22 · Feb 19, 2018 · Updated 8 years ago