TeraSort for Spark and Flink which uses a range partitioner based on sampling
☆22 · Feb 5, 2016 · Updated 10 years ago
Alternatives and similar repositories for terasort
Users that are interested in terasort are comparing it to the libraries listed below.
- An AI agent to create short stories, using Gemini and Imagen for illustrations. The project is developed in Java 21 with LangChain4j, and… ☆14 · Sep 4, 2025 · Updated 8 months ago
- A set of base classes to perform training scripts for Neural Networks (by means of SNNS) and SVM (by means of SVM Light and SVM … ☆14 · Jun 24, 2011 · Updated 14 years ago
- Online store + forum built with the SSM framework ☆15 · Jun 30, 2018 · Updated 7 years ago
- Dependency and data pipeline management framework for Spark and Scala ☆15 · Apr 8, 2017 · Updated 9 years ago
- Java's NIO APIs cache direct ByteBuffers, causing a native memory leak. ☆21 · Jan 3, 2016 · Updated 10 years ago
- Source code for hands-on machine learning with Spark MLlib ☆10 · Oct 28, 2016 · Updated 9 years ago
- dllib is a distributed deep learning library running on Apache Spark ☆32 · Oct 26, 2017 · Updated 8 years ago
- Demo of the DuckDB Spark API implementation. Same PySpark code, but DuckDB under the hood ☆15 · Nov 16, 2023 · Updated 2 years ago
- Spring Boot starter for the mirai component of the simbot framework ☆12 · Jan 1, 2022 · Updated 4 years ago
- Influence Maximization Paper List ☆11 · May 11, 2022 · Updated 3 years ago
- Spark Terasort ☆122 · Apr 21, 2023 · Updated 3 years ago
- Enhanced Jenkins with Docker, Mesos and Marathon ☆10 · Jun 29, 2015 · Updated 10 years ago
- Kira is an astronomy image processing toolkit implemented with Apache Spark. ☆15 · Feb 9, 2016 · Updated 10 years ago
- Dynamic handwritten digit recognition using a model trained with a deep-learning convolutional neural network; accuracy reaches 99.64% ☆12 · Mar 23, 2020 · Updated 6 years ago
- The ISC Anomaly Detection and Classification Framework implemented for Apache Flink. ☆13 · Dec 14, 2016 · Updated 9 years ago
- ☆20 · Jun 12, 2020 · Updated 5 years ago
- From this paper: Density-based clustering for real-time stream data ☆10 · Jan 7, 2017 · Updated 9 years ago
- Basic dynamically loadable extension for HHVM ☆30 · Nov 30, 2016 · Updated 9 years ago
- Helm Chart for lyft/flinkk8soperator ☆11 · Mar 10, 2020 · Updated 6 years ago
- Using Google BERT to classify biomedical papers ☆12 · Mar 22, 2019 · Updated 7 years ago
- Java port of wolfgarbe/PruningRadixTrie ☆16 · Jun 29, 2021 · Updated 4 years ago
- ☆17 · May 25, 2015 · Updated 10 years ago
- MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms (classification, regr… ☆12 · Apr 10, 2019 · Updated 7 years ago
- Next-generation Cassandra Conference, September 26, 2017 ☆12 · Aug 23, 2018 · Updated 7 years ago
- Scripts to analyze Spark's performance ☆136 · May 20, 2018 · Updated 7 years ago
- A simple model monitoring dashboard: periodically updated user credit scores and feature distributions ☆16 · Jan 12, 2018 · Updated 8 years ago
- multi objective, single objective optimization, genetic algorithm for multi-objective optimization, particle swarm intelligence, ... impl… ☆15 · May 17, 2020 · Updated 5 years ago
- ☆111 · Jan 27, 2026 · Updated 3 months ago
- A tool for visualizing the query plan tree in PostgreSQL ☆14 · May 15, 2023 · Updated 2 years ago
- Eclipse integration for the PHP Dependency Manager Composer ☆47 · Feb 11, 2020 · Updated 6 years ago
- Custom Service for deploying Apache Alluxio on a running HDP 2.3 / IOP 4.1 Ambari Managed Cluster ☆13 · Jan 13, 2017 · Updated 9 years ago
- Real-time log event processing using Spark, Kafka & Cassandra ☆13 · Dec 4, 2014 · Updated 11 years ago
- Examples for Java Concurrency Stress (jcstress) tests with gradle integration ☆17 · Oct 8, 2017 · Updated 8 years ago
- Distributed execution for duckdb queries. ☆57 · Mar 10, 2026 · Updated last month
- Java Code for Paper: Variable-Length Particle Swarm Optimisation for Feature Selection on High-Dimensional Classification ☆10 · Jul 2, 2020 · Updated 5 years ago
- Run Samza as a Spring Boot application ☆18 · Mar 6, 2017 · Updated 9 years ago
- Temporal IMLinUCB - a solution for Online Influence Maximization problem in Temporal Networks (based on IMLinUCB) ☆17 · May 3, 2024 · Updated 2 years ago
- Multi-objective particle swarm optimization algorithm in .m ☆12 · May 9, 2020 · Updated 5 years ago
- Code for Springer Book: High Performance Distributed Computing: Case Studies with Hadoop, Scalding and Spark ☆15 · Oct 6, 2017 · Updated 8 years ago