Thoughts on things I find interesting.
☆17Dec 19, 2024Updated last year
Alternatives and similar repositories for blog
Users that are interested in blog are comparing it to the libraries listed below.
- ☆14Nov 16, 2022Updated 3 years ago
- An example of building kubernetes operator (Flink) using Abstract operator's framework☆26Jul 12, 2019Updated 6 years ago
- ☆35Dec 2, 2016Updated 9 years ago
- An Extensible Data Skipping Framework☆50Jul 15, 2025Updated 11 months ago
- A third party tool to simulate the calculation result of Flink's memory configuration. Valid for Flink-1.10 and Flink-1.11.☆45Oct 10, 2020Updated 5 years ago
- Flink native Kubernetes Operator is a java based control plane for running Apache Flink native application on Kubernetes.☆52Jul 15, 2022Updated 3 years ago
- Next-generation Cassandra Conference, September 26, 2017☆12Aug 23, 2018Updated 7 years ago
- HDFS rsync-like utility to replicate data between HDFS clusters☆17Jun 16, 2012Updated 13 years ago
- ☆16Jun 27, 2020Updated 5 years ago
- Keap is a heap data structure presenting stable PriorityQueue and stable Keapsort sorting algorithm☆16May 17, 2026Updated 3 weeks ago
- Collection of HDP Tuning Tricks & Tips (unofficial guide)☆17Sep 26, 2017Updated 8 years ago
- Spark pipelines that correspond to a series of Dataflow examples.☆27May 5, 2019Updated 7 years ago
- ☆11Oct 11, 2022Updated 3 years ago
- ☆10Apr 13, 2020Updated 6 years ago
- ☆62May 29, 2019Updated 7 years ago
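- Built on the open-source flinkStreamSQL project from DTStack (袋鼠云), this adds visualization for its real-time SQL. Over TCP/IP, the front-end page selects the database connection details and takes the SQL statements; on submit, the back end automatically starts the cluster, submits the JobGraph, and returns the results to the front-end page. This lets users work even without knowing Kafka, fl…☆10Jun 23, 2019Updated 6 years ago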
- A set of widgets for Python's Orange Machine Learning to work with Apache Spark ML☆15Dec 24, 2016Updated 9 years ago
- Massively Scalable Anomaly Detection with Apache Kafka, Cassandra and Kubernetes - final code for Instaclustr's Anomalia Machina Blog ser…☆15May 22, 2019Updated 7 years ago
- ☆14Aug 23, 2015Updated 10 years ago
- Presto Gateway routes query based on policy.☆12Sep 15, 2020Updated 5 years ago
- Optimizing downstream data processing with Amazon Kinesis Data Firehose and Amazon EMR running Apache Spark☆14Apr 14, 2023Updated 3 years ago
- Study and research of the HBase database source code (including code comments, documentation, and test cases for code analysis)☆10May 18, 2017Updated 9 years ago
- A pyspark lib to validate data quality☆19Nov 11, 2022Updated 3 years ago
- This is a mirror of https://github.com/LucaCanali/sparkMeasure - sparkMeasure is a tool for performance troubleshooting of Apache Spark w…☆16May 21, 2026Updated 3 weeks ago
- Provides a syntax translation engine from Solr to Elasticsearch that is compatible with existing Solr syntax, plus an annotation-based ORM implementation☆12Oct 8, 2015Updated 10 years ago
- API REST boilerplate using Spring Boot and Redis as database☆13Dec 26, 2018Updated 7 years ago
- Example setup of Flink cluster on Kubernetes with service discovery on Prometheus.☆16Nov 30, 2019Updated 6 years ago
- ☆11Jul 18, 2021Updated 4 years ago
- Port of TPC-DS data generator to Java☆13Aug 1, 2017Updated 8 years ago
- ☆13Sep 25, 2024Updated last year
- Due to lack of resources on how to deploy kafka with simple SASL authentication (just username and password) and how to write producer an…☆12Dec 29, 2021Updated 4 years ago
- Example to create lineage in Atlas with sqoop and spark☆14Apr 5, 2017Updated 9 years ago
- A pausable, progressive SQL engine for standalone and distributed OLTP/OLAP scenarios (for research only)☆12May 11, 2023Updated 3 years ago
- A service to manage your Cuckoo filters☆18Mar 11, 2018Updated 8 years ago
- ACL Management for Apache Spark SQL with Apache Ranger☆17Jun 18, 2020Updated 5 years ago
- Make it easier to bring TiDB into the Data Lake☆10Jan 7, 2022Updated 4 years ago
- Rocksdb state storage implementation for Structured Streaming.☆17Oct 21, 2020Updated 5 years ago
- An HBase datasource implementation for Spark and [MLSQL](http://www.mlsql.tech).☆15Sep 29, 2023Updated 2 years ago
- ☆491Oct 21, 2022Updated 3 years ago