Thoughts on things I find interesting.
☆17Dec 19, 2024Updated last year
Alternatives and similar repositories for blog
Users that are interested in blog are comparing it to the libraries listed below.
- ☆14Nov 16, 2022Updated 3 years ago
- A JMM Cookbook for Java Developers (as opposed to a cookbook for Compiler Writers)☆12Jun 13, 2014Updated 11 years ago
- An example of building a Kubernetes operator (Flink) using Abstract operator's framework☆26Jul 12, 2019Updated 6 years ago
- ☆35Dec 2, 2016Updated 9 years ago
- An Extensible Data Skipping Framework☆48Jul 15, 2025Updated 9 months ago
- A third party tool to simulate the calculation result of Flink's memory configuration. Valid for Flink-1.10 and Flink-1.11.☆45Oct 10, 2020Updated 5 years ago
- Flink native Kubernetes Operator is a Java-based control plane for running Apache Flink native applications on Kubernetes.☆52Jul 15, 2022Updated 3 years ago
- Next-generation Cassandra Conference, September 26, 2017☆12Aug 23, 2018Updated 7 years ago
- HDFS rsync-like utility to replicate data between HDFS clusters☆17Jun 16, 2012Updated 13 years ago
- Library which aims to generate Kubernetes YAML templates from an Airflow DAG using the Airflow Kubernetes Pod Operator☆10May 6, 2021Updated 5 years ago
- Keap is a heap data structure presenting stable PriorityQueue and stable Keapsort sorting algorithm☆14Jan 30, 2024Updated 2 years ago
- Collection of HDP Tuning Tricks & Tips (unofficial guide)☆17Sep 26, 2017Updated 8 years ago
- ☆11Oct 11, 2022Updated 3 years ago
- A Gentle introduction to Machine Learning with Apache Spark☆11Mar 2, 2026Updated 2 months ago
- ☆62May 29, 2019Updated 6 years ago
- ☆20Mar 9, 2026Updated last month
- A set of widgets for Python's Orange Machine Learning to work with Apache Spark ML☆15Dec 24, 2016Updated 9 years ago
- Massively Scalable Anomaly Detection with Apache Kafka, Cassandra and Kubernetes - final code for Instaclustr's Anomalia Machina Blog ser…☆15May 22, 2019Updated 6 years ago
- ☆12Oct 16, 2023Updated 2 years ago
- Optimizing downstream data processing with Amazon Kinesis Data Firehose and Amazon EMR running Apache Spark☆14Apr 14, 2023Updated 3 years ago
- Study and research of the HBase database source code (including code comments, documentation, and test cases for code analysis)☆10May 18, 2017Updated 8 years ago
- A pyspark lib to validate data quality☆19Nov 11, 2022Updated 3 years ago
- This is a mirror of https://github.com/LucaCanali/sparkMeasure - sparkMeasure is a tool for performance troubleshooting of Apache Spark w…☆16Oct 3, 2025Updated 7 months ago
- Provides a Solr-to-Elasticsearch syntax translation engine, compatible with existing Solr syntax, along with an annotation-based ORM implementation☆12Oct 8, 2015Updated 10 years ago
- Example setup of Flink cluster on Kubernetes with service discovery on Prometheus.☆16Nov 30, 2019Updated 6 years ago
- ☆11Jul 18, 2021Updated 4 years ago
- API REST boilerplate using Spring Boot and Redis as database☆13Dec 26, 2018Updated 7 years ago
- Port of TPC-DS data generator to Java☆13Aug 1, 2017Updated 8 years ago
- Source code written for and explained in the recorded Spark video course https://edu.hellobi.com/course/107/overview☆13Apr 23, 2019Updated 7 years ago
- Due to lack of resources on how to deploy kafka with simple SASL authentication (just username and password) and how to write producer an…☆12Dec 29, 2021Updated 4 years ago
- A big data computing platform driven by Flink SQL. In particular, it provides SQL programming.☆21Jan 5, 2023Updated 3 years ago
- Example to create lineage in Atlas with sqoop and spark☆14Apr 5, 2017Updated 9 years ago
- A pausable, progressive SQL engine for standalone and distributed OLTP/OLAP scenarios (for research only)☆12May 11, 2023Updated 2 years ago
- PostgreSQL Lance Table Extension☆25Dec 27, 2025Updated 4 months ago
- Tools for faster and optimized interaction with Teradata and large datasets.☆17Jul 11, 2018Updated 7 years ago
- A service to manage your Cuckoo filters☆18Mar 11, 2018Updated 8 years ago
- Make it easier to bring TiDB into the Data Lake☆10Jan 7, 2022Updated 4 years ago
- RocksDB state storage implementation for Structured Streaming.☆17Oct 21, 2020Updated 5 years ago
- An HBase datasource implementation for Spark and [MLSQL](http://www.mlsql.tech).☆15Sep 29, 2023Updated 2 years ago