koeninger / spark-citus
Integrate Apache Spark with Citus distributed Postgres
☆17Updated 6 years ago
Alternatives and similar repositories for spark-citus
Users that are interested in spark-citus are comparing it to the libraries listed below
- INACTIVE: A PostgreSQL extension to produce messages to Apache Kafka.☆112Updated 10 years ago
- Kafka foreign database wrapper for PostgreSQL☆111Updated 2 months ago
- PostgreSQL foreign data wrapper for HDFS☆140Updated 3 months ago
- INACTIVE: A PostgreSQL logical decoder output plugin to deliver data as Protocol Buffers☆126Updated 3 years ago
- something to help you spark☆64Updated 7 years ago
- Cantor provides utilities for estimating the cardinality of large sets.☆84Updated 3 years ago
- Use cases built on SnappyData. Use cases contained here: 1. Ad Analytics 2. Streaming data ingestion from RabbitMQ.☆32Updated 3 years ago
- functionstest☆33Updated 9 years ago
- TopN is an open source PostgreSQL extension that returns the top values in a database according to some criteria☆247Updated last year
- Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌☆29Updated 5 years ago
- ☆21Updated 2 years ago
- Use Cascading Taps and Scalding DSL with Spark☆49Updated 9 years ago
- Time series analysis with Apache Spark based on Chronix☆38Updated 8 years ago
- This document attempts to capture useful patterns and warn about subtle gotchas when it comes to designing and evolving schemas for long-…☆13Updated 8 years ago
- Kafka sink connector for streaming messages to PostgreSQL☆92Updated 5 years ago
- A Kafka-Connect Sink for S3 with no Hadoop dependencies.☆57Updated 2 years ago
- Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor…☆42Updated 3 years ago
- PostgreSQL extension providing approximate algorithms based on apache/datasketches-cpp☆90Updated 6 months ago
- Starter project for building MemSQL Streamliner Pipelines☆32Updated 8 years ago
- Cascading on Apache Flink®☆54Updated last year
- Experiments with the GDELT dataset and Cassandra schemas.☆25Updated 9 years ago
- Quark is a data virtualization engine over analytic databases.☆100Updated 8 years ago
- Messing with PostgreSQL network traffic to make some useful things☆94Updated 4 years ago
- ☆162Updated 3 weeks ago
- Tools for working with parquet, impala, and hive☆134Updated 5 years ago
- Apache Spark AWS Lambda Executor (SAMBA)☆44Updated 7 years ago
- A small library of hive UDFS using Macros to process and manipulate complex types☆15Updated 3 months ago
- ☆24Updated 4 years ago
- Schedoscope is a scheduling framework for painfree agile development, testing, (re)loading, and monitoring of your datahub, lake, or what…☆96Updated 6 years ago
- Space-Filling Curves in Scala☆26Updated 5 years ago