awslabs / dqdl
☆22Updated last month
Alternatives and similar repositories for dqdl
Users that are interested in dqdl are comparing it to the libraries listed below
- ☆39Updated this week
- A CLI to manage and monitor permissions in AWS Lake Formation☆25Updated 2 years ago
- Delta reader for the Ray open-source toolkit for building ML applications☆45Updated last year
- Spark Accelerator framework; it enables secondary indices to remote data stores.☆38Updated last week
- Amundsen Gremlin☆21Updated 3 years ago
- Multi-hop declarative data pipelines☆122Updated this week
- Lightweight storage for Trino views☆16Updated last week
- An Apache Hive SerDe (short for serializer/deserializer) for the Ion file format.☆31Updated 8 months ago
- A lightweight UI for Lakekeeper☆15Updated this week
- 📈 Get detailed performance metrics from your cluster independently of the Java Virtual Machine (JVM)☆46Updated last week
- Cloud Storage Connector integrates Apache Pulsar with cloud storage.☆29Updated 5 months ago
- Resilient data pipeline framework running on Apache Spark☆25Updated this week
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake.☆30Updated this week
- ☆18Updated 3 years ago
- Apache Iceberg Spark S3 examples☆20Updated last year
- ☆32Updated 2 weeks ago
- Data Catalog for Databases and Data Warehouses☆35Updated last year
- Java bindings for the Cedar language☆65Updated this week
- ☆21Updated 5 months ago
- Apiary provides modules which can be combined to create a federated cloud data lake☆37Updated last year
- Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs …☆159Updated 3 years ago
- A VS Code Extension to make it easier to manage and develop Spark jobs on EMR☆39Updated 9 months ago
- A one-afternoon implementation of Redis-like primitives with S3 Express☆33Updated last year
- 🗃 Automate periodic data operations, such as deleting indices at a certain age or performing a rollover at a certain size☆70Updated last week
- Dione - a Spark and HDFS indexing library☆52Updated last month
- A testing framework for Trino☆26Updated 8 months ago
- Examples for using Apache Flink® with DataStream API, Table API, Flink SQL and connectors such as MySQL, JDBC, CDC, Kafka.☆65Updated 2 years ago
- A tool to benchmark L (loading) workloads within ETL workloads☆29Updated this week
- Continuously synchronize directories from remote object store to local filesystem☆108Updated 2 weeks ago
- Paper: A Zero-rename committer for object stores☆20Updated last month