ayshaysha / aws-csv-to-parquet-converter
This script downloads a CSV file from Amazon S3 using the Python library Boto3, converts it to Parquet format, and uploads the resulting Parquet file back to S3.
☆9 Updated 4 years ago
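The flow in the description (download a CSV from S3, convert it, upload the Parquet result) could look roughly like the sketch below. This is not the repository's actual code. The function names `parquet_key_for`, `csv_bytes_to_parquet_bytes`, and `convert_object` are hypothetical, and the use of `pyarrow` for the conversion is an assumption, since the description only names Boto3.

```python
import io
from pathlib import PurePosixPath


def parquet_key_for(csv_key: str) -> str:
    """Derive the output key by swapping the file suffix for .parquet (hypothetical helper)."""
    return str(PurePosixPath(csv_key).with_suffix(".parquet"))


def csv_bytes_to_parquet_bytes(csv_bytes: bytes) -> bytes:
    """Convert CSV content to Parquet content in memory (assumes pyarrow is installed)."""
    # Imported locally so the module can be loaded without pyarrow.
    import pyarrow.csv as pv
    import pyarrow.parquet as pq

    table = pv.read_csv(io.BytesIO(csv_bytes))
    buffer = io.BytesIO()
    pq.write_table(table, buffer)
    return buffer.getvalue()


def convert_object(s3, bucket: str, csv_key: str, convert=csv_bytes_to_parquet_bytes) -> str:
    """Download a CSV object, convert it, and upload the result next to the original.

    `s3` is expected to behave like a Boto3 S3 client (get_object / put_object).
    """
    body = s3.get_object(Bucket=bucket, Key=csv_key)["Body"].read()
    out_key = parquet_key_for(csv_key)
    s3.put_object(Bucket=bucket, Key=out_key, Body=convert(body))
    return out_key
```

With Boto3 installed and credentials configured, a call such as `convert_object(boto3.client("s3"), "my-bucket", "input/data.csv")` would write `input/data.parquet` to the same bucket; the bucket and key here are placeholders. Passing the client in as an argument keeps the function easy to exercise with a stub.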
Alternatives and similar repositories for aws-csv-to-parquet-converter
Users who are interested in aws-csv-to-parquet-converter are comparing it to the repositories listed below:
- Demo of DuckDB with the dashboarding tools Evidence, Streamlit, and Rill ☆16 Updated last year
- ☆10 Updated 8 months ago
- Fully unit tested utility functions for data engineering. Python 3 only. ☆16 Updated 8 months ago
- Retrieval Augmented Generation, but no servers involved. Backed by S3 ☆10 Updated last year
- dbt / Amazon Redshift Demonstration Project ☆34 Updated 2 years ago
- dbt (data build tool) projects targeting AWS analytics services (Redshift, Glue, EMR, Athena) and open table formats ☆29 Updated 2 years ago
- This construct builds some elements for you to quickly launch an EMR Serverless application. After submitting the EMR Serverless job, you… ☆11 Updated this week
- This repo contains examples of high throughput ingestion using Apache Spark and Apache Iceberg. These examples cover IoT and CDC scenario… ☆24 Updated 6 months ago
- ☆11 Updated 5 months ago
- Collection of utility scripts to extract code so it can be upgraded to Snowflake using the SnowConvert tool. ☆14 Updated last month
- ☆36 Updated last year
- ☆14 Updated 4 years ago
- An infrastructure as code approach to deploying Snowflake using Terraform ☆25 Updated last year
- Run dbt serverless in the Cloud (AWS) ☆42 Updated 5 years ago
- Serverless data lake architecture ☆12 Updated 2 years ago
- Building Event Driven Application with AWS Lambda and Amazon Redshift Data API ☆17 Updated 4 years ago
- ☆15 Updated last month
- Source code for the post, 'Getting Started with Data Analysis on AWS, using S3, Glue, Amazon Athena, and QuickSight' ☆28 Updated 4 years ago
- This repository demonstrates the construction of a state-of-the-art multimodal search engine, leveraging Amazon Titan Embeddings, Amazon … ☆31 Updated this week
- ☆12 Updated last year
- ☆20 Updated last month
- Using DuckDB with AWS Lambda to process Delta Lake data ☆26 Updated 3 months ago
- aws-solutions-library-samples / guidance-for-preparing-and-validating-records-for-entity-resolution-on-aws: This Guidance demonstrates how to prepare and validate Personally Identifiable Information (PII) data, including physical address, phone,… ☆9 Updated 6 months ago
- Operational Data Processing Framework developed using AWS Glue and Apache Hudi. This framework is suitable for Data Lake and Modern Data … ☆22 Updated last year
- This repository contains ready-to-use notebook examples for a wide variety of use cases in Amazon EMR Studio. ☆50 Updated last year
- This repository contains a series of 4 Jupyter notebooks demonstrating how AWS AI Services like Amazon Rekognition, Amazon Transcribe and… ☆11 Updated 3 years ago
- A VS Code Extension to make it easier to manage and develop Spark jobs on EMR ☆36 Updated 3 months ago
- ☆32 Updated last year
- A Python package to create a database on the platform using our moj data warehousing framework ☆21 Updated 8 months ago
- Streaming ETL job cases in AWS Glue to integrate Iceberg and create an in-place updatable data lake on Amazon S3 ☆23 Updated 8 months ago