cartershanklin/csv-to-orc

Convert a CSV file to ORCFile

☆26

Alternatives and similar repositories for csv-to-orc

Users that are interested in csv-to-orc are comparing it to the libraries listed below.

jdye64 / docker-hwx
Combination of Dockerized Hortonworks projects and other Hadoop ecosystem components
☆10 · Oct 11, 2019 · Updated 6 years ago
datapunchorg / spark-ui-reverse-proxy
This project provides a reverse proxy for Spark UI on Kubernetes
☆16 · Oct 12, 2023 · Updated 2 years ago
youngwookim / awesome-presto
A curated list of awesome PrestoDB / Trino software, libraries, tools and resources
☆18 · Jun 28, 2021 · Updated 5 years ago
tugul / CoreJava
Java 8's core concepts are explained by examples.
☆12 · Oct 12, 2018 · Updated 7 years ago
jpplayer / hdfs-auto-snapshot
HDFS Automatic Snapshot Service for Linux
☆11 · Oct 17, 2016 · Updated 9 years ago
zaratsian / HDP_Tuning_Unofficial
Collection of HDP Tuning Tricks & Tips (unofficial guide)
☆17 · Sep 26, 2017 · Updated 8 years ago
miguel10 / YARN-Memory-Calculator
Hadoop YARN & MapReduce Memory Calculator
☆13 · Nov 9, 2015 · Updated 10 years ago
UrbanOS-Public / kdp
Kubernetes deployment of PrestoDB, Hive Metastore, and Minio S3-standard object store
☆17 · Oct 20, 2022 · Updated 3 years ago
deric / kafka-manager-docker
kafka-manager in Docker container
☆19 · Dec 23, 2020 · Updated 5 years ago
tzolov / zeppelin-ambari-plugin
Apache Zeppelin Service for Apache Ambari Service. Installation and management of Zeppelin via Ambari.
☆14 · Jan 23, 2016 · Updated 10 years ago
sainib / hadoop-data-pipeline
Hadoop Data Pipeline using Falcon
☆15 · May 3, 2016 · Updated 10 years ago
oracle / spark-oracle
On-the-fly translation of Spark programs to run natively on your Oracle DB. Your Spark programs require no changes.
☆35 · Apr 15, 2025 · Updated last year
hortonworks-spark / spark-llap
☆102 · Mar 23, 2020 · Updated 6 years ago
Azure / spark-cdm
A Spark connector for the Azure Common Data Model
☆15 · May 31, 2023 · Updated 3 years ago
qubole / spark-acid
ACID Data Source for Apache Spark based on Hive ACID
☆97 · Jul 7, 2021 · Updated 4 years ago
Huawei-Spark / Backup-Repo
The released version of Astro (Spark SQL on HBase) has been moved to:
☆16 · Jul 23, 2015 · Updated 10 years ago
wushujames / kafka-connector-skeleton
A fork of the Apache Kafka "connect-file" Kafka Connect, to use as a starting point to write your own Kafka connectors.
☆37 · Feb 28, 2018 · Updated 8 years ago
odpi / specs
ODPi specifications, developed by ODPi Runtime and ODPi Operations projects. Currently in Emeritus status
☆35 · Feb 12, 2019 · Updated 7 years ago
alexjbush / ansible-hadoop-asap
Ansible playbook for automated HDP 2.x deployment install with Kerberos
☆19 · Sep 8, 2016 · Updated 9 years ago
hortonworks-spark / cloud-integration
Spark cloud integration: tests, cloud committers and more
☆20 · Jan 30, 2025 · Updated last year
pulsarIO / jetstream
Jetstream is a streaming processing framework
☆115 · Sep 16, 2015 · Updated 10 years ago
seanorama / ambari-bootstrap
Collection of tools for bootstrapping Apache Ambari & deploying clusters
☆83 · Apr 17, 2019 · Updated 7 years ago
765276707 / straws
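Straws is an open-source offline data synchronization middleware (ETL) that covers offline sync scenarios for MySQL, SQL Server, and others, and supports scheduled synchronization (full, incremental, and CDC modes) as well as data transformation and cleansing.
☆11 · Jul 31, 2022 · Updated 3 years ago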
datacontract / open-data-contract-standard-excel-template
Edit Open Data Contract Standard in Excel
☆40 · Apr 1, 2026 · Updated 3 months ago
lintool / SparkTutorial
Spark Tutorial at the University of Maryland
☆37 · Oct 24, 2014 · Updated 11 years ago
xeruf / nodal
My experiences with & wishes for task management
☆14 · Dec 31, 2023 · Updated 2 years ago
hoch / motw-2015
Boilerplate project for MOTW Workshop 2015
☆10 · Mar 3, 2016 · Updated 10 years ago
timveil / docker-hadoop
Simple functional examples of running Hadoop + Hive in Docker with Docker Compose
☆24 · Dec 25, 2022 · Updated 3 years ago
nonodename / duck_rdf
RDF file extension for DuckDB. Reads and writes supported
☆21 · Jun 26, 2026 · Updated last week
whomm / bigdata-tech-index
Big Data Technology Index
☆25 · Dec 18, 2019 · Updated 6 years ago
flokkr / docker-hadoop
Docker image for main Apache Hadoop components (Yarn/Hdfs)
☆56 · Dec 10, 2022 · Updated 3 years ago
gesellix / inject-docker-certs
Adding certificates to the Docker for Mac beta
☆28 · Nov 30, 2016 · Updated 9 years ago
alopresto / slides
Presentations and other resources.
☆36 · Jul 13, 2020 · Updated 5 years ago
maropu / hivemall-spark
A Hivemall wrapper for Spark
☆31 · Apr 21, 2016 · Updated 10 years ago
onetapbeyond / opencpu-spark-executor
Apache Spark OpenCPU Executor (ROSE)
☆25 · Jun 16, 2018 · Updated 8 years ago
SQLMCT / SQL_Performance_Tuning
☆15 · Apr 14, 2026 · Updated 2 months ago
memsql / streamliner-starter
Starter project for building MemSQL Streamliner Pipelines
☆32 · Apr 18, 2017 · Updated 9 years ago
matrixorigin / matrixorigin.io.cn
☆12 · Jun 25, 2026 · Updated last week
oranda / treelog-scalajs
Gives TreeLog a GUI, the ScalaJS ReactTreeView
☆10 · Jun 23, 2016 · Updated 10 years ago