A Spark datasource for the HadoopOffice library
☆36 · Sep 29, 2025 · Updated 8 months ago
Alternatives and similar repositories for spark-hadoopoffice-ds
Users who are interested in spark-hadoopoffice-ds are comparing it to the libraries listed below.
- HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive) ☆63 · Sep 29, 2025 · Updated 8 months ago
- A Spark datasource for the HadoopCryptoLedger library ☆13 · Sep 29, 2025 · Updated 8 months ago
- Apache Spark ETL Utilities ☆39 · Oct 23, 2024 · Updated last year
- A Spark plugin for reading and writing Excel files ☆521 · May 13, 2026 · Updated last month
- Microservices with spring-boot and Machine Learning with Apache Spark ML ☆13 · Sep 15, 2018 · Updated 7 years ago
- Spark projects. Learning book "Machine Learning with Spark" ☆10 · Jun 3, 2017 · Updated 9 years ago
- An easy way to run code on other machines using IPFS Pubsub as the message queue, AWS's Python3.7 Lambda Docker Container for execution ☆10 · May 26, 2019 · Updated 7 years ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection ☆17 · Jan 12, 2017 · Updated 9 years ago
- Repository for Complex Systems model of the Grassroots Economics Community Inclusion Currency project. ☆11 · May 15, 2026 · Updated last month
- ☆13 · Nov 10, 2022 · Updated 3 years ago
- Data warehouse project ☆10 · Mar 25, 2019 · Updated 7 years ago
- [student project] UI to run SQL on Delta Lake tables and visualize the variations of the result among tables versions ☆12 · Apr 21, 2020 · Updated 6 years ago
- ACID Data Source for Apache Spark based on Hive ACID ☆97 · Jul 7, 2021 · Updated 4 years ago
- Herd-UI is a search and discovery tool for business and technical users. Everyone in your organization can use Herd-UI to browse and unde… ☆16 · Oct 1, 2022 · Updated 3 years ago
- Collection of HDP Tuning Tricks & Tips (unofficial guide) ☆17 · Sep 26, 2017 · Updated 8 years ago
- Verify that all reachable code links and will not fail at runtime with a linkage error ☆10 · Jul 30, 2022 · Updated 3 years ago
- Examples and Quick Starts for Snowflake ☆11 · Jun 9, 2026 · Updated last week
- A Hubot script for creating quick reminders through natural language. ☆11 · Jun 29, 2017 · Updated 8 years ago
- Files and scripts for the SUSE MicroOS part ☆17 · Mar 9, 2026 · Updated 3 months ago
- Apache Spark Connector for Azure Kusto ☆81 · Updated this week
- Gobblin is a distributed big data integration framework (ingestion, replication, compliance, retention) for batch and streaming systems.… ☆11 · Jul 29, 2017 · Updated 8 years ago
- A repository of strategies that can be used to automate intra day trades in the National Stock Exchange using the KiteConnect API by Zero… ☆16 · Apr 19, 2021 · Updated 5 years ago
- ☆30 · Apr 6, 2025 · Updated last year
- Hadoop, MapReduce, HDFS, Spark, Pig, Hive, HBase, MongoDB, Cassandra, Flume - the list goes on! Over 25 technologies. ☆10 · Jan 1, 2018 · Updated 8 years ago
- Scrapy exporter for Big Data formats ☆16 · Mar 10, 2026 · Updated 3 months ago
- (Merged into apache/incubator-kyuubi) ACL Management for Apache Spark SQL with Apache Ranger. ☆58 · Nov 11, 2021 · Updated 4 years ago
- Terraform Module to create an Apache Zookeeper cluster on AWS ☆13 · Jan 3, 2022 · Updated 4 years ago
- A Spark connector for the Azure Common Data Model ☆15 · May 31, 2023 · Updated 3 years ago
- kamon netty integration ☆10 · Aug 30, 2020 · Updated 5 years ago
- Ambari service for RedHat FreeIPA ☆11 · Sep 30, 2016 · Updated 9 years ago
- Python script to repair the primary range of a node in N discrete steps ☆12 · Aug 3, 2018 · Updated 7 years ago
- List of playbooks to manage Ambari ☆13 · Oct 3, 2018 · Updated 7 years ago
- A microlibrary for reliable and persistent webhook delivery ☆19 · Oct 13, 2023 · Updated 2 years ago
- Extract data from SAP applications using Operational Data Provisioning ☆10 · Jul 19, 2023 · Updated 2 years ago
- Scripts to run VOACAP P2P prediction matrix, plot maps and store to database ☆17 · Oct 19, 2025 · Updated 7 months ago
- Sample demonstrating consuming Amazon Cognito Streams ☆10 · Jun 15, 2020 · Updated 6 years ago
- Avro Schema Shredder is a REST API that enables storage of Avro Schemas in Apache Atlas. This API enables an organization to use Apache A… ☆13 · Jan 11, 2017 · Updated 9 years ago
- Sample processing code using Spark 2.1+ and Scala ☆51 · Jun 28, 2020 · Updated 5 years ago
- Data Quality Monitoring Tool ☆15 · Dec 5, 2017 · Updated 8 years ago