Avro2TF is designed to fill a gap: making users' training data ready for consumption by deep learning training frameworks.
☆129 · May 9, 2020 · Updated 5 years ago
Alternatives and similar repositories for Avro2TF
Users that are interested in Avro2TF are comparing it to the libraries listed below.
- A scalable machine learning library on Apache Spark ☆797 · Aug 30, 2021 · Updated 4 years ago
- TonY is a framework to natively run deep learning frameworks on Apache Hadoop. ☆710 · Oct 14, 2023 · Updated 2 years ago
- Hadoop Yarn aggregated log parser utility ☆23 · Feb 1, 2020 · Updated 6 years ago
- Apache Spark - A unified analytics engine for large-scale data processing ☆16 · Jul 24, 2023 · Updated 2 years ago
- A sink to save Spark Structured Streaming DataFrame into Hive table ☆23 · May 7, 2018 · Updated 7 years ago
- Read and write Tensorflow TFRecord data from Apache Spark. ☆298 · Apr 22, 2024 · Updated 2 years ago
- Big Data Processing Framework - Unified Data API or SQL on Any Storage ☆252 · Jul 10, 2025 · Updated 9 months ago
- Tumor Phylogeny Reconstruction via Integrative use of Single Cell and Bulk Sequencing Data ☆11 · Jul 13, 2020 · Updated 5 years ago
- Grok Expression Transform for Kafka Connect. ☆16 · Feb 8, 2021 · Updated 5 years ago
- Oh no! Yet another Kafka operator for Kubernetes ☆19 · Updated this week
- ☆17 · Feb 16, 2020 · Updated 6 years ago
- Scala Aggregators used for ML Model metrics monitoring ☆92 · Sep 13, 2023 · Updated 2 years ago
- NameNodeAnalytics is a self-help utility for scouting and maintaining the namespace of an HDFS instance. ☆122 · Nov 25, 2025 · Updated 5 months ago
- A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap… ☆306 · Oct 30, 2025 · Updated 6 months ago
- Code and data for SciPy 2018 talk on missing data ☆21 · Jun 29, 2018 · Updated 7 years ago
- Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark ☆1,370 · Aug 22, 2023 · Updated 2 years ago
- A distributed Spark/Scala implementation of the isolation forest and extended isolation forest algorithms for unsupervised outlier detect… ☆256 · Apr 18, 2026 · Updated last week
- Short Text Similarity as described in https://dl.acm.org/citation.cfm?id=2806475 ☆17 · Feb 7, 2019 · Updated 7 years ago
- An extensible distributed system for reliable nearline data streaming at scale ☆961 · Mar 17, 2026 · Updated last month
- ☆11 · Sep 17, 2020 · Updated 5 years ago