Parquet-based ML data format optimized for working with unstructured data
☆141Jan 5, 2023Updated 3 years ago
Alternatives and similar repositories for rikai
Users that are interested in rikai are comparing it to the libraries listed below
- Liga: Let Data Dance with ML Models☆13Sep 12, 2023Updated 2 years ago
- Processing videos on Apache Spark☆12Feb 14, 2022Updated 4 years ago
- Helm charts for databend☆19Aug 1, 2025Updated 7 months ago
- Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data ve…☆6,123Mar 2, 2026Updated last week
- Cache server :)☆32Sep 5, 2023Updated 2 years ago
- Flink dynamic CEP demo☆20Mar 22, 2022Updated 3 years ago
- Amundsen Gremlin☆22Aug 26, 2022Updated 3 years ago
- On top of SemanticUI, this Scala.js project provides components defined in Ant Design with Binding.scala☆15Jan 1, 2019Updated 7 years ago
- Write WeApp with Scala.js☆19Dec 31, 2018Updated 7 years ago
- Golang driver for databend cloud☆21Jan 9, 2026Updated 2 months ago
- ☆21Apr 21, 2023Updated 2 years ago
- Spark* plug-in for accelerating Spark* SQL performance by using cache and index at SQL data source layer.☆37Jan 3, 2023Updated 3 years ago
- Demo repository to lambda-fy your dbt runs☆11Sep 7, 2023Updated 2 years ago
- docker scripts to build and run a minimal version of TDengine☆10Jul 17, 2019Updated 6 years ago
- Quick & Dirty cli to process mysql dumps☆10Sep 30, 2022Updated 3 years ago
- Client libraries of end users of Apache Kyuubi☆11Jan 10, 2023Updated 3 years ago
- Python and Scala APIs for enhanced Spark analytics☆12Mar 15, 2017Updated 8 years ago
- This library is an ongoing effort towards bringing the data exchanging ability between Java/Scala and Python. PyJava introduces Apache A…☆49Apr 21, 2023Updated 2 years ago
- ☆12Mar 12, 2021Updated 4 years ago
- TPCH benchmark tool for databend☆11Nov 15, 2022Updated 3 years ago
- ☆10Nov 11, 2019Updated 6 years ago
- Clink is a library that provides APIs and infrastructure to facilitate the development of parallelizable feature engineering operators th…☆30Feb 21, 2022Updated 4 years ago
- 📙 Notebooks Academy: Write Production-Ready Code From Jupyter.☆13Jan 5, 2023Updated 3 years ago
- The Databend plugin for dbt (data build tool)☆12Mar 17, 2023Updated 2 years ago
- SQL parsing and execution; can execute Hive, Spark, and Flink SQL, as well as the corresponding algorithm SQL for TensorFlow and Deeplearning4j☆11Sep 16, 2022Updated 3 years ago
- Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.☆17Jan 4, 2026Updated 2 months ago
- Run Github Actions workflows locally or on a custom backend☆17Mar 17, 2025Updated 11 months ago
- A slab allocator with stable references☆15Jan 23, 2023Updated 3 years ago
- Distributed SQL-based real-time streaming computation framework on Apache Storm and Spark☆12Mar 14, 2016Updated 9 years ago
- Showing the relationship between ImageNet ID and labels and pytorch pre-trained model output ID and labels☆10Oct 11, 2020Updated 5 years ago
- A hyper-optimized single-node (local) version of the Spark SQL engine, whose fundamental data structure is the Scala Iterator rather than the RDD.☆13Jun 13, 2023Updated 2 years ago
- Implementation of S3-FIFO cache algorithm☆16Aug 30, 2023Updated 2 years ago
- ☆18Apr 12, 2025Updated 10 months ago
- Examples of using SparklingPandas and Pandas with PySpark☆16Aug 6, 2015Updated 10 years ago
- Helpers for setting up an embedded Python interpreter☆19Oct 31, 2025Updated 4 months ago
- Pushdown cache for DataFusion☆387Updated this week
- Plugin to accelerate Spark SQL with the NEC Vector Engine.☆19Aug 15, 2022Updated 3 years ago
- Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.☆257Feb 21, 2023Updated 3 years ago
- Data self exporting and monitoring platform based on Hive data warehouse. https://hc.smartloli.org☆36Jul 28, 2017Updated 8 years ago