Parquet-based ML data format optimized for working with unstructured data
☆141Jan 5, 2023Updated 3 years ago
Alternatives and similar repositories for rikai
Users who are interested in rikai are comparing it to the libraries listed below.
- Processing videos on Apache Spark☆13Feb 14, 2022Updated 4 years ago
- Liga: Let Data Dance with ML Models☆13Sep 12, 2023Updated 2 years ago
- JupyterLab extensions developed by Tubi including nteract data explorer, shareable link and deep copy/cut/paste☆19Jan 5, 2023Updated 3 years ago
- Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data ve…☆6,653Updated this week
- 📙 Notebooks Academy: Write Production-Ready Code From Jupyter.☆13Jan 5, 2023Updated 3 years ago
- Amundsen Gremlin☆22Aug 26, 2022Updated 3 years ago
- ☆21Apr 21, 2023Updated 3 years ago
- Yet another task management flow.☆14May 17, 2019Updated 7 years ago
- DataFuse operator manages fuse-query and fuse-store clusters atop Kubernetes using CRDs.☆13Jul 4, 2022Updated 3 years ago
- Demo repository to lambda-fy your dbt runs☆11Sep 7, 2023Updated 2 years ago
- Jupyter notebooks containing time series analysis demos☆18Mar 30, 2026Updated 2 months ago
- On top of SemanticUI, this Scala.js project provides components defined in Ant Design with Binding.scala☆15Jan 1, 2019Updated 7 years ago
- Cache server :)☆32Sep 5, 2023Updated 2 years ago
- Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.☆16May 22, 2026Updated 3 weeks ago
- Automated Jupyter notebook testing. 📙☆41Jan 25, 2024Updated 2 years ago
- write WeApp with scalajs☆19Dec 31, 2018Updated 7 years ago
- A hyper-optimized single-node (local) version of the Spark SQL engine, whose fundamental data structure is the Scala Iterator rather than the RDD.☆13Jun 13, 2023Updated 3 years ago
- This library is an ongoing effort towards bringing the data exchanging ability between Java/Scala and Python. PyJava introduces Apache A…☆49Apr 21, 2023Updated 3 years ago
- Tantivy directory implementation backed by object_store☆40Jan 22, 2024Updated 2 years ago
- ☆10May 24, 2022Updated 4 years ago
- ☆10Nov 11, 2019Updated 6 years ago
- Helpers for setting up an embedded Python interpreter☆20Oct 31, 2025Updated 7 months ago
- A slab allocator with stable references☆15Jan 23, 2023Updated 3 years ago
- Serve a 1x1 GIF pixel from an AWS lambda-powered endpoint☆13Sep 7, 2017Updated 8 years ago
- ☁️ Export Ploomber pipelines to Kubernetes (Argo), Airflow, AWS Batch, SLURM, and Kubeflow.☆46Mar 10, 2025Updated last year
- QueryScript is a strongly typed programming language that builds on SQL. It provides modern tooling, modularity, and integrations with po…☆39Jul 3, 2023Updated 2 years ago
- Golang driver for databend cloud☆21May 9, 2026Updated last month
- ☆20Jul 17, 2023Updated 2 years ago
- AWS Blog post code for running feature-extraction on images using AWS Batch and Cloud Development Kit (CDK).☆20Oct 28, 2022Updated 3 years ago
- An embeddable graph database for large-scale vertices and edges☆75Apr 16, 2023Updated 3 years ago
- Plugin to accelerate Spark SQL with the NEC Vector Engine.☆19Aug 15, 2022Updated 3 years ago
- xreq and xdiff tool to call or diff complicated API easily☆104Jul 18, 2025Updated 11 months ago
- Anki Overdrive API for Python☆12Oct 21, 2017Updated 8 years ago
- ☆22Mar 31, 2022Updated 4 years ago
- Spark SQL Macros provides a mechanism similar to Spark User-Defined function registration; with the key enhancement being that custom cod…☆16Mar 17, 2021Updated 5 years ago
- Unleash the performance potential of your Parquet files.☆53Feb 24, 2026Updated 3 months ago
- Run Github Actions workflows locally or on a custom backend☆17Apr 16, 2026Updated 2 months ago
- ☆12Oct 25, 2023Updated 2 years ago
- Docker image that builds a patched Apache Spark with AWS Glue support as metastore☆18Jun 8, 2024Updated 2 years ago