databrickslabs / transpiler
SIEM-to-Spark Transpiler
☆42 Updated 6 months ago
Related projects:
- A table-format-agnostic data sharing framework ☆36 Updated 7 months ago
- IOC matching for incident responders, threat hunters, detection engineers, and security engineers. ☆14 Updated 3 months ago
- Delta reader for the Ray open-source toolkit for building ML applications ☆40 Updated 7 months ago
- ☆24 Updated 5 months ago
- Time series knowledge graphs for cybersecurity ☆18 Updated 3 months ago
- Unity Catalog UI ☆40 Updated 2 weeks ago
- Analyze Zeek IDS data with ksqlDB running on Confluent Platform via Docker on your laptop. Or spin up an arbitrary number of AWS hosts, … ☆11 Updated 2 years ago
- Java implementation for performing operations on Apache Iceberg and Hive tables ☆16 Updated 2 months ago
- ☆66 Updated 8 months ago
- Generate authentic-looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format. ☆141 Updated 2 weeks ago
- Delta Lake Website ☆24 Updated this week
- Simple project to expose a catalog over REST using a Java catalog backend ☆103 Updated this week
- Extensible Rules Engine for custom Dataframe / Dataset validation ☆134 Updated 4 months ago
- Pythonic Iceberg REST Catalog ☆60 Updated last week
- ☆31 Updated 2 years ago
- A tool to validate data, built around Apache Spark. ☆101 Updated last month
- In-Memory Analytics for Kafka using DuckDB ☆63 Updated this week
- Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary! ☆210 Updated this week
- Yet Another (Spark) ETL Framework ☆18 Updated 10 months ago
- Extensible streaming ingestion pipeline on top of Apache Spark ☆43 Updated 5 months ago
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆91 Updated this week
- LST-Bench is a framework that allows users to run benchmarks specifically designed for evaluating Log-Structured Tables (LSTs) such as De… ☆64 Updated last week
- Multi-hop declarative data pipelines ☆86 Updated last month
- Amundsen Gremlin ☆20 Updated 2 years ago
- The Internals of Spark on Kubernetes ☆71 Updated 2 years ago
- ☆104 Updated last year
- Command line tool used for generating events corpus dynamically given a specific integration ☆21 Updated 2 weeks ago
- CLI tool to bulk migrate the tables from one catalog to another without a data copy ☆51 Updated this week
- A Minimalistic Rust Implementation of Delta Sharing Server. ☆79 Updated last month
- Open, Multi-modal Catalog for Data & AI, written in Rust ☆72 Updated 2 months ago