Splittable Gzip codec for Hadoop
☆77Apr 14, 2026Updated this week
Alternatives and similar repositories for splittablegzip
Users that are interested in splittablegzip are comparing it to the libraries listed below.
- Task Metrics Explorer☆14Apr 2, 2019Updated 7 years ago
- Nested Data (JSON/AVRO/XML) Parsing and Flattening in Spark☆16Jan 22, 2024Updated 2 years ago
- Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive☆191Oct 15, 2025Updated 6 months ago
- Advanced fold methods for Kotlin☆12Apr 1, 2026Updated 2 weeks ago
- Atomix Jepsen tests☆14Feb 7, 2017Updated 9 years ago
- A core library for reading, transforming, filtering, and writing data records☆15Jan 17, 2026Updated 2 months ago
- ☆35Updated this week
- Code to collect and analyze traceroute data within a network topology☆28Nov 20, 2018Updated 7 years ago
- Protobuf definitions for the Liftbridge gRPC API. https://github.com/liftbridge-io/liftbridge☆15Dec 22, 2025Updated 3 months ago
- memo & blog☆17Feb 8, 2015Updated 11 years ago
- default visualizations that come packaged with the lightning viz notebook☆12Apr 18, 2016Updated 9 years ago
- Python Vector Search tutorial generated using gpt4☆12Mar 18, 2023Updated 3 years ago
- Lucene based indexing in Cassandra☆61May 3, 2016Updated 9 years ago
- ServiceFramework example project☆10Apr 2, 2016Updated 10 years ago
- Generate SQL from Graphic Walker visualization DSL☆13Feb 23, 2024Updated 2 years ago
- CDAP Cube Dataset Guide☆12Aug 26, 2017Updated 8 years ago
- This repository is to help with the Partner Demonstration of the Apache Atlas project.☆30Oct 29, 2015Updated 10 years ago
- Paper: A Zero-rename committer for object stores☆20Nov 7, 2025Updated 5 months ago
- Refactored version of code.google.com/hadoop-gpl-compression for hadoop 0.20☆549Apr 24, 2024Updated last year
- Snowflake Data Source for Apache Spark.☆230Updated this week
- Cascading and Scalding wrapper for HBase with advanced read features☆54Feb 11, 2020Updated 6 years ago
- Spark + Jupyter + Hive☆16Sep 22, 2015Updated 10 years ago
- ☆12Mar 12, 2021Updated 5 years ago
- A library on top of either pex or conda-pack to make your Python code easily available on a cluster☆46Feb 4, 2026Updated 2 months ago
- HDFS compatible Distributed Filesystem backed by Cassandra☆25Sep 17, 2015Updated 10 years ago
- Map your python dataclasses to pyspark types☆10Feb 11, 2024Updated 2 years ago
- A generic ETL framework with Spark_SQL for transforming data by constructing pipelines with Yaml/Json/Xml☆20Feb 3, 2026Updated 2 months ago
- High performance native memory access for Java.☆129Apr 6, 2026Updated last week
- ☆14Dec 18, 2025Updated 3 months ago
- API and libraries for generating travelsheds from OSM & GTFS data☆40Jul 14, 2018Updated 7 years ago
- ## Auto-archived due to inactivity. ## Simple JVM Profiler Using StatsD and Other Metrics Backends☆15Oct 3, 2023Updated 2 years ago
- ☆14May 8, 2024Updated last year
- Examples of Prompt Engineering, Zero Shot Learning, Few Shot Learning and Retrieval Augmented Generation (RAG) using Hugging Face, Databr…☆16Sep 21, 2023Updated 2 years ago
- A pluggable actor system written in java leveraging modern features from JDK21+☆37Updated this week
- Run TPCH Benchmark on Apache Kylin☆22Jan 24, 2022Updated 4 years ago
- Kylo integration with PDND (previously DAF).☆19Nov 16, 2022Updated 3 years ago
- Hadoop Cluster with security☆13Nov 21, 2021Updated 4 years ago
- Port of Twitter's Scala JVM-profiler to Java☆15Sep 28, 2022Updated 3 years ago
- Apache Ambari Infra is a sub project of Apache Ambari.☆22Mar 11, 2025Updated last year