Wittline / wbz
A parallel implementation of the bzip2 data compressor in Python. The compression pipeline uses the Burrows–Wheeler transform (BWT) and move-to-front (MTF) coding to improve Huffman compression. For now, the tool focuses only on compressing .csv files and other files in tabular format.
☆14 Updated 2 years ago
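The two transforms named in the description can be illustrated with a minimal sketch. This is not code from wbz: the function names (`bwt`, `inverse_bwt`, `mtf_encode`, `mtf_decode`) are hypothetical, the BWT uses a naive sorted-rotations approach rather than anything parallel or block-based, and the Huffman stage is omitted. It only shows why BWT followed by MTF tends to produce many small values, which a Huffman coder can then encode compactly.

```python
def bwt(data: bytes) -> tuple[bytes, int]:
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column.

    Returns the last column and the row index of the original string.
    Illustrative only; real compressors use suffix-array style algorithms.
    """
    n = len(data)
    if n == 0:
        return b"", 0
    # Sort rotation start offsets by the rotation they produce.
    rotations = sorted(range(n), key=lambda i: data[i:] + data[:i])
    last = bytes(data[(i - 1) % n] for i in rotations)
    return last, rotations.index(0)


def inverse_bwt(last: bytes, primary: int) -> bytes:
    """Invert bwt() using a stable sort of the last column."""
    n = len(last)
    if n == 0:
        return b""
    # order[j] is the row in the last column holding the first byte of row j.
    order = sorted(range(n), key=lambda i: last[i])
    out = bytearray()
    idx = primary
    for _ in range(n):
        idx = order[idx]
        out.append(last[idx])
    return bytes(out)


def mtf_encode(data: bytes) -> list[int]:
    """Move-to-front: emit each byte's position in a self-reordering alphabet."""
    alphabet = list(range(256))
    out = []
    for b in data:
        i = alphabet.index(b)
        out.append(i)
        alphabet.insert(0, alphabet.pop(i))  # move the byte to the front
    return out


def mtf_decode(indices: list[int]) -> bytes:
    """Invert mtf_encode()."""
    alphabet = list(range(256))
    out = bytearray()
    for i in indices:
        b = alphabet.pop(i)
        out.append(b)
        alphabet.insert(0, b)
    return bytes(out)


if __name__ == "__main__":
    sample = b"id,name\n1,ann\n2,bob\n"  # made-up CSV-like input
    last, primary = bwt(sample)
    codes = mtf_encode(last)
    assert inverse_bwt(mtf_decode(codes), primary) == sample
    print(last, primary, codes)
```

Runs of identical bytes in the BWT output become runs of zeros after MTF, which is what makes the later entropy-coding stage more effective.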
Related projects
Alternatives and complementary repositories for wbz
- Genomic BigData Warehousing with Apache Spark and LakeHouse Architecture ☆11 Updated last year
- ☆11 Updated 2 years ago
- Cost Efficient Data Pipelines with DuckDB ☆46 Updated 3 months ago
- A framework for simulating e-commerce data and interactions that can be used to build recommendation systems ☆10 Updated 11 months ago
- A Probabilistic Programming Language in 70 lines of Python. Code for the blog post https://mrandri19.github.io/2022/01/12/a-PPL-in-70-lin… ☆17 Updated 2 years ago
- ☆20 Updated 2 years ago
- Demo of Hydra ☆18 Updated 2 years ago
- Demo on how to use Prefect with Docker ☆26 Updated 2 years ago
- Demo on how to use Prefect 2 in an ML project ☆40 Updated 2 years ago
- Sample project that uses Dagster, dbt, DuckDB and Dash to visualize the Spanish car and motorcycle market ☆55 Updated last year
- Intro to Polars Tutorial ☆21 Updated last year
- A sample pattern for running CI tests on Modal ☆13 Updated 2 months ago
- Create a local dashboard to visualize and filter your GitHub feed ☆29 Updated 2 years ago
- Unified Distributed Execution ☆51 Updated last month
- Self-exploratory Streamlit app to learn more about Palmer penguins. ☆11 Updated last year
- Palantir Python SDK ☆34 Updated this week
- JupyterLab renderer of dagitty causal diagrams ☆20 Updated last year
- Collection of Python scripts to demonstrate asynchronous programming in Python ☆11 Updated 2 years ago
- Python library for interacting with Dask clusters in Saturn ☆12 Updated last month
- ☆18 Updated 6 months ago
- A repo of Flyte-related conference talks ☆13 Updated 8 months ago
- Projects completed under LinuxWorld Informatics Ltd. - MLOps Training. ☆12 Updated 4 years ago
- Portfolio rebalancing tool for investors ☆16 Updated 3 months ago
- Build interactive big data apps with Altair and Vega easily using Panel + VegaFusion. ☆17 Updated 2 years ago
- SDSU Data Science Symposium 2024 - Docker Workshop ☆39 Updated 9 months ago
- DuckDB SQL Tools add DuckDB support to VSCode, and provide database schema and SQL query interfaces for the popular SQLTools extension, S… ☆12 Updated 4 months ago
- Functional enrichment terms aggregator ☆17 Updated 7 months ago
- ☆12 Updated 2 years ago
- ☆13 Updated last year