Dockerizing a Python script for web scraping and consuming the scraped data using FastAPI (www.metroscubicos.com)
☆15Dec 16, 2021Updated 4 years ago
Alternatives and similar repositories for data-engineering-challenge-th
Users that are interested in data-engineering-challenge-th are comparing it to the libraries listed below
- Challenge Data Engineer☆25Jun 13, 2022Updated 3 years ago
- Build a Content-Based Movie Recommender System (TF-IDF, BM25, BERT)☆13Jun 13, 2022Updated 3 years ago
- A parallel implementation of the bzip2 data compressor in python, this data compression pipeline is using algorithms like Burrows–Wheeler…☆13Jun 29, 2022Updated 3 years ago
- Scheduling Big Data Workloads and Data Pipelines in the Cloud with pyDag☆23Sep 19, 2022Updated 3 years ago
- Downloads OCDS data and stores it on disk☆14Feb 20, 2026Updated last week
- Argentine postal code tables☆14Jul 16, 2017Updated 8 years ago
- Docker Blueprint for a GeoNode Installation☆13Jul 8, 2025Updated 7 months ago
- Migrated out of GitHub☆11Jan 10, 2021Updated 5 years ago
- A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large …☆17Feb 6, 2026Updated 3 weeks ago
- Working standards of the Datos Argentina team.☆12Apr 22, 2021Updated 4 years ago
- Docker-compose samples☆15May 10, 2024Updated last year
- This repository is a part of http://digiwhist.eu/ project. It contains source codes of a data processing system that ensures a collection…☆11Mar 27, 2023Updated 2 years ago
- R library to obtain indicators of SCLdata collections☆13Feb 3, 2026Updated last month
- Convert audio file to text☆14Jun 18, 2019Updated 6 years ago
- Doom Port for Touchscreen Kindles☆14Jul 17, 2025Updated 7 months ago
- ☆11Apr 26, 2022Updated 3 years ago
- Telegram bot to use ChatGPT with vocal commands☆17Mar 8, 2023Updated 2 years ago
- Ultra Fast Multi-Modality Vector Database☆18Feb 21, 2024Updated 2 years ago
- Leaflet+Bootstrap Template☆17Jul 25, 2018Updated 7 years ago
- Argentina Covid19 API (API for tracking the coronavirus pandemic in Argentina, consumable by applications)☆18Dec 13, 2022Updated 3 years ago
- A Flat Data GitHub Action demo repo☆20Feb 11, 2022Updated 4 years ago
- python-docx run manipulation☆21Apr 17, 2021Updated 4 years ago
- The power of GPT4All mixed with the power of pandas☆23May 5, 2023Updated 2 years ago
- Price Crawler - Tracking Price Inflation☆192Jun 23, 2020Updated 5 years ago
- Flatten/Explode JSON objects☆21Feb 5, 2026Updated 3 weeks ago
- Implement various data structures from scratch in Python 3☆22Aug 2, 2013Updated 12 years ago
- The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances that you can use quickly. Just focus on…☆28Jun 13, 2022Updated 3 years ago
- Python ETL demo for Hackforge☆32Oct 11, 2023Updated 2 years ago
- Migrate data from SQL to NoSQL easily☆42Apr 20, 2025Updated 10 months ago
- SCD Merge Wizard is an application which will help you generate T-SQL statement for merging data from two tables into one table in minute…☆44Sep 4, 2024Updated last year
- Dockerizing an Apache Spark Standalone Cluster☆42Jun 29, 2022Updated 3 years ago
- Weekly Data Engineering Newsletter☆95Jul 14, 2024Updated last year
- Tools for generating CSV and other flat versions of the structured data☆109Dec 16, 2025Updated 2 months ago
- Airflow training for the crunch conf☆105Oct 31, 2018Updated 7 years ago
- RStudio addin for formatting Rmarkdown tables☆113Oct 20, 2022Updated 3 years ago
- Documentation of the Open Contracting Data Standard (OCDS)☆149Jan 6, 2026Updated last month
- Opinionated JSON to CSV/XLSX/SQLITE/PARQUET converter. Flattens JSON fast.☆206Jun 26, 2025Updated 8 months ago
- Natural language Pandas queries and data generation powered by GPT-3☆200Apr 13, 2024Updated last year
- Fast iterative local development and testing of Apache Airflow workflows☆205Feb 21, 2026Updated last week