airscholar / RealtimeStreamingEngineering

This project is a comprehensive guide to building an end-to-end data engineering pipeline with TCP/IP sockets, Apache Spark, an OpenAI LLM, Kafka, and Elasticsearch. It covers each stage: data acquisition, processing, sentiment analysis with ChatGPT, publishing to a Kafka topic, and connecting to Elasticsearch.
32 · Updated last year
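
The stages named in the description (socket source, processing, sentiment tagging, sink) can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the repository's code: `socket.socketpair()` stands in for the TCP source, a keyword lookup stands in for the OpenAI call, and a plain list stands in for the Kafka topic and Elasticsearch index. All function names and the `text` field are hypothetical.

```python
import json
import socket

# Hypothetical keyword sets; the real project calls an OpenAI model instead.
POSITIVE = {"great", "good", "love"}
NEGATIVE = {"bad", "terrible", "slow"}


def send_records(sock, records):
    """Source stage: write each record as one JSON line, then close the socket."""
    with sock.makefile("w", encoding="utf-8") as writer:
        for record in records:
            writer.write(json.dumps(record) + "\n")
    sock.close()  # closing signals end-of-stream to the reader


def read_records(sock):
    """Ingest stage: yield one parsed record per JSON line until end-of-stream."""
    try:
        with sock.makefile("r", encoding="utf-8") as reader:
            for line in reader:
                yield json.loads(line)
    finally:
        sock.close()


def classify_sentiment(text):
    """Placeholder for the LLM call: label text by simple keyword match."""
    words = set(text.lower().split())
    if words & POSITIVE:
        return "POSITIVE"
    if words & NEGATIVE:
        return "NEGATIVE"
    return "NEUTRAL"


def run_pipeline(records):
    """Wire the stages together and return the list that stands in for the sink."""
    producer_sock, consumer_sock = socket.socketpair()
    # Small inputs fit in the socket buffer, so sending before reading is safe here.
    send_records(producer_sock, records)
    sink = []  # stand-in for a Kafka topic / Elasticsearch index
    for record in read_records(consumer_sock):
        sink.append({**record, "sentiment": classify_sentiment(record["text"])})
    return sink


if __name__ == "__main__":
    sample = [
        {"id": 1, "text": "great food"},
        {"id": 2, "text": "terrible service"},
    ]
    for row in run_pipeline(sample):
        print(row)
```

In a real deployment each stand-in would be replaced by its service: Spark Structured Streaming reading from the socket, an API call for the sentiment label, and a Kafka producer feeding an Elasticsearch sink.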

Alternatives and similar repositories for RealtimeStreamingEngineering:

Users who are interested in RealtimeStreamingEngineering are comparing it to the libraries listed below.