Building a Real-Time Dashboard for Streaming Data with Python

In today's data-driven world, the ability to process and visualize streaming data in real-time has become increasingly valuable. This tutorial will guide you through creating a professional real-time dashboard using Python, demonstrating how to capture, process, and visualize streaming data as it arrives.

Introduction to Real-Time Data Processing

Unlike traditional batch processing where data is collected and analyzed periodically, real-time data processing allows you to analyze and visualize information as it's generated. This capability is essential for:

  • Monitoring IoT sensors and devices
  • Tracking financial market movements
  • Analyzing user behavior on websites and applications
  • Monitoring system performance metrics
  • Detecting anomalies and responding to events instantly

Architecture Overview

Our real-time dashboard system consists of three main components:

  1. Data Producer: Simulates IoT sensor data and sends it to a message broker
  2. Message Broker: Handles data streaming between components (using Kafka)
  3. Dashboard Application: Consumes, processes, and visualizes the streaming data

Here's a simple view of how data flows through the system:

Data Producer → Kafka (Message Broker) → Streamlit Dashboard

Core Technologies

Let's explore the key technologies we'll use:

Confluent Kafka

Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, and data integration. We'll use the confluent-kafka Python client, which offers several advantages:

  • High throughput and low latency for real-time applications
  • Fault tolerance and durability for reliable data processing
  • Scalability to handle growing data volumes
  • Strong ecosystem integration with many data tools

Why confluent-kafka instead of kafka-python?

The confluent-kafka package is a Python wrapper around the high-performance C library librdkafka. It provides better performance, reliability, and fewer dependency issues compared to the pure Python implementation in kafka-python.

Streamlit

Streamlit is an open-source Python library that makes it easy to create custom web applications for data science and machine learning. Key benefits include:

  • Create interactive dashboards with minimal code
  • Native support for Python data libraries (Pandas, NumPy, etc.)
  • Built-in components for visualization and user interaction
  • No frontend web development experience required
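
To get a feel for how little code this takes, here is a tiny standalone example (not part of the dashboard we build later; the file name hello_app.py is just for illustration):

# hello_app.py - minimal Streamlit example
import pandas as pd
import streamlit as st

st.title("Hello, Streamlit")

# A small static dataset, just to demonstrate a built-in chart
df = pd.DataFrame({"value": [10, 20, 15, 30]})
st.line_chart(df)

Save it and run streamlit run hello_app.py to open it in your browser.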

Plotly

Plotly is a powerful graphing library that creates interactive, publication-quality visualizations. We'll use it because:

  • It provides interactive charts that users can zoom, pan, and hover over
  • Supports a wide variety of chart types (line charts, scatter plots, gauges, etc.)
  • Works seamlessly with Streamlit
  • Offers high customization capabilities
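
As a quick standalone illustration (separate from the dashboard code), here is a Plotly line chart and a gauge indicator, the same chart type we will later use for CPU and memory usage; the values are made up:

import plotly.express as px
import plotly.graph_objects as go

# Interactive line chart from a small inline dataset
line_fig = px.line(x=[1, 2, 3, 4], y=[10, 12, 9, 14], title="Sample line chart")
line_fig.show()

# Gauge indicator of the kind used for CPU/memory usage
gauge_fig = go.Figure(go.Indicator(
    mode="gauge+number",
    value=56.7,
    title={"text": "CPU Usage (%)"},
    gauge={"axis": {"range": [0, 100]}},
))
gauge_fig.show()

Each call to show() opens an interactive figure in your browser.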

Setting Up Your Development Environment

Before we begin coding, let's set up our environment:

# Create and activate a virtual environment
python -m venv dashboard-env

# Activate it (on Windows: dashboard-env\Scripts\activate)
source dashboard-env/bin/activate

# Install required packages
pip install streamlit pandas plotly confluent-kafka

Part 1: Creating the Data Producer

Our first component is the data producer. In a real-world scenario, this might be IoT devices, user activity trackers, or financial data feeds. For this tutorial, we'll create a simulator that generates realistic sensor data.

Create a file named sensor_producer.py:
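
Below is a minimal sketch of what sensor_producer.py might look like. It assumes a local broker at localhost:9092 and the sensor_data topic created in Part 2; the field names match the sample output shown in Part 4, while the value ranges are arbitrary simulation choices:

import json
import random
import time
from datetime import datetime

from confluent_kafka import Producer

# Producer connected to the local Kafka broker
producer = Producer({"bootstrap.servers": "localhost:9092"})
TOPIC = "sensor_data"

def delivery_report(err, msg):
    # Called once per message to confirm delivery (or report a failure)
    if err is not None:
        print(f"Delivery failed: {err}")
    else:
        print(f"Message delivered to {msg.topic()} [Partition: {msg.partition()}]")

def generate_reading():
    # Simulate one sensor reading; the ranges are illustrative only
    return {
        "timestamp": datetime.now().isoformat(),
        "temperature": round(random.uniform(18, 28), 1),
        "humidity": round(random.uniform(30, 60), 1),
        "pressure": round(random.uniform(990, 1020), 1),
        "cpu_usage": round(random.uniform(10, 90), 1),
        "memory_usage": round(random.uniform(20, 80), 1),
    }

if __name__ == "__main__":
    while True:
        reading = generate_reading()
        print(f"Produced: {reading}")
        producer.produce(TOPIC, value=json.dumps(reading), callback=delivery_report)
        producer.poll(0)  # serve delivery callbacks
        time.sleep(1)     # one reading per second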


1.1 Dashboard Application (dashboard.py)

Now, let's create the dashboard that will visualize our data:

Create a file named dashboard.py:
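
Below is a minimal sketch of how dashboard.py could be structured. It assumes the same broker address, topic, and field names as the producer; the consumer group.id is an arbitrary choice, and the redraw loop (an st.empty placeholder refreshed as messages arrive) is one common pattern that may need adjusting for your Streamlit version:

import json

import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import streamlit as st
from confluent_kafka import Consumer

# Consumer configuration: local broker, illustrative group id, read new messages only
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "dashboard-consumer",
    "auto.offset.reset": "latest",
})
consumer.subscribe(["sensor_data"])

st.title("Real-Time Sensor Dashboard")
placeholder = st.empty()  # a single container we redraw as new data arrives
rows = []                 # in-memory buffer of recent readings

while True:
    msg = consumer.poll(1.0)  # wait up to one second for a new message
    if msg is None or msg.error():
        continue

    rows.append(json.loads(msg.value().decode("utf-8")))
    df = pd.DataFrame(rows[-200:])  # keep only the most recent readings

    with placeholder.container():
        # Line chart of temperature over time
        st.plotly_chart(
            px.line(df, x="timestamp", y="temperature", title="Temperature"),
            use_container_width=True,
        )

        # Gauge showing the latest CPU usage
        gauge = go.Figure(go.Indicator(
            mode="gauge+number",
            value=df["cpu_usage"].iloc[-1],
            title={"text": "CPU Usage (%)"},
            gauge={"axis": {"range": [0, 100]}},
        ))
        st.plotly_chart(gauge, use_container_width=True)

        # Current value compared with the running average
        st.metric(
            "Temperature (°C)",
            f"{df['temperature'].iloc[-1]:.1f}",
            delta=f"{df['temperature'].iloc[-1] - df['temperature'].mean():.1f} vs avg",
        )

The full dashboard also adds a gauge for memory usage and statistics for the other sensor fields; those follow the same pattern as the chart, gauge, and metric shown here.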

Part 2: Setting Up Kafka (Message Broker)

For this tutorial, we will assume you have Apache Kafka running locally. If not, you can follow the steps below to install and run it.

Installing Apache Kafka with Docker is a great way to get started, especially if you're new to Kafka: Docker handles the setup so you don't have to manage complex configuration yourself. Below is a step-by-step guide to installing Kafka using Docker.

Prerequisites

  • Docker: Install Docker from the official website
  • Docker Compose: Docker Compose is usually included with Docker Desktop. If not, you can install it separately
  • Start Docker Desktop and ensure it is running

2.1 Create a docker-compose.yml file

Docker Compose allows you to define and run multi-container Docker applications. We'll use it to set up Kafka and its dependencies (like Zookeeper).

On Unix

1. Create a new directory for your Kafka setup:

mkdir kafka-docker

cd kafka-docker

2. Create a docker-compose.yml file in this directory:

touch docker-compose.yml

3. Open the docker-compose.yml file in a text editor and paste the following configuration:


services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000

  kafka:
    image: confluentinc/cp-kafka:latest
    container_name: kafka
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    depends_on:
      - zookeeper

This configuration sets up two services:

  • Zookeeper: Kafka uses Zookeeper for managing cluster metadata.
  • Kafka: The Kafka broker itself.
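
Before starting anything, you can optionally check that the file parses correctly; docker-compose will print the resolved configuration or report any YAML errors:

docker-compose config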

On Windows

Step 1: Create the docker-compose.yml file

Option 1: Using Notepad

1. Open Notepad (or any text editor like Notepad++, VS Code, etc.).

2. Copy and paste the docker-compose.yml content into the editor:

services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000

  kafka:
    image: confluentinc/cp-kafka:latest
    container_name: kafka
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    depends_on:
      - zookeeper


3. Save the file as docker-compose.yml:

  • In Notepad, click File > Save As.
  • In the "Save as type" dropdown, select All Files.
  • Name the file docker-compose.yml (make sure it doesn’t save as docker-compose.yml.txt).
  • Save it in a directory of your choice (e.g., C:\kafka-docker).

Option 2: Using Command Prompt or PowerShell

1. Open Command Prompt or PowerShell.

2. Navigate to the directory where you want to create the file, for example:

cd C:\kafka-docker

3. Use the echo commands below to create the file. This syntax is for Command Prompt; in PowerShell, echo. is not valid, so it is easier to paste the configuration into a text editor as in Option 1:

echo services: > docker-compose.yml
echo. >> docker-compose.yml
echo   zookeeper: >> docker-compose.yml
echo     image: confluentinc/cp-zookeeper:latest >> docker-compose.yml
echo     container_name: zookeeper >> docker-compose.yml
echo     ports: >> docker-compose.yml
echo       - "2181:2181" >> docker-compose.yml
echo     environment: >> docker-compose.yml
echo       ZOOKEEPER_CLIENT_PORT: 2181 >> docker-compose.yml
echo       ZOOKEEPER_TICK_TIME: 2000 >> docker-compose.yml
echo. >> docker-compose.yml
echo   kafka: >> docker-compose.yml
echo     image: confluentinc/cp-kafka:latest >> docker-compose.yml
echo     container_name: kafka >> docker-compose.yml
echo     ports: >> docker-compose.yml
echo       - "9092:9092" >> docker-compose.yml
echo     environment: >> docker-compose.yml
echo       KAFKA_BROKER_ID: 1 >> docker-compose.yml
echo       KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181 >> docker-compose.yml
echo       KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092 >> docker-compose.yml
echo       KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1 >> docker-compose.yml
echo     depends_on: >> docker-compose.yml
echo       - zookeeper >> docker-compose.yml


This creates the docker-compose.yml file with the same configuration shown in the Unix section above.

3. Start Kafka and Zookeeper

3.1 Step 1: Start Docker Desktop

  • Open Docker Desktop on your Windows machine.
  • Wait for Docker to fully start (you’ll see the Docker whale icon in the system tray)
  • Open Command Prompt or PowerShell
  • Navigate to the directory where your docker-compose.yml file is located:

Step 2: Run Docker Compose


docker-compose up -d


This command starts both Zookeeper and Kafka in the background. You should see output indicating that the containers are being created and started.

Docker will:

  1. Download the required images (if not already downloaded).
  2. Start the containers for Zookeeper and Kafka.

Step 3: Verify the Containers

  • Check if the containers are running:


docker ps


You should see two containers listed: zookeeper and kafka.

3.2 Step 4: Create a Kafka Topic

In a regular PowerShell window, run this command to create the Kafka topic that our application will use:

docker exec kafka kafka-topics --create --topic sensor_data --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1

This command creates a topic named sensor_data that our producer will write to and our dashboard will read from.
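
To confirm the topic was created, you can list the broker's topics with the standard --list flag:

docker exec kafka kafka-topics --list --bootstrap-server localhost:9092

You should see sensor_data in the output.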

Alternative method to create Kafka topic 

Kafka stores messages in topics. As an example, let's create a topic called test-topic (for this tutorial's application, you would create sensor_data instead, as shown above).

Step 1: Open a Shell Inside the Kafka Container

  • Open Command Prompt or PowerShell
  • Run the following command to open a shell inside the Kafka container:

docker exec -it kafka /bin/bash

This will give you a terminal inside the Kafka container.

Step 2: Create a Kafka Topic

  • Inside the Kafka container shell, run the following command to create a topic called test-topic:

kafka-topics --create --topic test-topic --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1

This creates a Kafka topic with 1 partition and a replication factor of 1 (suitable for local development).

  • Verify that the topic was created:

kafka-topics --list --bootstrap-server localhost:9092

You should see test-topic listed.

4. Running the Application

Now that we have our code ready and our infrastructure running, let's start our application!

4.1 Start the Producer

Open a new PowerShell window, navigate to the directory containing sensor_producer.py, and run:


python sensor_producer.py


You should see output indicating that data is being generated and sent to Kafka:

Produced: {'timestamp': '2025-03-14T10:42:15.123456', 'temperature': 22.7, 'humidity': 45.3, 'pressure': 1003.2, 'cpu_usage': 56.7, 'memory_usage': 42.1}
Message delivered to sensor_data [Partition: 0]

4.2 Start the Dashboard

Open another PowerShell window and run:


streamlit run dashboard.py

Your browser should automatically open and display the dashboard. You should see:

  • Real-time charts updating with the latest sensor data
  • Gauge visualizations showing current CPU and memory usage
  • Statistics comparing current values to averages

The dashboard will continuously update as new data flows from the producer through Kafka.

5. Understanding What's Happening

Now that everything is running, here's what's happening in the system:

  1. Data Production: The sensor_producer.py script generates simulated sensor data every second
  2. Message Publishing: Each data point is published to the Kafka topic 'sensor_data'
  3. Message Storage: Kafka stores these messages and makes them available to consumers
  4. Data Consumption: The dashboard.py application consumes the messages from Kafka
  5. Visualization: The dashboard processes the data and updates the visualizations in real-time

6. Shutting Down

When you're done, you can shut everything down:

  1. Stop the producer and dashboard by pressing Ctrl+C in their respective windows
  2. To stop the Kafka and Zookeeper containers, run the following command in a Command Prompt or PowerShell window (from the directory where your docker-compose.yml file is located):  


docker-compose down


  3. If you want to remove the containers and their data (volumes), use:


docker-compose down -v


Conclusion

Congratulations! You've successfully built and run a complete real-time data visualization system. This architecture can be adapted to many real-world scenarios where streaming data needs to be processed and visualized in real-time.

