Part 3: Use Case on Migrating Real-Time IoT Data Analytics to the Edge

Continuing from Part 2: as mentioned in that post, in this article I will explain a real-world use case and how we approached it.

Our platform was an industrial asset condition monitoring system developed using GCP IoT Core and other key GCP products. At the factory level, we had our own proprietary industrial gateways which interacted with machinery and equipment over standard industry protocols. Some pieces of equipment shared a single gateway, while others had their own dedicated gateway. The industrial gateways sent data to IoT Core, and from IoT Core the data was pushed to Pub/Sub. Pub/Sub had multiple listeners, the key ones being:

a. A Dataflow job for real-time data analytics, windowing, etc.

b. A Cloud Function to persist the data into a data warehouse.
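As an illustration, a Pub/Sub-triggered handler of this kind mostly just decodes the base64 payload and shapes a row for the warehouse. The sketch below is a simplified stand-in: the field names are hypothetical, and the actual warehouse client call is omitted.

```python
import base64
import json

def telemetry_row(event):
    """Decode a Pub/Sub event and shape a warehouse row.

    `event["data"]` carries the base64-encoded telemetry JSON, as
    delivered to Pub/Sub-triggered Cloud Functions. Field names here
    are illustrative, not the platform's actual schema.
    """
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    return {
        "device_id": payload.get("deviceId"),
        "timestamp": payload.get("timestamp"),
        # Store nested metrics as a JSON string column.
        "metrics": json.dumps(payload.get("metrics", {})),
    }

# Example: a fake Pub/Sub event as the function would receive it.
sample = {"data": base64.b64encode(json.dumps(
    {"deviceId": "gw-01", "timestamp": 1700000000,
     "metrics": {"vibration": 3.2}}).encode("utf-8"))}
row = telemetry_row(sample)
```

In the real function, the returned row would then be inserted into the warehouse via its client library.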

Currently each customer has their own instance of the platform. Though the solution was offered with an on-demand, flexible billing model, it was not a single runtime serving multiple customers.

The solution was developed a couple of years back. Since edge frameworks were not mature then, an edge-first approach was not adopted at the time. With the advent of K3s and Minikube, the edge ecosystem is much more reliable now. One factor that made our migration to the edge easy was that our gateways ran a Linux OS and supported virtualization.

As we decided to re-architect and move from real-time cloud-driven decision making to edge-driven decision making, we identified the following key high-level tracks and teams:

  • Core edge platform identification team.
  • Edge platform and administration development team.
  • AI team.
  • Data Engineering team.
  • Solution and services team.

We constituted a dedicated scrum team, with the architect overseeing and orchestrating the decision making. One primary reason was to have each team develop and recommend best practices for its own stack without being influenced by the others. The architect drove the discussion whenever concerns cut across multiple tracks.

Below are the initial priorities and key deliverables that acted as a foundation.

Core edge platform identification: Identify the base platform to use for edge capabilities. While Azure offers edge management capabilities as part of its IoT Hub, Google did not offer one. So the team was instructed to do PoCs and studies on different edge frameworks and choose the best one. After analyzing a lot of frameworks, we decided to use ioFog.

AI model: While the initial AI service ran in the cloud, with the new architecture it would be moved to the edge. Only model generation would happen in the cloud. While the AI model itself could be deployed in the cloud and reused, this team was entrusted with exploring whether, in future, multiple models could be developed and tailored to different equipment, potentially leading to higher accuracy specific to each asset type.

Data pipeline and application redesign: This was one of the toughest phases, with a lot of decision making, and not all decisions were technical. Since the edge would filter out noise and send only aggregated data, what would happen to the dashboards that streamed real-time data? We adopted the following strategies:

  • The main dashboard and alerting were tweaked to show only the alerts, notifications, decisions, and windowing output produced by the edge.
  • An application service runs at the edge. If a user still needs to access real-time data, the request is forwarded to the edge instance (since all gateways are in the same VPC, they are discoverable).
  • The edge uploads daily telemetry data as a nightly job to a Google Cloud Storage bucket. This data is used for further model development.
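To make the first strategy concrete, the edge-side windowing that the dashboard now relies on can be sketched as a simple tumbling-window aggregation. This is only an illustration: the window size and field shapes are assumptions, and in our platform this logic lived inside the pipeline and custom services rather than a standalone function.

```python
from collections import defaultdict

def tumbling_window_stats(readings, window_secs=60):
    """Aggregate (timestamp, value) readings into fixed windows.

    Returns {window_start: {"min": ..., "max": ..., "avg": ...}} so the
    edge can forward only the summary instead of every raw sample.
    """
    buckets = defaultdict(list)
    for ts, value in readings:
        # Align each reading to the start of its window.
        buckets[ts - ts % window_secs].append(value)
    return {
        start: {
            "min": min(vals),
            "max": max(vals),
            "avg": sum(vals) / len(vals),
        }
        for start, vals in buckets.items()
    }

# Three raw samples collapse into two windowed summaries.
stats = tumbling_window_stats(
    [(0, 1.0), (30, 3.0), (65, 10.0)], window_secs=60)
```

Only `stats` (plus any alerts derived from it) crosses the network, which is what lets the noisy raw stream stay local.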

Data pipeline changes: A new pipeline specific to the edge had to be developed. It involved designing everything from data acquisition to invoking the AI service, real-time control, etc.
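The stages involved can be roughly sketched as a chain of acquisition, inference, and control. The stage functions below are placeholders: in the real pipeline, acquisition talks to the gateway's protocol layer, inference calls the AI service, and control drives actuators.

```python
def acquire():
    """Placeholder for data acquisition from the gateway protocol layer."""
    return {"device_id": "gw-01", "vibration": 92.5}

def infer(reading, threshold=80.0):
    """Placeholder AI call: flag the reading if it exceeds a limit.

    The threshold here is made up; the real service ran a trained model.
    """
    return {**reading, "anomaly": reading["vibration"] > threshold}

def control(result):
    """Placeholder real-time control: decide an action at the edge."""
    return "shutdown" if result["anomaly"] else "noop"

def run_pipeline():
    # Acquisition -> inference -> control, all on the edge device.
    return control(infer(acquire()))

action = run_pipeline()
```

The point of the sketch is the shape of the flow: the decision loop closes locally, and only its outcome needs to reach the cloud.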

In the rest of this article, I will explain in detail how we implemented the pipeline changes, and the architecture and design decisions we took.

Our goal from the beginning was to adopt as many no-code options as possible. We found StreamSets a good fit for a configurable pipeline. Its desktop UI not only enabled us to design the pipelines, but also let us export a pipeline as a Docker image. Once exported, the image can be easily deployed. However, it should be noted that:

  • If decision-based control is required, e.g. an AI service is invoked as part of the pipeline and decisions must be taken based on its response, it is difficult to express.
  • Any customizations that are required need to be written in Jython.
  • Even if you manage to find a Jython resource, it is tough to integrate the customizations into CI/CD processes.

For the reasons mentioned above, we used StreamSets predominantly as an orchestration engine, and most of the logic was written in custom services which were exposed over HTTP and invoked by StreamSets. Having said that, StreamSets comes with a lot of useful features and options and can be easily used as needed. For an organization to get started, understand the landscape, and quickly see the benefits, StreamSets provides great value.
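A custom service of this kind can be as small as a single decision endpoint that StreamSets posts records to. Below is a minimal sketch using only the standard library; the threshold and field names are made up, and the real services were of course richer.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

THRESHOLD = 80.0  # hypothetical limit; real limits are asset-specific

def decide(record):
    """Pure decision logic, kept separate so it is easy to unit test."""
    return {"alert": record.get("vibration", 0.0) > THRESHOLD}

class DecisionHandler(BaseHTTPRequestHandler):
    """Endpoint that a StreamSets HTTP stage can POST a record to."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        record = json.loads(self.rfile.read(length))
        body = json.dumps(decide(record)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# To serve:
# HTTPServer(("0.0.0.0", 8080), DecisionHandler).serve_forever()
```

Keeping the decision logic in a plain function (`decide`) rather than inside the HTTP handler is what makes these services easy to cover in CI, which was exactly the gap with Jython customizations.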

As mentioned earlier, not all the data is pushed to the cloud in real time; only the inferences and key decisions are sent to the cloud. However, AI modelling needs data, and the more data the better. As a tradeoff, we saved all the telemetry data in a local database (MongoDB) with a retention time of 48 hours. A nightly job fetched the data for the previous day, wrote it into a file, and uploaded the file to Cloud Storage. With this we achieved the following:

  • Async file uploads during non-peak hours.
  • Detailed data becomes available for data modelling.
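The nightly job itself boils down to computing yesterday's time window and serializing the matching documents to a newline-delimited JSON file before upload. The sketch below shows only that window and file-format logic; the MongoDB query and the Cloud Storage upload are omitted.

```python
import json
from datetime import datetime, timedelta, timezone

def previous_day_window(now):
    """Return [start, end) UTC bounds for the previous calendar day."""
    today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return today - timedelta(days=1), today

def to_ndjson(docs):
    """Serialize documents as newline-delimited JSON, one per line."""
    return "\n".join(json.dumps(d, sort_keys=True) for d in docs)

# Job running at 03:00 UTC exports the whole previous day.
start, end = previous_day_window(
    datetime(2023, 11, 15, 3, 0, tzinfo=timezone.utc))
dump = to_ndjson([{"device_id": "gw-01", "vibration": 3.2}])
```

In the actual job, `start`/`end` bound the MongoDB query, and the resulting file is then uploaded to the storage bucket.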

We chose MongoDB because StreamSets provides out-of-the-box integration with it. However, other databases (e.g. SQLite) can also be used.

Also, while deploying the services, make sure that a private network is formed among the containers and each is visible to the others. Another option is to use `network=host` and look up services via the local IP.
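With Docker Compose, for example, this amounts to putting the containers on one user-defined network so they can reach each other by service name. The service and image names below are purely illustrative.

```yaml
# docker-compose.yml (illustrative service and image names)
services:
  streamsets:
    image: streamsets/datacollector
    networks: [edge-net]
  decision-service:      # custom HTTP service invoked by StreamSets
    image: example/decision-service
    networks: [edge-net]
  mongodb:
    image: mongo
    networks: [edge-net]

networks:
  edge-net:              # containers resolve each other by service name
    driver: bridge
```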

IoT Solution Architect