Thoughts on Large Scale Internet of Things Information Lifecycle Architecture
The Internet of Things technological evolution is gathering momentum as intensive innovation connects billions of devices into intelligent, pervasive computing systems and networks. An important element of IoT is the capability of a system to examine the data generated from its processes and use those observations to drive business improvement. There is quite a lot of effort at present to define efficient and effective information architectures to handle the massive volumes of data to be created by the billions of low-cost sensors and connected smart mobile devices. But since IoT is still an evolving technology, there is little consensus on exactly what attributes should be considered crucial in these architectures, especially in solutions requiring massive scale and tight security.
"Consensus on the high level architecture is in progress, but the challenge in IoT lies in breaking down the architecture into components and considering its ability to push critical boundaries"
Most Agree on the High-level Elements of an IoT Architecture
Most IoT players across the spectrum seem to have converged on five elements of the Internet of Things data lifecycle:
1. Data Acquisition encompasses the hardware (the smart devices) that captures data from interactions with the environment, with other machines, and with humans or other living things, and makes it available for transmission over a network. Sensors are the nerve ends of the Internet of Things, and a growing assortment of devices is collecting data. Not all data needs to originate from sensors, however: in some cases historical data may be analyzed to provide business practice improvements.
2. Data Transport takes the acquired data from various sensors and moves it over a network for aggregation and analysis. As the "Internet" in the name implies, the final stages of data transport should flow over Internet protocols like TCP/IP, with the data transformed into Internet-friendly formats like JSON or XML, typically exposed through REST-style interfaces.
3. Data Aggregation combines the incoming data and is responsible for delivering the aggregated output to designated consumers. Data consumers can include databases, on-site services, analytics services, enterprise service buses, third-party cloud services and similar repositories.
4. Data Analysis takes the aggregated data and turns it into operational insights by applying context-specific algorithms, rules, and predictive models. Almost everyone agrees that IoT analytics should include at least basic feedback capabilities so that the predictive models get better over time.
5. Data Actions is the ability to act on the insights gleaned from the analysis of IoT data. The types of actions range from graphical representation to humans who then take manual actions, to fully autonomous systems that can take orchestrated actions to improve operations and even prevent hazards.
There is also general consensus that security, data governance, and systems management of the infrastructure must overlay all of these elements.
But the Real Challenges are in the Details
Consensus on the high level architecture is in progress, but the real challenges to successfully implementing IoT come in the details, and that requires breaking down the architecture into components and considering some real-world use cases that push the critical boundaries.
Consider a relatively simple use case: environmental monitoring of an office building. Assume that sensors are installed throughout the building to monitor temperature and carbon monoxide levels; these are the Acquire elements in the high-level architecture. Many technology protocols may be used to transport the data, including Zigbee, Bluetooth Low Energy, and Wi-Fi, and these will typically require a device gateway to translate to the wider Internet; this part of the architecture constitutes the Transport elements. From the gateways, data is communicated through an Aggregation layer to a cloud-based Analysis engine, which in turn integrates with a workflow engine that provides the Actions in response to changes in the monitored environment. If a room gets too hot, the analysis will detect it from the sensor data and initiate an action to turn up the A/C.
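The office building walk-through above can be sketched as five small functions, one per lifecycle element. This is a minimal illustration only: the `Reading` type, the room names, the sample temperatures, and the 26 °C limit are all assumptions made up for the example, not part of any particular IoT product.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable, Iterator, List


@dataclass
class Reading:
    room: str
    temperature_c: float


def acquire() -> List[Reading]:
    # Acquisition: stand-in for real sensors; the values are invented.
    return [
        Reading("101", 22.5),
        Reading("101", 23.5),
        Reading("102", 29.0),
        Reading("102", 30.0),
    ]


def transport(readings: Iterable[Reading]) -> Iterator[Reading]:
    # Transport: a real gateway would translate Zigbee or BLE frames
    # into IP traffic here; this sketch just passes readings through.
    yield from readings


def aggregate(readings: Iterable[Reading]) -> Dict[str, float]:
    # Aggregation: reduce raw readings to one average per room.
    by_room: Dict[str, List[float]] = {}
    for r in readings:
        by_room.setdefault(r.room, []).append(r.temperature_c)
    return {room: mean(values) for room, values in by_room.items()}


def analyze(averages: Dict[str, float], max_temp_c: float = 26.0) -> List[str]:
    # Analysis: a single threshold rule (the limit is hypothetical).
    return sorted(room for room, avg in averages.items() if avg > max_temp_c)


def act(hot_rooms: Iterable[str]) -> List[str]:
    # Actions: return commands that a workflow engine could execute.
    return [f"increase cooling in room {room}" for room in hot_rooms]


def run_pipeline() -> List[str]:
    return act(analyze(aggregate(transport(acquire()))))
```

Each stage only depends on the output of the previous one, which is what makes it possible to relocate individual stages between gateway and cloud.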
Multi-tiered Analysis and Aggregation
All of this is a pretty basic concept, but there are some details that need to be considered. First, consider the nature of data coming from IoT monitored systems and the desired analysis of that data. If there is a network disruption between the on-premises gateways and the Internet aggregation and analysis services, then visibility into the monitored environment will be lost. Maybe that is not such a big deal if someone’s office gets a little warm, but if the carbon monoxide sensors are detecting a problem, then the situation could be much more serious. There clearly needs to be the capability to run some level of analysis locally, most likely on the gateways, so that the solution remains resilient to faults that cut it off from the cloud or remote data center where the heavy-lifting analysis and actions are handled.
Likewise, the aggregation element needs to be distributed so that some processing can take place closer to the source in order to be decoupled from the cloud. By doing this, data transmission volume can be substantially reduced and richer data can be available to the distributed analytics capabilities described above.
For example, assume that temperature data is aggregated within IoT gateways installed on each floor of a high-rise office building. The purpose is to report and control the temperature throughout the building, and to measure the average temperature of the entire building in order to analyze the thermal efficiency of different construction methods. These gateways monitor temperature data from dozens of sensors installed on each floor, average that data, and communicate the average up to the next-tier gateway that ties all of the individual floor gateways together. If a central multi-tenant environmental monitoring solution is deployed into a cloud, then the volume of raw temperature readings from the tens of thousands of sensors would be quite large. But the multi-tiered approach to aggregation will reduce the volume to just floor averages at the first tier, building averages at the second tier, and maybe city block averages at the third tier. Specific use cases will determine how granular the required data must be for a specific solution, but the multi-tiered approach provides a lot of flexibility to filter down to just the critical data that needs to be sent up to the cloud.
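One detail worth illustrating in the tiered scheme: if floors have different numbers of sensors, averaging the floor averages does not give the true building average. A common way around this, sketched below under assumed names (`Summary`, `summarize`, `merge`) and invented readings, is for each tier to forward a running total and a count instead of a bare average.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Summary:
    """What a gateway forwards upward: two numbers instead of raw readings."""
    total: float
    count: int

    @property
    def average(self) -> float:
        # Raises ZeroDivisionError for an empty summary; a real gateway
        # would need to decide how to report "no data".
        return self.total / self.count


def summarize(readings: Iterable[float]) -> Summary:
    # First tier: a floor gateway condenses its sensors' readings.
    values = list(readings)
    return Summary(total=sum(values), count=len(values))


def merge(summaries: Iterable[Summary]) -> Summary:
    # Higher tiers: combine child summaries without seeing raw data.
    items = list(summaries)
    return Summary(
        total=sum(s.total for s in items),
        count=sum(s.count for s in items),
    )


# Invented sample data: floor 1 has two sensors, floor 2 has four.
floor_1 = summarize([20.0, 22.0])
floor_2 = summarize([24.0, 24.0, 24.0, 24.0])
building = merge([floor_1, floor_2])
```

Here the floor averages are 21.0 and 24.0, and the building average computed from the merged summary is 23.0, whereas a plain mean of the two floor averages would give 22.5. The same `merge` step could be reused at a city block tier.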
Maybe even more powerful is the capability to perform some basic analytics at each tier for close-to-the-source decisions and optimizations that are decoupled from the overall cloud solution. Simple decisions, such as closing or opening vents to distribute air more efficiently, can be made by analytics rules and models that are pushed out to the gateways. This eliminates the need to carry this micro level of decision making all the way into the cloud. It also provides a degree of resiliency, so that relatively simple decisions can be made even if connectivity to the cloud is disrupted.
However, adopting the multi-tier approach to aggregation and analysis has implications for how IoT is implemented. The industry seems to think of IoT gateways as little more than an evolution of the cheap Internet routers that we have in our homes. Supporting the multi-tier approach would require more powerful and flexible gateways that can be remotely configured for local aggregation and analysis. This implies robust manageability, governance, and security out to the gateways as well, since the sensitivity of the data tends to increase as the levels of aggregation increase. To provide maximum flexibility, the gateway should enable custom analytics and action code to be downloaded and managed from the cloud using a standardized operating model that provides for component lifecycle management.
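A rough sketch of such a gateway follows. It assumes a hypothetical `Gateway` class, a single vent rule with a temperature limit pushed from the cloud, and an uplink callable that raises `ConnectionError` when the cloud is unreachable; none of these names come from a real gateway platform.

```python
from collections import deque
from typing import Callable, Deque


class Gateway:
    """Makes vent decisions locally and forwards records when it can."""

    def __init__(self, send_to_cloud: Callable[[dict], None],
                 vent_open_above_c: float) -> None:
        self.send_to_cloud = send_to_cloud          # hypothetical uplink
        self.vent_open_above_c = vent_open_above_c  # rule pushed from the cloud
        # Bounded buffer: the oldest records are dropped if the outage is long.
        self.backlog: Deque[dict] = deque(maxlen=1000)

    def update_rule(self, vent_open_above_c: float) -> None:
        # Called when the cloud pushes a new rule to the gateway.
        self.vent_open_above_c = vent_open_above_c

    def handle(self, zone: str, temperature_c: float) -> str:
        # The decision never waits on the cloud.
        if temperature_c > self.vent_open_above_c:
            action = "open_vent"
        else:
            action = "close_vent"
        self.backlog.append(
            {"zone": zone, "temperature_c": temperature_c, "action": action}
        )
        self.flush()
        return action

    def flush(self) -> None:
        # Forward buffered records in order; stop quietly if the link is down.
        while self.backlog:
            try:
                self.send_to_cloud(self.backlog[0])
            except ConnectionError:
                return
            self.backlog.popleft()
```

The decision is returned before any upload succeeds, so a cloud outage delays reporting but not the action itself. Records are removed from the backlog only after a successful send.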
In-band vs Out-of-band Analysis of Data
The ideal data lifecycle for the Internet of Things needs to address two different data analysis models. The short horizon model, what we term the In-Band path, is required for quick analysis and actions typical of hazard avoidance and threat mitigation use cases. For instance, if data retrieved from a handful of sensors indicates a pending equipment failure in a factory, the decision to shut down that equipment needs to happen in very short order. As the lifecycle model picture below indicates, the aggregation of incoming data is common between the two paths, but the In-Band path executes analytics and drives actions based on quick analysis of the input stream of data. It does not include as inputs historical data that is likely stored in a persistent data store like a relational database. The In-Band path is all about speed and flexibility to deploy the analysis rules at any tier in the architecture; e.g. turn off the boiler if the carbon monoxide levels exceed a threshold.
The longer horizon model, the Out-of-Band path, addresses use cases where longer term trend data drives the analysis. The analysis would normally be executed against a much larger dataset from a relational database. The Out-of-Band path is ideally suited for cloud hosting since the compute resources would be more substantial and the dataset much more distributed; e.g. how much energy is the plant using on a typical summer day.
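As a minimal sketch of the In-Band idea, the rule below looks only at the reading in front of it and consults no stored history, which is why it could run at any tier. The 50 ppm limit, the sensor identifiers, and the action string format are assumptions chosen for illustration, not recommended safety values.

```python
from typing import Iterable, Iterator, Tuple

CO_LIMIT_PPM = 50.0  # hypothetical threshold, not a safety recommendation


def in_band(readings: Iterable[Tuple[str, float]],
            limit_ppm: float = CO_LIMIT_PPM) -> Iterator[str]:
    """Yield a shut-off action for each reading above the limit.

    Each reading is a (sensor_id, co_ppm) pair. The rule keeps no state
    and reads no database, so it reacts as soon as a reading arrives.
    """
    for sensor_id, co_ppm in readings:
        if co_ppm > limit_ppm:
            yield f"shut_off_boiler:{sensor_id}"
```

Because `in_band` is a generator, it can consume an unbounded stream and emit actions one at a time rather than waiting for a batch to complete.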
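By contrast, an Out-of-Band question such as the summer energy example is naturally a query over stored history. The sketch below uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for the relational store. The `energy_readings` table, its columns, the sample rows, and the choice of June through August as "summer" are all assumptions for illustration.

```python
import sqlite3


def typical_summer_day_kwh(conn: sqlite3.Connection) -> float:
    # Average of per-day totals over June-August (assumed definition
    # of "summer"; days are stored as ISO-8601 text, e.g. 2015-07-01).
    row = conn.execute(
        """
        SELECT AVG(daily_kwh) FROM (
            SELECT SUM(kwh) AS daily_kwh
            FROM energy_readings
            WHERE strftime('%m', day) IN ('06', '07', '08')
            GROUP BY day
        )
        """
    ).fetchone()
    return row[0]


# Invented sample data standing in for a long-term history.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE energy_readings (day TEXT, kwh REAL)")
conn.executemany(
    "INSERT INTO energy_readings VALUES (?, ?)",
    [
        ("2015-07-01", 100.0), ("2015-07-01", 120.0),
        ("2015-07-02", 90.0), ("2015-07-02", 110.0),
        ("2015-01-15", 300.0),  # winter row, excluded by the query
    ],
)
```

With these rows the two summer days total 220 and 200 kWh, so the function returns 210.0. Unlike the In-Band rule, this analysis needs the whole dataset to be available in one place, which is the argument for hosting it in the cloud.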
Feedback from Out-of-band to Multi-tiered In-band Analysis Lifecycle
As described above, the In-Band analysis path does a lightweight analysis of sensor data to enable quick actions. The analysis will typically be a comparison of key metrics to defined threshold values, where actions are taken when metrics are out of prescribed bounds. But the insights needed to set these thresholds well can be difficult to obtain.
This is where the feedback loop between the Out-of-Band and In-Band analysis paths comes into play. It is the long horizon Out-of-Band analysis that has the history to determine appropriate thresholds. The feedback path can then push these updated thresholds to the multi-tiered In-Band analysis, dynamically changing operating thresholds as the operational environment changes.
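The feedback loop can be sketched as follows. The text does not say how thresholds should be derived from history, so the rule used here (historical mean plus `k` standard deviations) is only one simple, assumed choice. `InBandRule`, `derive_threshold`, and `push_threshold` are hypothetical names, and the history values are invented.

```python
from statistics import mean, pstdev
from typing import Iterable, Sequence


class InBandRule:
    """A threshold check as it might run on a gateway."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def is_out_of_bounds(self, value: float) -> bool:
        return value > self.threshold


def derive_threshold(history: Sequence[float], k: float = 3.0) -> float:
    # Out-of-Band step: mean plus k population standard deviations.
    # This is an assumed heuristic, not a method prescribed by the text.
    return mean(history) + k * pstdev(history)


def push_threshold(rules: Iterable[InBandRule], threshold: float) -> None:
    # Feedback step: stand-in for distributing the new value to every
    # tier that runs the In-Band rule.
    for rule in rules:
        rule.threshold = threshold


# Invented history with mean 5.0 and population standard deviation 2.0.
history = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
rules = [InBandRule(threshold=100.0), InBandRule(threshold=100.0)]
push_threshold(rules, derive_threshold(history, k=2.0))
```

After the push, both rules use a threshold of about 9.0 instead of the initial placeholder of 100.0, so a reading of 10.0 that was previously ignored would now trigger an action.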
These are just a few examples of how today’s simplistic thinking about data flow architectures for the Internet of Things will prove insufficient as real-world solutions challenge us with complexity, scale and security demands beyond anything that we have seen in IT to this point.