Learn what anomaly detection is, how it may apply to your business, and the key concepts and technical constraints you should be aware of.
As an increasing number of industrial companies gain experience with advanced analytics, the term "anomaly detection" is becoming more common in discussions across a range of engineering and operating roles. People in a variety of jobs, ranging from maintenance and reliability to production and process management, are starting to hear the term with increasing frequency.
This article describes what anomaly detection is, how it may apply to your business, and the key concepts and technical constraints of which you should be aware as you start working with anomaly detection systems.
WHAT ARE ANOMALIES?
In the data science world, an "anomaly" refers to an unexpected or abnormal event. In an industrial operations context, such events could involve incidents such as:
- Sudden dips or spikes in one or more sensor values that are unusual for “expected” operating conditions.
- Unusual images in a video feed, such as leaks, wildlife, or unexpected people.
- In almost any set of time series data, a sudden deviation that’s outside of the range you might expect given normal conditions.
Anomalies are particularly important to understand in industrial operations. Such operations, and their component systems, processes, and individual pieces of equipment, are typically engineered to operate within specified tolerances. In general, engineers and operators have a very good understanding of how things are supposed to work. An anomaly may therefore be an early sign that something is going wrong.
WHY SHOULD YOU CARE ABOUT ANOMALIES?
Most complex operations are well-instrumented, with a set of threshold-based rules and alerts built into a longstanding control process. However, alarm fatigue is a real issue in many industrial environments. Moreover, threshold-based analysis of individual sensors becomes less useful as the number of sensors increases, because human operators are unable to quickly understand patterns across a large number of sensors.
In addition, some important values are difficult or cost-prohibitive to measure or monitor directly. Examples include extremely high temperatures, or conditions across large fleets of similar equipment, such as pump systems, that lack consistent instrumentation. In such cases, data-driven interpolation, estimation, and/or combination with simulation or other physics-based models may be necessary to even find a value of interest, let alone observe potential anomalies.
WHAT ARE SOME STRATEGIES FOR DETECTING ANOMALIES?
Though engineers, operators, and executives rarely choose the specific techniques used in anomaly detection systems, it’s helpful to understand some of the strategies that data scientists might use, as the implementation of any given strategy may have underlying costs or other business ramifications. Three strategies are most commonly used in anomaly detection:
- Distance-based detection – measuring the deviation of certain values from the path of normal values. For example, suppose the truck engine coolant temperature you have been monitoring has averaged 85 degrees Celsius over the past 3 hours. You expect the next reading to fall near 85, within a tolerance based on the interquartile range. Any coolant temperature outside that tolerance window may indicate a change of condition in the engine.
- Cluster-based techniques – measuring the deviation of certain signals from a cluster of related signals at points in time. For example, if temperature and power are spiking while pressure and flow remain within normal ranges, something unexpected may be occurring.
- Advanced techniques – using a deep neural network to "segment" the time-series data. Time windows of sensor data are fed to the network as inputs, and a prediction is made at each timestamp inside the window to identify any anomaly. For example, a gasoil plant has 19 sensors monitoring different equipment at different locations. You divide the data into time windows of 3 hours and feed them to a type of segmentation network called a U-Net. The network outputs a window of the same length as the input window, with a value at each timestamp that classifies it as anomalous or not.
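As a rough illustration of the distance-based strategy, the coolant temperature example can be sketched in a few lines of Python. This is a minimal sketch, not a production method: the readings are made up, the helper names `iqr_bounds` and `is_anomalous` are hypothetical, and the multiplier `k=1.5` is a common rule of thumb rather than a value from any real system.

```python
from statistics import quantiles


def iqr_bounds(history, k=1.5):
    """Return (low, high) tolerance bounds derived from recent readings."""
    q1, _, q3 = quantiles(history, n=4)  # quartiles of the recent window
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_anomalous(history, new_value, k=1.5):
    """Flag a new reading that falls outside the IQR-based tolerance window."""
    low, high = iqr_bounds(history, k)
    return new_value < low or new_value > high


# Hypothetical coolant temperatures (degrees Celsius) from the past 3 hours.
recent_temps = [84.2, 85.1, 85.6, 84.8, 85.3, 84.9, 85.0, 85.4, 84.7, 85.2]

print(is_anomalous(recent_temps, 85.5))  # False: inside the tolerance window
print(is_anomalous(recent_temps, 97.0))  # True: well outside the window
```

In practice the window length and the multiplier would be tuned with subject matter experts, since a tolerance that is too tight simply recreates the alarm fatigue problem.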
Distance-based and cluster-based algorithms are some of the easier algorithms to build and can be run in an unsupervised way, meaning they don't necessarily require any labeled data. These algorithms are simple but can be very effective if done correctly with sufficient subject matter knowledge. Such approaches may make sense for you if the time or cost of labeling data is outside of your budget or immediate resource availability and the timeline for model development is relatively tight.
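The cluster-based idea can be sketched in a similarly unsupervised way. The example below is a deliberate simplification that treats normal operation as a single cluster: it learns the center and spread of four related signals from unlabeled baseline data, then measures how far a new reading sits from that cluster. The sensor values, the function names, and the cutoff rule (1.5 times the largest distance seen in the baseline) are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def fit_normal_cluster(baseline):
    """Learn the per-sensor mean and standard deviation from normal readings."""
    columns = list(zip(*baseline))
    return [mean(c) for c in columns], [stdev(c) for c in columns]


def distance_from_cluster(reading, means, stds):
    """Euclidean distance from the cluster center, in standard-deviation units."""
    return sqrt(sum(((x - m) / s) ** 2 for x, m, s in zip(reading, means, stds)))


# Hypothetical normal readings: (temperature, power, pressure, flow)
baseline = [
    (70.1, 150.0, 5.0, 30.2),
    (69.8, 149.2, 5.1, 29.9),
    (70.4, 151.1, 4.9, 30.0),
    (70.0, 150.5, 5.0, 30.3),
    (69.7, 148.9, 5.1, 29.8),
    (70.3, 150.8, 4.9, 30.1),
]

means, stds = fit_normal_cluster(baseline)

# Assumed cutoff: the largest distance seen in normal data, plus 50% headroom.
threshold = 1.5 * max(distance_from_cluster(r, means, stds) for r in baseline)

normal_reading = (70.2, 150.3, 5.0, 30.0)
spiking_reading = (78.5, 171.0, 5.0, 30.1)  # temperature and power spike together

print(distance_from_cluster(normal_reading, means, stds) > threshold)   # False
print(distance_from_cluster(spiking_reading, means, stds) > threshold)  # True
```

A real implementation would more likely use several clusters to represent different operating modes, but the principle is the same: no labels are needed, only a reasonable sample of normal behavior.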
Neural network models are more difficult to build and require far more computational resources. Most neural networks are supervised and require labeled data to guide the model training process. The advantage of a neural network is that it requires little domain knowledge about the problem and has the potential to achieve very high accuracy; the tradeoff is less insight into, and control over, the model.
Models evolve over the course of a data science project. Unsupervised learning algorithms are usually deployed in the early stages to establish a baseline performance. As more data is acquired and labeled over time, supervised learning can be tried to improve on that baseline. The implication is that more advanced approaches may provide more accuracy, but may also require a large amount of well-understood historical data to get started. Such approaches may make sense when the cost of false positives (i.e., identifying an anomaly that isn’t actually an anomaly) is relatively high and high-quality datasets are relatively easy to access.
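To make the labeled-data requirement concrete, the sketch below shows only the data preparation step for a supervised segmentation model, not the network itself. It splits a multi-sensor time series into fixed-length windows and pairs each window with a per-timestamp label mask of the same length, which is the shape of target such a network would be trained to reproduce. The one-reading-per-minute sampling rate, the function name `make_windows`, and the synthetic data are assumptions for illustration.

```python
def make_windows(readings, labels, window_len):
    """Split a series into non-overlapping (window, label mask) pairs.

    readings: one list of sensor values per timestamp
    labels:   one 0/1 flag per timestamp (1 = labeled anomalous)
    Trailing timestamps that do not fill a whole window are dropped.
    """
    windows = []
    for start in range(0, len(readings) - window_len + 1, window_len):
        end = start + window_len
        windows.append((readings[start:end], labels[start:end]))
    return windows


# Assumed: 19 sensors sampled once per minute, so 3 hours = 180 timestamps.
N_SENSORS = 19
WINDOW_LEN = 180

# Six hours of synthetic placeholder data with one labeled anomaly.
n_timestamps = 6 * 60
readings = [[0.0] * N_SENSORS for _ in range(n_timestamps)]
labels = [0] * n_timestamps
labels[200:215] = [1] * 15  # hypothetical 15-minute labeled anomaly

windows = make_windows(readings, labels, WINDOW_LEN)

print(len(windows))         # 2 windows of 3 hours each
print(sum(windows[0][1]))   # 0 anomalous timestamps in the first window
print(sum(windows[1][1]))   # 15 anomalous timestamps in the second window
```

Every `1` in the label mask has to come from someone who reviewed the history and marked the event, which is where much of the cost of the supervised approach lies.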
Each of these anomaly detection approaches has different pros and cons. However, for engineers, operators, and executives, the main question is constant: what will you do differently given the results of this analysis? It doesn’t matter how fast, accurate, or interesting an anomaly detection system may be if there isn’t a clear business response that has measurable financial or strategic impact.
Anomaly detection is a term gaining currency in industrial operations. As you start implementing or improving such systems in your business, it’s helpful to understand the limits, opportunities, benefits, and challenges of such an approach.
ABOUT THE AUTHORS
Amitav Misra is General Manager Americas at Arundo Analytics. He has twenty years of industrial and technology experience from principal investment, management consulting, and functional and general management roles. He joined Arundo in June 2017.
Jason Hu is currently an associate data scientist at Arundo Analytics. At Arundo, Jason mostly focuses on using computer vision techniques and time-series analysis to solve industrial challenges. Some of the projects he has done include predicting emission levels of a biomass plant, failure prediction of heavy equipment, and digitization of industrial diagrams. Jason has a BS degree in Petroleum Engineering and an MS degree in Energy Resources Engineering.