Select Page

Leo Tolstoy begins Anna Karenina with the following: “Happy families are all alike; every unhappy family is unhappy in its own way.” The so-called Anne Karenina principle has been applied to business, personal relationships and even statistical modeling. The underlying thesis is that systems that work perfectly resemble each other, but each instance of failure is unique.

As a data scientist at Presenso, I will explain how the Anna Karenina principle applies to Big Data and its relevance for Industrial IoT Predictive Maintenance.

Industrial machinery is comprised of multiple interconnecting components with embedded sensors.  We use the data that is generated from these sensors to monitor systems and predict when failure is likely to occur.  Machines that work without fault are the “happy families.”  They seem alike and show no signs of breakdown.

However, when the family is “unhappy” or the machine breaks down, each failure is unique and difficult to predict.

Let’s apply this to the world of Big Data.

Within data science, there are a number of approaches to Machine Learning.  With Supervised Machine Learning, the data is labeled with examples of failure and is trained to identify new incidents of failure based on these labels. In our analogy, the model is trained to recognize a new unhappy family because it resembles other unhappy families.

To apply this to Predictive Asset Maintenance, if the Machine Learning algorithm can recognize common failure patterns, then industrial plants can be warned of potential failure before it occurs.

Herein lies the problem.  Because each “unhappy family” is different from each other, there are not enough examples to train the algorithm for each type of failure. As a result, the Supervised Machine Learning algorithm lacks sufficient data to be trained and cannot detect an evolving failure.

An alternative that we use at Presenso is based on Anomaly Detection.  We do not assume that each machine failure will be similar.  Instead, we are looking for anomalous patterns.  To use the Anna Karenina principle, we are looking for behavior patterns that are unusual as a way to predict evolving asset failure.

A traditional assumption within machinery asset maintenance is that past patterns of failure can be used to predict future incidents.  The famed philosopher George Santayana stated that “those who cannot remember the past are condemned to repeat it.”   Plant technicians monitor SCADA data and are alerted as to whether sensors have breached manually-set control thresholds.  Why is this done?  Based on past experience, if these thresholds are breached, it is indicative of a fault in a machine or system.

The problem with monitoring SCADA data is that by the time a threshold has been breached it is already too late.  It is easy to identify the “unhappy family” in front of a family court, but at this point, remediation is an unlikely occurrence.

In conclusion, I do not wish to imply that the Anna Karenina Principle provides a perfect analogy for the use of historical data for failure prediction.  However, at a high level, it is a useful way of framing the problematic use of Supervised Machine Learning methodologies for downtime prediction.  If we only look for what we can recognize (the unhappy family based on simplistic criteria) then we miss what we cannot yet see: early signs of evolving degradation and breakdown.