
Arundo at Strata 2017 - Data Requirements

Alexandra discusses the four major types of data required to develop predictive models for heavy industrial equipment

 

When we at Arundo do a data science project with a company, we need four major groups of data. Each of these data sources is owned and controlled by different people in the organization. The first one is sensor data. It's hard to build a predictive model without this numerical data that's so rich and so interesting.

 

The people who own the sensor data in all of the companies we see tend to be control room engineers. Control room engineers are equipment specialists. They're very familiar with which sensors indicate the health of a piece of equipment, so they're usually the people who monitor this day to day and know how to access it.

 

Then, we have the asset manager. Asset managers are responsible for either a rig or a ship, and they understand how their asset is broken down: how many compression trains they have, how many production plants, that type of thing.

Downtime data. Downtime data tells you about the money. If you have an oil rig that had one week of unplanned downtime last year, it's usually somebody like a process engineer, somebody on the business side, a white-collar person sitting onshore, who has access to these numbers.

 

Then, maintenance engineers. You've probably heard of maintenance engineers now that preventive maintenance is becoming a really hot topic. These engineers are very familiar with failure notifications and work orders, and they use them to help plan maintenance for different rigs or ships.

All of these people collectively are the subject matter experts. They are what I would call the puzzle masters. You can't build a successful data science model in these industries without speaking to them and consulting them. They all have knowledge about how the data interrelates, but because they're so focused on different parts of the organization and because the data is siloed, the data has just never been joined into this cohesive landscape.

 

There are operational challenges that these people feel throughout the organization. The control room engineers may want to answer questions like, "How could I overlay my sensor data with failure notifications so that I can start investigating failure modes and effects?" Asset managers want to compare how their asset is behaving relative to others. They want to know, "Is my asset better or is it worse?" They want to see things like, "My pump is from Schlumberger. How does that compare with the pump from Aker Solutions on this other rig?"
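As a rough illustration of the overlay question, here is a minimal Python sketch that pairs each failure notification with the sensor readings leading up to it. Everything in it is an assumption made for illustration: the sample readings, the notification text, the 12-hour look-back window, and the helper names `readings_before` and `overlay` do not come from the talk.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta

# Hypothetical sensor readings for one piece of equipment: (timestamp, value).
# The list must be sorted by timestamp for the binary search below to work.
readings = [
    (datetime(2017, 3, 1, 0, 0), 71.0),
    (datetime(2017, 3, 1, 6, 0), 72.5),
    (datetime(2017, 3, 1, 12, 0), 78.9),
    (datetime(2017, 3, 1, 18, 0), 85.2),
    (datetime(2017, 3, 2, 0, 0), 70.4),
]

# Hypothetical failure notifications: (timestamp, description).
notifications = [
    (datetime(2017, 3, 1, 19, 0), "bearing overheating"),
]


def readings_before(readings, event_time, window):
    """Return the readings in the interval [event_time - window, event_time]."""
    times = [t for t, _ in readings]
    lo = bisect_left(times, event_time - window)
    hi = bisect_right(times, event_time)
    return readings[lo:hi]


def overlay(readings, notifications, window=timedelta(hours=12)):
    """Pair each notification with the sensor readings leading up to it."""
    return {
        (when, what): readings_before(readings, when, window)
        for when, what in notifications
    }


for (when, what), window_readings in overlay(readings, notifications).items():
    print(when, what)
    for t, value in window_readings:
        print("   ", t, value)
```

The look-back window is the main design choice here. A longer window shows more of the lead-up to a failure but also pulls in more normal operation.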

 

The process engineers want a very clear-cut view of what their highest-risk assets are and which equipment is causing this downtime. Then, the maintenance engineers do a lot of failure modes and effects analysis, and they don't necessarily need our help with that. What they would like is a diagnostic model, an app or AI tool, that helps them plan their maintenance more efficiently.
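The "highest-risk assets" view that process engineers ask for can be approximated, very roughly, by totaling unplanned downtime per piece of equipment and sorting. The sketch below assumes downtime records arrive as simple (asset, equipment, hours) tuples. That record layout, the sample values, and the function name `rank_by_downtime` are hypothetical, and a real risk ranking would likely also weigh cost and failure likelihood.

```python
from collections import defaultdict

# Hypothetical downtime records: (asset, equipment, hours of unplanned downtime).
downtime_records = [
    ("rig_a", "compressor_1", 40.0),
    ("rig_a", "pump_2", 12.5),
    ("rig_b", "compressor_1", 8.0),
    ("rig_a", "compressor_1", 128.0),
    ("rig_b", "pump_7", 30.0),
]


def rank_by_downtime(records):
    """Sum downtime hours per (asset, equipment) and sort highest first."""
    totals = defaultdict(float)
    for asset, equipment, hours in records:
        totals[(asset, equipment)] += hours
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


for (asset, equipment), hours in rank_by_downtime(downtime_records):
    print(f"{asset:6} {equipment:14} {hours:7.1f} h")
```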

 

Then you have us, the data scientists. We want to help our customers focus on the highest-value use cases. We want to be able to provide proof to executives, but before we can do that, we need to have all the data together so that we can start building those models.
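To make "all the data together" concrete, here is a minimal Python sketch that joins the four data groups into one record per piece of equipment. It rests on a big assumption that is not from the talk: that all four sources already share a common equipment tag. In siloed organizations that shared key often has to be built first. The class `EquipmentView`, the function `join_sources`, and all sample values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EquipmentView:
    """One joined record per piece of equipment (all fields are hypothetical)."""
    tag: str
    asset: str = ""                                     # from the asset breakdown
    sensor_names: list = field(default_factory=list)    # from sensor data
    downtime_hours: float = 0.0                         # from downtime data
    notifications: list = field(default_factory=list)   # from maintenance data


def join_sources(hierarchy, sensors, downtime, maintenance):
    """Join the four data groups on an assumed shared equipment tag."""
    views = {}

    def view(tag):
        # Create the record on first sight of a tag, then reuse it.
        return views.setdefault(tag, EquipmentView(tag=tag))

    for tag, asset in hierarchy.items():
        view(tag).asset = asset
    for tag, sensor_name in sensors:
        view(tag).sensor_names.append(sensor_name)
    for tag, hours in downtime:
        view(tag).downtime_hours += hours
    for tag, text in maintenance:
        view(tag).notifications.append(text)
    return views


# Hypothetical inputs, one per data owner described in the talk.
hierarchy = {"compressor_1": "rig_a", "pump_2": "rig_a"}    # asset manager
sensors = [("compressor_1", "discharge_temp"),
           ("compressor_1", "vibration"),
           ("pump_2", "flow_rate")]                         # control room engineer
downtime = [("compressor_1", 168.0)]                        # process engineer
maintenance = [("compressor_1", "bearing overheating")]     # maintenance engineer

for equipment in join_sources(hierarchy, sensors, downtime, maintenance).values():
    print(equipment)
```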