Corrosion presents a significant threat to the integrity of many facilities. The key to mitigating that threat is accurately predicting the corrosion rates of your assets. Leveraging a model that combines the computational strength of data science with the input and validation of Subject Matter Experts (SMEs) can help your facility estimate corrosion rates more accurately.
Corrosion rate estimation is typically performed by SMEs, who use historical process and equipment data, along with industry-standard tools and practices, to produce results. Although this analysis is helpful, it has limitations: each facility has a unique corrosion profile shaped by its environmental conditions, maintenance and operating practices, and other factors. In addition, SME-created corrosion models often lean to the conservative side.
However, recent technological advancements now give facilities access to an unprecedented amount of data, more than a human or even a team can analyze adequately. These large volumes of data can now be analyzed quickly and efficiently through the power of machine computation. By combining the knowledge of an SME with the analytical power of a machine, facilities can optimize suggested inspection tasks, creating efficiencies and reducing costs.
Developing the Model
The steps below demonstrate what a data-driven process can look like. We begin by cleansing the data. Once the data has been cleansed, it can be fed into the machine. After the machine has completed its analysis, SMEs review the results to verify they make sense.
1. Cleanse the Data
Before building any model, statistical tools and methods are used to cleanse the data. Data cleansing is the process of preparing data for analysis by removing or modifying records that are incorrect, incomplete, irrelevant, duplicated, or improperly formatted. This is a basic but often overlooked step that, if skipped, can skew the results and produce an inaccurate model. Data is often plagued with quality issues that lead to bad decision-making, and feeding uncleansed data to the machine creates a “garbage in, garbage out” scenario. By presenting the machine with a cleaner data set, it will be able to make better, more accurate predictions.
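A cleansing pass of this kind can be sketched in a few lines. The record fields below (`circuit_id`, `temp_f`, `rate_mpy`) and the plausibility bounds are illustrative assumptions, not values from any real dataset:

```python
# Hypothetical sketch of a data-cleansing pass over circuit records.
# Field names and the 0-500 mpy plausibility range are assumed for illustration.

def cleanse(records):
    """Drop records that are incomplete, implausible, or duplicated."""
    seen = set()
    cleaned = []
    for rec in records:
        # Incomplete: a required field is missing
        if any(rec.get(k) is None for k in ("circuit_id", "temp_f", "rate_mpy")):
            continue
        # Improperly formatted or physically implausible corrosion rate
        if not (0 <= rec["rate_mpy"] < 500):
            continue
        # Duplicated entries: keep only the first occurrence
        key = (rec["circuit_id"], rec["temp_f"], rec["rate_mpy"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned

raw = [
    {"circuit_id": "C-101", "temp_f": 450, "rate_mpy": 12.0},
    {"circuit_id": "C-101", "temp_f": 450, "rate_mpy": 12.0},  # duplicate
    {"circuit_id": "C-102", "temp_f": None, "rate_mpy": 8.0},  # incomplete
    {"circuit_id": "C-103", "temp_f": 600, "rate_mpy": 9999},  # implausible
    {"circuit_id": "C-104", "temp_f": 300, "rate_mpy": 3.5},
]
print(len(cleanse(raw)))  # 2 records survive
```

In practice this filtering would be done with statistical tooling rather than hand-written rules, but the principle is the same: only records that pass every quality check reach the model.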
2. Feed Data to the Machine
Once the data is prepared, it can be fed into the machine, training it to understand how corrosion rates manifest in the field. This is done through supervised machine learning, in which the machine learns patterns from labeled examples. For instance, the machine is given operating and design data associated with circuits, such as temperature, pressure, stream constituents, and metallurgy, along with the observed corrosion rate. As the machine is exposed to more examples like this, it learns how those pieces of data correlate with the corrosion rates observed in the field.
After we’ve fed the machine all these examples, it will have learned how and, in most cases, why the corrosion rate is presenting as it is. We can then use the model to make predictions for areas or circuits the machine has never seen before. Such a circuit may share some attributes with the training examples but differ in temperature, metallurgy configuration, or other design or operating conditions; because of what the machine has learned, it can still make reasonable predictions for this new scenario.
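The train-then-predict idea above can be illustrated with a toy model. This sketch stands in a simple nearest-neighbor lookup for whatever production model a facility would actually use; the feature values (temperature in °F, pressure in psig) and corrosion rates are made up for illustration:

```python
# Minimal supervised-learning illustration: labeled (features -> corrosion rate)
# examples, then a prediction for a circuit the model has never seen.
# A 1-nearest-neighbor lookup stands in for a real trained model.
import math

train = [
    # (temperature_f, pressure_psig) -> observed corrosion rate (mpy)
    ((450.0, 150.0), 12.0),
    ((300.0, 100.0), 3.5),
    ((500.0, 200.0), 15.0),
]

def predict(features):
    """Return the corrosion rate of the most similar training example."""
    nearest = min(train, key=lambda ex: math.dist(ex[0], features))
    return nearest[1]

# An unseen circuit with familiar attributes but different operating conditions:
print(predict((460.0, 160.0)))  # 12.0 -- closest to the 450 F / 150 psig example
```

A real model would use many more features (stream constituents, metallurgy) and a richer algorithm, but the workflow is the same: learn from labeled field observations, then generalize to circuits outside the training set.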
3. Validate with SME
Finally, once the machine has provided its estimates, they are sent back to the SME for review and validation. The SME can identify where deviations are needed and confirm that the results make sense against both unit history and industry expectations. Any updates are then fed back into the machine to help it continue to learn for future analyses. This approach marries subject matter expertise with data science, producing a solution that is better than either could offer independently.
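The feedback loop in this step can be sketched as follows. The record structure and the function name `apply_sme_review` are hypothetical, chosen only to show the mechanism: a reviewed prediction, with any SME override applied, rejoins the training set:

```python
# Sketch of the SME-validation feedback loop: reviewed predictions become
# new labeled examples for retraining. Names and values are illustrative.

training_data = [
    {"circuit_id": "C-104", "features": (300.0, 100.0), "rate_mpy": 3.5},
]

def apply_sme_review(prediction, sme_rate=None):
    """Keep the SME's override when given, otherwise accept the model's
    estimate; either way, append the reviewed example to the training set."""
    final_rate = sme_rate if sme_rate is not None else prediction["rate_mpy"]
    training_data.append({**prediction, "rate_mpy": final_rate})
    return final_rate

# The model predicted 9.0 mpy; the SME adjusts it to 11.0 based on unit history.
rate = apply_sme_review(
    {"circuit_id": "C-205", "features": (420.0, 140.0), "rate_mpy": 9.0},
    sme_rate=11.0,
)
print(rate)  # 11.0
```

Retraining on the augmented data set is what lets the machine absorb the SME's judgment rather than repeat the same miss on the next analysis.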
By combining the strengths of big data with subject matter expertise, we get the best of both worlds, with quality that exceeds what the industry is currently able to achieve.
To learn more about how data science can be leveraged in predicting damage rates, watch the presentation below, where we discuss Pinnacle’s Reformer Study. This study compares the accuracy of asset degradation rates predicted by a machine learning model to the rates predicted by human SMEs applying current industry standards. Andrew Waters, Ph.D., and Fred Addington lead the discussion about how large data sets can be used to better predict asset degradation and the challenges of making “Big Data” work for facilities.