What Is a Failure Modes and Effects Analysis?

Failure Modes and Effects Analysis (FMEA) is a step-by-step qualitative method used to identify the functional requirements of an asset, system, or unit; the functional failures that prevent those requirements from being met; the equipment failure modes (what happened); the failure mechanisms (what caused it to happen); and the potential effects of each failure. Performing an FMEA provides a sound basis for developing mitigation plans that address potential failures before they occur. These plans focus on the highest-consequence events, improving reliability and performance, increasing safety, and ensuring regulatory compliance while achieving business goals.

An FMEA is a foundational tool in programs such as Reliability Centered Maintenance (RCM), because the importance and risk of each asset are considered when developing an optimal maintenance and monitoring program. The risk of an asset is determined by multiplying the probability of failure by the consequence of failure. An FMEA considers the consequence of failure to determine an asset’s criticality; when further risk analysis is needed, a Failure Modes, Effects, and Criticality Analysis (FMECA) may be performed. While similar to an FMEA, an FMECA adds a level of risk assessment that better defines an asset’s criticality score by incorporating both the probability and the consequence of failure.
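
As a worked illustration of the risk relationship described above, the sketch below multiplies a probability of failure by a consequence of failure and ranks a few assets by the result. The asset names, the annual probabilities, and the dollar consequences are all invented for illustration; a real analysis would use the facility's own scales.

```python
# Illustrative only: asset names and numbers are hypothetical.
assets = {
    "pump_P101": {"pof": 0.10, "cof": 50_000},        # pof: assumed annual probability of failure
    "exchanger_E201": {"pof": 0.02, "cof": 400_000},  # cof: assumed consequence in dollars
    "valve_V305": {"pof": 0.25, "cof": 4_000},
}


def risk(pof: float, cof: float) -> float:
    """Risk = probability of failure x consequence of failure."""
    return pof * cof


# Rank assets from highest to lowest risk.
ranked = sorted(assets, key=lambda name: risk(**assets[name]), reverse=True)
for name in ranked:
    print(f"{name}: risk = {risk(**assets[name]):,.0f}")
```

In this made-up data set the exchanger ranks first even though it has the lowest probability of failure, because its consequence dominates the product.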

A risk assessment prioritizes maintenance planning by clarifying which assets must be maintained to sustain safe and effective operations, which assets should be maintained for cost reasons, and which assets require no proactive maintenance. The same premise is used in Risk-Based Inspection (RBI) programs, which focus on fixed equipment and piping systems.
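
The three-way split described above can be sketched as a small decision function. The decision rule here is an assumption made for illustration (a safety flag, then a comparison of annual risk cost against annual proactive maintenance cost); actual programs typically rely on a facility-specific risk matrix, and the names `Strategy` and `select_strategy` are hypothetical.

```python
from enum import Enum


class Strategy(Enum):
    MAINTAIN_FOR_SAFETY = "maintain to sustain safe and effective operations"
    MAINTAIN_FOR_COST = "maintain for cost reasons"
    NO_PROACTIVE_MAINTENANCE = "no proactive maintenance"


def select_strategy(safety_critical: bool,
                    annual_risk_cost: float,
                    annual_maintenance_cost: float) -> Strategy:
    """Assumed rule: safety first, then compare risk cost with maintenance cost."""
    if safety_critical:
        return Strategy.MAINTAIN_FOR_SAFETY
    if annual_risk_cost > annual_maintenance_cost:
        # Proactive maintenance costs less than the risk it addresses.
        return Strategy.MAINTAIN_FOR_COST
    return Strategy.NO_PROACTIVE_MAINTENANCE


# Hypothetical inputs.
print(select_strategy(True, 1_000, 5_000).value)
print(select_strategy(False, 8_000, 2_000).value)
print(select_strategy(False, 500, 2_000).value)
```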


How Is a Failure Modes and Effects Analysis Conducted?

An FMEA is performed by identifying:

  • Required operational expectations of the equipment

  • Functional failures and the failure modes (operational or functional) for each asset

  • Potential effects at each level (local, system, unit, facility) if the failure mode were to occur

  • Contributing failure mechanisms

  • Risk (Probability x Consequence) of each failure mode (qualitative risk)

  • Incipient conditions (indications by which a failure or its causes can be detected)

  • Recommended actions and the decisions on their application
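
The items in the list above map naturally onto one row of an FMEA worksheet. The following is a minimal sketch of such a row, assuming a 1-to-5 qualitative ranking for both probability and consequence; the class name `FmeaRow`, the field names, the ranking scale, and the example pump data are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FmeaRow:
    """One worksheet row: a single failure mode of a single asset."""
    asset: str
    function: str                # required operational expectation
    functional_failure: str      # how the function is lost
    failure_mode: str            # what happened
    effects: Dict[str, str]      # keyed by level: local, system, unit, facility
    mechanisms: List[str]        # what caused it to happen
    probability_rank: int        # assumed scale: 1 (rare) to 5 (frequent)
    consequence_rank: int        # assumed scale: 1 (minor) to 5 (severe)
    incipient_conditions: List[str] = field(default_factory=list)
    recommended_actions: List[str] = field(default_factory=list)

    @property
    def risk_score(self) -> int:
        """Qualitative risk = probability rank x consequence rank."""
        return self.probability_rank * self.consequence_rank


# Hypothetical example row; every value below is invented.
row = FmeaRow(
    asset="Cooling water pump",
    function="Deliver cooling water at the required flow",
    functional_failure="Delivers less than the required flow",
    failure_mode="Impeller plugged",
    effects={
        "local": "Reduced pump discharge flow",
        "system": "Reduced cooling capacity",
        "unit": "Throughput reduced",
        "facility": "No effect",
    },
    mechanisms=["Solids accumulation"],
    probability_rank=3,
    consequence_rank=4,
    incipient_conditions=["Falling discharge pressure"],
    recommended_actions=["Trend discharge pressure", "Clean suction strainer"],
)
print(row.asset, "|", row.failure_mode, "| risk score:", row.risk_score)
```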

Several software tools can manage an FMEA. Identical equipment can have different failure modes depending on the specific installation. For example, some equipment may be prone to plugging in certain services but not in others. Assets must be analyzed individually to determine the failure effects, which depend entirely on the system in which the asset is installed.
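
One way to reflect this in a tool is to key failure modes by both equipment type and service rather than by equipment type alone. The sketch below assumes a simple in-memory lookup; the equipment types, services, and failure modes listed are invented examples, and `failure_modes_for` is a hypothetical helper.

```python
from typing import Dict, List, Tuple

# Hypothetical catalogue: the same equipment type has different
# failure modes depending on the service it is installed in.
FAILURE_MODES_BY_SERVICE: Dict[Tuple[str, str], List[str]] = {
    ("control_valve", "clean_water"): ["fails to open", "external leak"],
    ("control_valve", "slurry"): ["fails to open", "external leak", "plugged"],
}


def failure_modes_for(equipment_type: str, service: str) -> List[str]:
    """Return the failure modes analyzed for this equipment in this service."""
    key = (equipment_type, service)
    if key not in FAILURE_MODES_BY_SERVICE:
        # No entry means this installation has not been analyzed yet;
        # do not silently reuse another service's failure modes.
        raise LookupError(f"No FMEA entry for {equipment_type} in {service}")
    return FAILURE_MODES_BY_SERVICE[key]


print(failure_modes_for("control_valve", "slurry"))
```

Raising an error for an unanalyzed installation, instead of falling back to a default list, mirrors the point that each asset must be analyzed in its own system context.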

What Is the Next Evolution of Reliability Analysis?

FMEA is an important tool for assessing and prioritizing risk and driving maintenance planning.

With major advancements in data acquisition, warehousing, modeling, and analytics, we now have the opportunity to take the next leap in reliability analysis and build upon the FMEA.

This leap is being made possible through Quantitative Reliability Optimization (QRO). QRO is an approach to reliability modeling that connects every relevant reliability data point at a complex facility to one integrated model, allowing for complex decision making that enables users to do things such as:

  • Optimize all maintenance spend in near real time based on short-, mid-, and long-term reliability targets.

  • Understand the economic value of every inspection or maintenance activity performed.

  • Understand the economic value of every piece of data currently being gathered or possibly collected in the future.

  • Model scenarios in near real time, including the implications of moving a turnaround, feedstock pricing changes, or various capital projects.

  • Drive effective economic decisions in the event of reliability-based operating excursions, such as the change in the probability of failure when a piece of equipment is operated outside of its reliability operating window.

There are some key differences between QRO and FMEA. While QRO also analyzes functions, failure modes, failure mechanisms, and potential tasks, it uniquely ties the asset’s actual condition data to the identified failure modes to focus on the actual point in the life cycle of the asset. By focusing on this point, QRO can determine the current probability of failure and select which activities to perform to increase or maintain reliability by reducing the probability of failure. In addition, QRO focuses only on the data and tasks that will impact the reliability of the plant.
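
The article does not describe how QRO computes the current probability of failure. Purely as a stand-in to show what "probability of failure at the current point in the life cycle" can mean, the sketch below assumes a two-parameter Weibull life model and computes the probability of failing within a coming interval, given survival to the current age. The Weibull choice, the parameter values, and the function names are assumptions, and the sketch uses age alone rather than condition data.

```python
import math


def weibull_reliability(age: float, shape: float, scale: float) -> float:
    """Probability of surviving to `age` under an assumed Weibull model."""
    return math.exp(-((age / scale) ** shape))


def conditional_pof(age: float, horizon: float, shape: float, scale: float) -> float:
    """Probability of failing in (age, age + horizon], given survival to `age`."""
    survive_now = weibull_reliability(age, shape, scale)
    survive_later = weibull_reliability(age + horizon, shape, scale)
    return 1.0 - survive_later / survive_now


# Hypothetical wear-out behaviour: shape > 1, characteristic life of 10 years.
for age in (1.0, 5.0, 9.0):
    pof = conditional_pof(age, horizon=1.0, shape=2.5, scale=10.0)
    print(f"age {age:>4} y: P(fail in next year) = {pof:.3f}")
```

With a shape parameter above 1, the next-year probability of failure rises as the asset ages, which is why the current point in the life cycle matters when choosing activities.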

The QRO analysis is directly linked with quantitative asset data and reliability analytics to calculate the actual risk for each failure mode. Asset performance metrics and their associated impact on the system are also calculated. These calculations update in near real time as new data becomes available.

The QRO approach is universal regardless of asset type. Whether the asset is fixed or non-fixed, QRO focuses on gathering data to predict failures so that facilities can perform the proper maintenance activities.

