Failure Modes and Effects Analysis and the Evolution of Reliability Analyses

What Is a Failure Modes and Effects Analysis?

Failure Modes and Effects Analysis (FMEA) is a step-by-step qualitative method used to identify the functional requirements of an asset, system, or unit; the functional failures that prevent those requirements from being met; the equipment failure modes (what happened); the failure mechanisms (what caused it to happen); and the potential effects of each failure. Performing an FMEA provides a sound basis for developing mitigation plans that address potential failures before they occur. These plans focus on the highest-consequence events, which improves reliability and performance, increases safety, and supports regulatory compliance while meeting business goals.

FMEA is one of the oldest and most widely used risk analysis methods. The United States Department of Defense formalized it in the late 1940s, shortly after World War II. FMEAs apply to almost every type of organization and are used extensively in industries such as semiconductor processing, food service, plastics, software, and healthcare. An FMEA identifies potential problems before they occur, including issues that surface as early as the design stage of a facility. Several types of FMEA exist, including Functional, Design, and Process.

Why Is Failure Modes and Effects Analysis (FMEA) Valuable?

A functional FMEA is a tool used in programs such as Reliability Centered Maintenance (RCM) to define the criticality of each asset, which is considered in developing an optimal maintenance and monitoring strategy. An FMEA considers the consequence of failure to determine an asset’s criticality, and when further risk analysis is needed, a Failure Modes, Effects, and Criticality Analysis (FMECA) may be performed. While similar to an FMEA, FMECA includes an additional level of risk assessment to better define an asset’s criticality score by incorporating both the probability and consequence of failure of the asset.
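To make the probability-and-consequence idea concrete, here is a minimal Python sketch of a qualitative risk matrix. The 1-to-5 scales, the band thresholds, and the names `risk_score` and `criticality_band` are illustrative assumptions, not part of any FMECA standard or of the method described here; real programs define their own scales.

```python
# Hypothetical 1-5 qualitative scales; real FMECA programs define their own.
PROBABILITY = {"remote": 1, "unlikely": 2, "possible": 3, "likely": 4, "frequent": 5}
CONSEQUENCE = {"negligible": 1, "minor": 2, "moderate": 3, "major": 4, "catastrophic": 5}


def risk_score(probability: str, consequence: str) -> int:
    """Return probability x consequence using the assumed scales above."""
    return PROBABILITY[probability] * CONSEQUENCE[consequence]


def criticality_band(score: int) -> str:
    """Map a score to a band; the thresholds are illustrative assumptions."""
    if score >= 15:
        return "high"
    if score >= 6:
        return "medium"
    return "low"


if __name__ == "__main__":
    # Example: a failure mode judged "possible" with a "major" consequence.
    score = risk_score("possible", "major")
    print(score, criticality_band(score))
```

Multiplying two ordinal scales is only one common convention; some programs instead look up each probability and consequence pair in a fixed matrix.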

A risk assessment prioritizes maintenance planning by clarifying which assets must be maintained to sustain safe and effective operations, which should be maintained for cost reasons, and which require no proactive maintenance. A special type of FMEA is used in Risk-Based Inspection (RBI) programs, which focus on fixed equipment and piping systems. The same methodology can also be used to conduct Process Hazards Analysis (PHA) and Root Cause Analysis (RCA).

Consider these four basic questions throughout your analysis:

  1. How can the asset fail?
  2. What can cause that failure to happen?
  3. What data can I collect and monitor to indicate when the modes or mechanisms are happening?
  4. What tasks can I do to mitigate or alleviate the failure?

How Is a Failure Modes and Effects Analysis Conducted?

A functional FMEA is performed by identifying the following:

  • Required operational expectations of the asset, system, or unit
  • Functional failures and the failure modes (operational or functional) for each asset
  • Potential effects at each level (local, system, unit, facility) if the failure mode were to occur
  • Contributing failure mechanisms
  • Risk (probability × consequence) of each failure mode, assessed qualitatively (if an FMECA is performed)
  • Incipient conditions that can reveal a developing failure or its causes
  • Recommended actions and decisions on their application

Identical equipment can have different failure modes depending on the specific installation. For example, some equipment may be prone to plugging in certain services but not others. Each asset must be analyzed individually to determine its failure effects, which depend entirely on the system in which the asset is installed. Information compiled for the FMEA can be managed using simple spreadsheets or various software tools.
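As a rough illustration of how the compiled information might be organized, the Python sketch below stores one worksheet row per failure mode and writes the rows out as CSV for use in a spreadsheet. The field names, the `FmeaRow` class, the equipment tags, and every entry in the example rows are made up for illustration; an actual worksheet would follow your program's own template.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class FmeaRow:
    """One worksheet row; the field names are illustrative assumptions."""
    asset: str
    service: str
    function: str
    functional_failure: str
    failure_mode: str
    failure_mechanism: str
    effect: str
    incipient_condition: str
    recommended_action: str


def to_csv(rows):
    """Serialize worksheet rows to CSV text that a spreadsheet can open."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(FmeaRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


def example_rows():
    """Two identical pumps in different services (made-up tags and entries)."""
    return [
        FmeaRow("P-101", "clean water", "Deliver rated flow", "Low flow",
                "Worn impeller", "Erosion", "Reduced unit throughput",
                "Falling discharge pressure", "Trend discharge pressure"),
        FmeaRow("P-205", "slurry", "Deliver rated flow", "No flow",
                "Plugged suction", "Solids buildup", "Unit shutdown",
                "Rising suction differential pressure",
                "Flush suction line on a schedule"),
    ]


if __name__ == "__main__":
    print(to_csv(example_rows()))
```

The two example rows describe the same pump model in different services, which is why each installation gets its own row rather than sharing one entry per equipment type.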

Case Study Highlight

Case Study: Global Petrochemical Company Improves Safety and Compliance While Reducing Unplanned Downtime with Reliability Centered Maintenance Program

A global petrochemical organization aimed to improve safety and compliance and reduce operating costs through a non-fixed asset program. Pinnacle provided a robust RCM program that involved reviewing and updating the equipment list in the CMMS, conducting an FMEA, identifying critical spare parts, identifying reliability opportunities and vulnerabilities, and optimizing preventive maintenance (PM). As a result, the facility met corporate compliance goals, increased safety, and reduced operating costs by implementing recommendations for an optimized, reliability-based proactive maintenance program.

What Is the Next Evolution of Reliability Analysis?

FMEA/FMECA is an important tool for assigning equipment criticality, assessing risk, and driving maintenance planning.

With major advancements in data acquisition, warehousing, modeling, and analytics, we now have the opportunity to evolve into the next phase of reliability analysis and build upon the FMEA.

This leap is being made possible through Quantitative Reliability Optimization (QRO). QRO is an approach to reliability modeling that connects every relevant reliability data point at a complex facility to one integrated model. That model supports complex decision making, enabling users to:

  • Optimize all maintenance spend in near real time against short-, mid-, and long-term reliability targets.
  • Understand the economic value of every inspection or maintenance activity performed.
  • Understand the economic value of every piece of data currently gathered or potentially collected in the future.
  • Model scenarios in near real time, including the implications of moving a turnaround, feedstock pricing changes, or various capital projects.
  • Make effective economic decisions during reliability-based operating excursions, such as evaluating the change in probability of failure when a piece of equipment operates outside its reliability operating window.

There are some key differences between QRO and FMEA. While QRO also analyzes functions, failure modes, failure mechanisms, and potential tasks, it uniquely ties the asset’s actual condition data to the identified failure modes to pinpoint where the asset currently sits in its life cycle. From that point, QRO can determine the current probability of failure and select the activities that maintain or increase reliability by reducing that probability. In addition, QRO focuses only on the data and tasks that will affect the reliability of the plant.

The QRO analysis is directly linked with quantitative asset data and reliability analytics to calculate the actual risk for each failure mode. Asset performance metrics and their associated impact on the system are also calculated. These results update in near real time as new data becomes available, providing a trend of asset risk over time.
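This article does not describe QRO's underlying models, so the Python sketch below is only a generic illustration of what a quantitative risk trend can look like. It assumes a two-parameter Weibull life model and a single consequence cost per failure; the function names, parameters, and numbers are all hypothetical and do not represent the QRO method.

```python
import math


def weibull_reliability(age: float, shape: float, scale: float) -> float:
    """Survival probability R(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((age / scale) ** shape))


def conditional_pof(age: float, horizon: float, shape: float, scale: float) -> float:
    """Probability of failing within `horizon`, given survival to `age`."""
    survive_now = weibull_reliability(age, shape, scale)
    survive_later = weibull_reliability(age + horizon, shape, scale)
    return 1.0 - survive_later / survive_now


def risk_trend(ages, horizon, shape, scale, consequence_cost):
    """Return (age, risk) pairs, where risk = conditional PoF x consequence cost."""
    return [
        (age, conditional_pof(age, horizon, shape, scale) * consequence_cost)
        for age in ages
    ]


if __name__ == "__main__":
    # Assumed wear-out behavior (shape > 1), ages in years, cost in dollars.
    for age, risk in risk_trend([0, 5, 10], horizon=1.0, shape=2.0,
                                scale=10.0, consequence_cost=1000.0):
        print(f"age {age:>2} yr: expected loss over next year = {risk:.2f}")
```

In a data-driven setting, the shape and scale parameters would be re-estimated as new inspection or condition data arrives, which is what moves the trend over time.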

The QRO approach is universal regardless of asset type. For fixed and non-fixed assets, QRO focuses on gathering data to predict failures so that facilities can perform the proper maintenance activities at the right time.

To learn more about how FMEAs or QRO can help your program, set up a discovery call.
