What Is Reliability Centered Maintenance?

Reliability Centered Maintenance (RCM) is a method for developing a comprehensive reliability-based maintenance and monitoring program. RCM involves performing a failure modes and effects analysis (FMEA) on each piece of equipment to determine its criticality, then deciding on the most effective maintenance, operation, or engineering tasks to preserve system function. This approach is valid for systems and units within most process industries, such as electric power (nuclear, fossil, hydro), petrochemical, refining, upstream, midstream, manufacturing, paper, pharmaceutical, and water and wastewater treatment. The overall purpose of this evaluation is to develop a cost-effective and applicable proactive maintenance program for the system or unit under study. The evaluation includes:

  • Piece-by-piece evaluation of the equipment integral to the operation of the system or unit

  • Identification of the most likely failure modes of each piece of equipment

  • Statement of the effects of these failure modes

  • Selection of applicable and effective proactive maintenance tasks to address the identified failures, including the possible recommendation of no proactive maintenance for certain equipment

RCM has traditionally been thought of as a one-time study that follows a rigorous set of questions designed to identify the failure modes and effects of equipment and define a set of tasks to mitigate these failure modes. Equipment criticality is defined based on the consequences of these failures, and tasks are preferentially applied to critical equipment over non-critical equipment. These traditional RCM methodologies tend to rely on industry-accepted data and personnel experience to determine appropriate failure modes and mitigating tasks.

History of RCM

1950s

Reactive Maintenance

1960s

Aeronautical Industry & US Navy Transition To Proactive Maintenance Strategies

1970s

Nuclear, Chemical, and Oil & Gas Transition To Systematic Reliability Improvements With PRA and QRA

1970s-1980s

Industrial Growth Slows Due To Economic Recession

1990s

New Regulations and Inspection Standards Released To Prevent Loss Of Containment Incidents

1990s-2000s

New Data Gathering Techniques Developed Enhancing Risk Models

2022

New QRCM Methodology Combines SME Knowledge, Data Analytics, and Traditional Methods

An enhancement to traditional RCM was the introduction of risk into the analysis. Risk is the product of probability of failure (PoF) and consequence of failure (CoF). Most RCM analyses in the past 20 years have incorporated the probability of failure to define equipment criticality, using risk matrices developed for criteria such as safety, environmental impact, production loss, financial impact, and reputation impact. An added benefit of using risk matrices is the ability to add a gradient to the level of criticality. The added levels of criticality on the risk matrices allow for improved prioritization of the proactive maintenance tasks identified by the RCM analysis. For example, proactive tasks for high-criticality equipment have a higher priority than proactive tasks for low-criticality equipment.
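As a rough illustration of how a risk matrix turns PoF and CoF into graded criticality, the sketch below bins each value into one of four categories and multiplies the categories. The category boundaries, the 4x4 matrix size, and the high/medium/low cutoffs are assumptions chosen for the example; an actual study defines its own.

```python
from bisect import bisect_right

# Hypothetical category boundaries; a real study would define its own.
POF_BOUNDS = [0.01, 0.1, 0.5]                # failures per year
COF_BOUNDS = [10_000, 100_000, 1_000_000]    # consequence in dollars


def category(value, bounds):
    """Map a raw value to a category 1..len(bounds)+1.

    A value exactly on a boundary falls into the higher category.
    """
    return bisect_right(bounds, value) + 1


def criticality(pof_per_year, cof_dollars):
    """Return (matrix score, criticality level) for one piece of equipment."""
    score = category(pof_per_year, POF_BOUNDS) * category(cof_dollars, COF_BOUNDS)
    if score >= 9:        # assumed cutoff for "high"
        level = "high"
    elif score >= 4:      # assumed cutoff for "medium"
        level = "medium"
    else:
        level = "low"
    return score, level


# A pump that fails about once every five years with a $500,000 consequence
print(criticality(0.2, 500_000))
```

When several criteria are scored (safety, environmental impact, production loss, and so on), one common convention is to score each criterion separately and keep the worst result.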

What is the Value of Reliability Centered Maintenance?

The value of an individual Reliability Centered Maintenance (RCM) study will vary for each analysis, based on current equipment reliability, amount of change from the current proactive maintenance program, and market conditions. Typically, the value of an RCM study can be calculated over time based on increased availability, increased throughput, and/or reduced maintenance expenditures.
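One simple way to frame that calculation is sketched below. The function names, the inputs, and the choice to ignore discounting are all assumptions made for illustration, and the figures in the usage line are placeholders rather than results from any study.

```python
def annual_rcm_value(availability_before, availability_after,
                     margin_per_hour, maintenance_savings_per_year,
                     hours_per_year=8760):
    """Estimate yearly value as recovered production margin plus maintenance savings."""
    recovered_hours = (availability_after - availability_before) * hours_per_year
    return recovered_hours * margin_per_hour + maintenance_savings_per_year


def cumulative_net_value(annual_value, study_cost, years):
    """Net value at the end of each year, ignoring discounting (an assumption)."""
    return [annual_value * year - study_cost for year in range(1, years + 1)]


# Placeholder inputs: availability rises from 92% to 94%, margin of $1,000 per hour
value = annual_rcm_value(0.92, 0.94, 1_000, 50_000)
print(round(value))
print(cumulative_net_value(value, study_cost=300_000, years=3))
```

Market conditions enter through the margin per hour, which is why the same availability gain can be worth very different amounts from one year to the next.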

An RCM study and implementation of the recommended tasks from the study will yield the following benefits:

  • Increased Availability and Reliability of systems and equipment evaluated

  • Optimized Preventive Maintenance Program

  • Decreased lifecycle maintenance costs

  • Documented basis for prioritizing turnaround or Shutdown Maintenance

  • Documented basis for optimizing spare parts inventories

  • Identified gaps where training or new procedures are needed

  • Identified need for Root Cause Failure Analysis

  • Documented FMEA with tasks for individual equipment that can be used as a training tool

Case Study Highlight

How a Wastewater Treatment Plant is Recognizing $100 MM in Cost Savings through Reliability Centered Design

Pinnacle recently worked with a wastewater treatment facility undergoing an estimated $1.7B expansion to increase treatment capabilities due to stricter regulations and increased supply demand. Learn how we supported the facility by introducing Reliability Centered Maintenance (RCM) principles during the design phase to create a cost-effective way to maintain critical assets.


Case Study Highlight

Global Petrochemical Company Improves Safety and Compliance While Reducing Unplanned Downtime with Reliability Centered Maintenance Program

A global petrochemical leader wanted to improve safety, compliance, and operating costs through a non-fixed asset program that effectively manages operating risks. Pinnacle supported the organization with a Reliability Centered Maintenance program that included a review and update of the equipment list in its CMMS, FMEA, spare parts optimization, identification of reliability opportunities and vulnerabilities, and PM optimization.


What is the Benefit of a Pinnacle-Facilitated Reliability Centered Maintenance Study?

Primarily, the Pinnacle Reliability Centered Maintenance (RCM) methodology is designed to minimize the impact on customer resources. Several companies have had a poor experience with RCM implementation due to excessive time requirements to complete the analysis of a few major pieces of equipment. Our technique allows for the analysis of an entire unit in a similar amount of time. All maintainable equipment is included in the scope of our studies. This includes fixed equipment, rotating equipment, electrical equipment, instrumentation, and actuated valves. Detailed inspection recommendations for fixed equipment and piping are deferred to a specialized Mechanical Integrity (MI) or Risk-Based Inspection (RBI) analysis. The analysis focuses on process equipment, but non-process equipment (HVAC, safety equipment, firefighting equipment, material handling equipment, lighting) can also be included for a comprehensive proactive maintenance program.

The Pinnacle RCM process is a methodical, efficient, and common-sense approach to developing a reliability-based maintenance and monitoring program that conforms to accepted RCM principles. Each step in the process is foundational as each subsequent step presupposes completion of the steps preceding it. The basic steps in our RCM process are outlined below.

Step 1: Data Collection

Collect the following essential information:

  • Equipment List
  • Equipment details
  • Piping and Instrument Diagrams (P&IDs)
  • Single Line Diagrams (SLDs)
  • Process description
  • Related reliability and safety analyses
  • Cause and Effect Diagrams

Step 2: Performance Objectives

Identify the overall operational and business objectives. Examples include:

  • Production and Throughput Targets
  • Product Quality Specifications
  • Safety and Environmental Compliance Targets
  • Availability Targets
  • Tolerance for Downtime
  • Factors that Limit Run Length

Step 3: Functions

Define the functions (i.e., systems) in the unit. The RCM analysis is based on preserving system functionality, not preventing individual equipment failures.

Step 4: Process Interview

Review individual equipment to answer the basic FMEA questions:

  • What is the component function?
  • How does the component fail?
  • What is the effect of failure?
  • Is the failure evident?
  • What is the operator response?
  • Is the equipment/component/system performing properly and reliably?

Step 5: Failure Modes and Effects Analysis (FMEA)/Criticality Evaluation

Conduct the FMEA for all equipment to designate each piece of equipment as critical or non-critical. Perform a risk ranking analysis to assign each piece of equipment a level of criticality (e.g., high, medium, low).
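The FMEA questions and the two-stage criticality designation can be sketched as a small data structure. The field names, the 1 to 4 category scales, and the score cutoffs below are illustrative assumptions, not a standard schema or the Pinnacle implementation.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One FMEA row; field names are illustrative, not a standard schema."""
    equipment_tag: str
    function: str
    failure_mode: str
    effect: str
    evident: bool            # is the failure evident to the operator?
    affects_function: bool   # does the effect degrade a system function?
    pof_category: int        # 1 (rare) .. 4 (frequent), assumed scale
    cof_category: int        # 1 (minor) .. 4 (severe), assumed scale


def level(score):
    """Assumed cutoffs for turning a matrix score into a criticality level."""
    return "high" if score >= 9 else "medium" if score >= 4 else "low"


def designate(rows):
    """Return {tag: (critical?, level)} based on the worst failure mode per tag."""
    worst = {}
    for row in rows:
        critical, score = worst.get(row.equipment_tag, (False, 0))
        worst[row.equipment_tag] = (
            critical or row.affects_function,
            max(score, row.pof_category * row.cof_category),
        )
    return {tag: (critical, level(score)) for tag, (critical, score) in worst.items()}


rows = [
    FailureMode("P-101", "Transfer feed", "Seal leak", "Loss of containment", True, True, 3, 3),
    FailureMode("P-101", "Transfer feed", "Bearing wear", "Elevated vibration", False, False, 2, 1),
    FailureMode("L-300", "Area lighting", "Lamp burnout", "Reduced lighting", True, False, 2, 1),
]
print(designate(rows))
```

The equipment tags are hypothetical. Keeping one row per failure mode, rather than one row per piece of equipment, lets the same table later double as the documented FMEA used for training.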

Step 6: Task Selection

Develop a proactive maintenance program for each piece of equipment. Appropriate cost-effective tasks will be selected to maintain the system functions, minimize the effects of equipment failure, and improve equipment availability. Where no proactive task is justified, the equipment will be designated run to failure (i.e., no proactive maintenance).
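A minimal sketch of a cost-effectiveness test for task selection follows. It assumes each candidate task has an annual cost and an estimated fraction of failure cost it avoids, and it assumes a policy that critical equipment always receives some proactive task; neither the inputs nor the policy come from a specific RCM standard.

```python
def select_task(candidates, annual_failure_cost, critical=False):
    """Pick the candidate with the largest positive net benefit, else run to failure.

    candidates: list of (name, annual_task_cost, fraction_of_failure_cost_avoided)
    """
    best_name, best_net = "run to failure", 0.0
    for name, task_cost, avoided in candidates:
        net = annual_failure_cost * avoided - task_cost
        if net > best_net:
            best_name, best_net = name, net
    if critical and best_name == "run to failure" and candidates:
        # Assumed policy: critical equipment gets the most effective task
        # even when the task does not pay for itself.
        best_name = max(candidates, key=lambda c: c[2])[0]
    return best_name


tasks = [("vibration monitoring", 2_000, 0.8), ("annual overhaul", 15_000, 0.9)]
print(select_task(tasks, annual_failure_cost=20_000))   # vibration monitoring
print(select_task(tasks, annual_failure_cost=1_000))    # run to failure
```

The second call shows how run to failure falls out of the same comparison: when the expected failure cost is small, no candidate task has a positive net benefit.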

Step 7: Implementation

Implement the RCM-recommended tasks into the appropriate work management systems (e.g., CMMS, operator rounds and readings). Implementation of the RCM recommendations is the most crucial step in the RCM process.

What Does the Future of Reliability Look Like?

Over the years, Reliability Centered Maintenance (RCM) has been a valuable method that has helped complex asset-based systems effectively maintain asset reliability through cost-effective strategies. Now, decades later, advancements in data acquisition, warehousing, modeling, and analytics are creating opportunities to improve upon the RCM model. The next leap in reliability will further improve availability while continuing to reduce maintenance spend.

An RCM program is the first step after a reactive program in the evolution of maintenance program maturity. However, when it comes to further optimizing and improving reliability performance, Reliability Centered Maintenance can be limiting depending on the objective. Specific limitations include the following:

  • RCM does not calculate absolute risk but rather relative risk, using that to drive maintenance priorities rather than an objective cost/benefit analysis.

  • RCM models are typically very conservative in their calculation of probability of failure and consequence of failure, because they are predominantly based on likelihood estimates for an event or failure mode that often stem from SME knowledge.

  • RCM analyses are static by nature and typically do not incorporate the condition data of the asset to update the risk calculations (especially the Probability of Failure calculations) and drive maintenance priorities accordingly.

  • RCM calculations occur on an asset-by-asset basis and generally do not relate to the overall performance of the system, unit, or facility.

  • Although RCM program architectures are generally similar (criticality analysis, FMEA, etc.), they are subject to interpretation regarding how the analysis is structured.

  • RCM does not help quantify the value of data collection or help with sensitivity analysis of the data required for calculations, beyond manual iteration of values by the user.

  • RCM cannot be used to optimize an entire system, unit, or facility’s reliability strategy based on availability, cost, and resource constraints.

Whether you are just starting to implement an RCM program or are already using a matured program, you have the necessary tools to make the next leap in reliability. We believe this leap is being made possible through Quantitative Reliability Optimization (QRO), a method that pushes RCM to the next level by unlocking its capabilities through the use of dynamic data analysis. Using actual asset data, QRO provides detailed insight into when an asset will fail, what impact each asset has on the larger system, and how and when to use resources to improve the results you care about.

QRO is an approach to reliability modeling which connects every relevant reliability data point to one integrated model that enables users to do things such as:

  • Optimize, in near real-time, all maintenance spend based on short/mid/long-term reliability targets.

  • Understand the economic value of every maintenance activity based on dynamic condition models (such as Probability of Failure) that update in a near real-time manner as new data is introduced in order to quantify where an asset is on its P-F Curve to determine when tasks should be performed.

  • Understand the quantifiable impact each asset has on availability.

  • Understand the economic value of every piece of data that is currently being gathered or possibly gathered in the future.

  • Run simulation analyses to evaluate the impacts of maintenance and operational activities, including the implications of moving a turnaround, feedstock pricing changes, or various capital projects.

  • Drive effective economic decisions in the event of reliability-based operating excursions.
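To make the idea of an asset's quantifiable impact on availability concrete, the sketch below uses a textbook series-system model, in which the system is available only when every asset is available. This is a deliberately simple stand-in and not the QRO model itself; the asset tags and availability figures are hypothetical.

```python
from math import prod


def series_availability(availabilities):
    """System availability when every asset must be up (series assumption)."""
    return prod(availabilities.values())


def availability_impact(availabilities):
    """System availability gained if each asset, in turn, never failed."""
    base = series_availability(availabilities)
    impact = {}
    for tag in availabilities:
        perfect = {**availabilities, tag: 1.0}
        impact[tag] = series_availability(perfect) - base
    return impact


# Hypothetical assets and availabilities
assets = {"P-101": 0.98, "E-201": 0.95, "C-301": 0.99}
print(round(series_availability(assets), 4))
for tag, gain in sorted(availability_impact(assets).items(), key=lambda kv: -kv[1]):
    print(tag, round(gain, 4))
```

Ranking assets by this gain points maintenance spend at the asset that limits system availability the most, which an asset-by-asset criticality score alone does not show. Real units also contain redundancy and partial-capacity states, which this series model ignores.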
