Root Cause Analysis

The primary goal of any maintenance department is to eliminate all sources of equipment failures and not just their symptoms. Preventing problems from arising should always be the purpose of maintenance. In this article, we will explain best practices and methods to carry out a successful Root Cause Analysis that will enable you to have more consistent and reliable plant performance.

Investigation

To begin root cause analysis, start by capturing all the failures that have occurred in your plant over a defined period of time. Large plants need to review this information daily, but a weekly or monthly review may suffice for smaller plants. The frequency and variety of failures also determine how often you need to investigate.

The next step is to determine when a root cause analysis is required. You can do this by establishing a trigger, based on the desired service life of your equipment, that notifies you when a failure occurs sooner than expected. The desired service life varies depending on the equipment type. For example, you can consider one year for pumps and three years for motors as a baseline.

Under these guidelines, any pump that fails in less than one year and any motor that fails in less than three years would require an analysis.
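The trigger described above can be sketched in a few lines of Python. This is only an illustration: the one-year and three-year thresholds come from the example baseline in this article, while the function name, the `in_service_since` parameter (the installation date or the date of the last repair) and the handling of unknown equipment types are assumptions made for the sketch.

```python
from datetime import date

# Example baselines from the article, in days. Real thresholds should
# come from the desired service life of your own equipment.
MIN_SERVICE_LIFE_DAYS = {"pump": 365, "motor": 3 * 365}


def needs_root_cause_analysis(equipment_type, in_service_since, failed_on):
    """Return True when the equipment failed before its baseline service life."""
    threshold = MIN_SERVICE_LIFE_DAYS.get(equipment_type)
    if threshold is None:
        # Assumption: an unknown equipment type is flagged so that
        # someone defines a baseline for it.
        return True
    days_in_service = (failed_on - in_service_since).days
    return days_in_service < threshold


# A pump that failed after about eight months triggers an analysis.
print(needs_root_cause_analysis("pump", date(2023, 1, 10), date(2023, 9, 1)))
# A motor that lasted more than three years does not.
print(needs_root_cause_analysis("motor", date(2020, 1, 10), date(2023, 9, 1)))
```

In practice a check like this would run whenever a failure is logged, so that short-lived equipment is flagged for review automatically.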

Track Using a Database

Tracking failures is much more efficient if you create a database that lets you view all root cause reports in one place. The database needs to contain the following: the name or number of the equipment, the area where the equipment is located, the date of the failure and the date of the previous failure, the notification or work order number, a brief explanation of the failure, all the possible solutions, and the name of the person responsible for solving it.

Information that is not readily available can be entered later in the investigation. However, it is important to include as much information as possible at the outset to establish the factors that are already known.
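One possible shape for a record in such a database is sketched below in Python. The field names, the `missing_fields` helper and the sample values (equipment "P-101", work order "WO-0001") are hypothetical and chosen only for illustration; the fields themselves mirror the list above, and the optional ones reflect information that can be entered later.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class FailureRecord:
    # Information normally known when the failure is first logged.
    equipment_id: str
    area: str
    failure_date: date
    work_order: str
    description: str
    # Information that may only become available during the investigation.
    previous_failure_date: Optional[date] = None
    possible_solutions: List[str] = field(default_factory=list)
    responsible: Optional[str] = None

    def missing_fields(self):
        """List the fields that still need to be filled in."""
        missing = []
        if self.previous_failure_date is None:
            missing.append("previous_failure_date")
        if not self.possible_solutions:
            missing.append("possible_solutions")
        if self.responsible is None:
            missing.append("responsible")
        return missing


# Hypothetical sample record, logged right after the failure.
record = FailureRecord(
    equipment_id="P-101",
    area="Cooling water",
    failure_date=date(2023, 9, 1),
    work_order="WO-0001",
    description="Mechanical seal leaking",
)
print(record.missing_fields())
```

Keeping the later-filled fields optional makes it easy to see at a glance which reports are still incomplete.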

Gathering Information

Collect as much data as possible about the equipment; it is also critical to understand what happened during the failure. Gather historical information about the equipment’s failures, including attempted solutions. If these solutions didn’t work, you can rule them out and will be one step closer to identifying the root cause; if they worked, you can reapply them and avoid spending time and resources on a full analysis.

Employ all of your resources. Speak with your mechanics, electricians, shift personnel, clean-up crews and anyone else who knows the equipment. They may have important insights and clues about why the failure occurred. These individuals may even have solutions and suggestions for improvement.

Talk to the operatives who worked the shift when the failure occurred. Start this process as soon as possible, while the information is still easy to recover and is remembered accurately. Check the system that detects and tracks your equipment failures to identify how often issues occur, and question your operatives accordingly; for example, ask whether the failure occurs at a specific time, on a specific day, or within a specific timeframe.
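If your tracking system can export failure timestamps, a quick count by weekday and shift can show whether failures cluster around a particular time. The Python sketch below assumes three eight-hour shifts starting at 06:00, 14:00 and 22:00 and uses made-up timestamps; both are illustrative assumptions, so adjust them to your plant.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def shift_for(ts):
    """Map a timestamp to a shift name (assumed shift boundaries)."""
    if 6 <= ts.hour < 14:
        return "day"
    if 14 <= ts.hour < 22:
        return "evening"
    return "night"


def failure_patterns(timestamps):
    """Count failures per weekday and per shift."""
    by_weekday = Counter(WEEKDAYS[ts.weekday()] for ts in timestamps)
    by_shift = Counter(shift_for(ts) for ts in timestamps)
    return by_weekday, by_shift


# Made-up failure times for one piece of equipment.
failures = [
    datetime(2023, 9, 4, 23, 15),
    datetime(2023, 9, 11, 22, 40),
    datetime(2023, 9, 13, 9, 5),
    datetime(2023, 9, 18, 23, 50),
]
by_weekday, by_shift = failure_patterns(failures)
print(by_weekday.most_common(1))  # [('Monday', 3)]
print(by_shift.most_common(1))    # [('night', 3)]
```

A cluster like the one in this sample data would suggest asking the Monday night shift what is different about the start of their week.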

If you perform the analysis early enough, you may even be able to observe the equipment before complete failure, while it is still functioning but on the verge of failing. For example, consider a leaking pump before it’s replaced. In such cases, you can evaluate the running conditions of the equipment, which may themselves be the source of the damage.

Always disassemble the equipment for inspection. Looking inside the equipment will enable you to better analyze which components have failed. You may find misalignment that causes vibration, signs of overheating, indications of insufficient lubrication, and so on.

Make sure to document everything using a digital camera, notebook or tablet. This documentation is useful for showing the failure to someone who hasn’t seen it and for your own future reference.

Writing the Root Cause Analysis Report

The report has to be simple and easy to understand for anyone who does not have your specialized knowledge or experience. State the facts while avoiding complicated terminology and technical jargon. Refrain from using names; use job titles instead. The purpose of this investigation is not to assign blame or point fingers.

Include images taken in the information gathering phase and caption them so that the object or situation shown is easily recognizable. Include the equipment’s information, a detailed explanation of the failure, an explanation of its history, the date of the failure and the date of the last failure, the suspected root cause, the proposed solution and the name of the person(s) responsible for the solution. Also attach appropriate data that explains the failure, such as pictures, graphs and trends.

Reviewing Reports

Send the reports to everyone before conducting a meeting so that they get a chance to consider the issues and look for possible solutions beforehand. Distributing the reports in advance allows people to conduct their own investigation and have more informed discussions during the meeting. Attendance should be mandatory for maintenance coordinators (electrical and mechanical), maintenance managers, engineers, key process technicians, area process supervisors, electricians and mechanics. Attendance can be optional for the plant manager, operations manager and planner.
 
Review every report so that everybody is aware of every failure, even the small ones. Discuss the next steps to prevent future failures so that the decision is the group's rather than an individual's. If failures become more frequent or severe, you need to hold meetings more often.

Implement Changes

Assign a maintenance or reliability engineer to create a method for tracking changes and to monitor operations closely. The team should set a date by which the proposed solution will be completed. The lead person will then contact the person assigned to monitor operations to determine whether the changes have been made successfully.

The lead person will then schedule a meeting with the team to review the solutions and to find out if additional time and resources are needed. This will also allow the team to see what has been done and what comes next.
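A method for tracking changes can be as simple as a list of agreed actions with completion dates. The Python sketch below is one possible approach, not a prescribed one: the `CorrectiveAction` fields, the report identifiers and the sample actions are all hypothetical. It lists the actions that have passed their agreed date without being completed, which gives the lead person a ready agenda for the follow-up meeting.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CorrectiveAction:
    report_id: str      # hypothetical identifier of the root cause report
    description: str
    owner_title: str    # job title rather than a personal name
    due_date: date      # completion date agreed by the team
    completed: bool = False


def overdue_actions(actions, today):
    """Return actions that are not completed and are past their due date."""
    return [a for a in actions if not a.completed and a.due_date < today]


# Made-up actions for illustration.
actions = [
    CorrectiveAction("RCA-001", "Realign pump and motor",
                     "Mechanical coordinator", date(2023, 9, 15), completed=True),
    CorrectiveAction("RCA-002", "Revise lubrication schedule",
                     "Reliability engineer", date(2023, 9, 20)),
    CorrectiveAction("RCA-003", "Install vibration sensor",
                     "Electrical coordinator", date(2023, 10, 30)),
]

for action in overdue_actions(actions, today=date(2023, 10, 1)):
    print(action.report_id, action.description, action.owner_title)
```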

Lastly, inform everyone in the plant about the results of the root cause analysis and the developments in the plant. As others in the plant understand the benefits of the root cause analysis, they will want to be involved.