Fault Tree Analysis in Projects

Developed by Arvin Fattah

Fault tree analysis (FTA) is defined by the International Electrotechnical Commission (IEC) and the International Organization for Standardization (ISO) as a "technique for identifying and analysing factors that can contribute to a specified undesired event".^[1]

FTA is widely applied in many fields of engineering, such as systems engineering, reliability engineering, and safety engineering.^[2] It is also a useful tool for identifying the causes of undesired events in projects. Undesired events in projects can, for instance, be budget overruns, time delays, lack of team synergy, or any other event that has a negative effect on the project. The undesired event that is analysed in an FTA is referred to as the top event.^[3]^[4]

Risks will always be a part of projects, and identifying the risks and the impact they can have can be crucial to a project's success. The purpose of FTA is thus to give both a qualitative and a quantitative analysis of the factors that can trigger the undesired top event.^[2] The qualitative analysis is a graphical tree that shows the top event together with the pathways of all the intermediate and base events that lead up to it. The quantitative analysis derives the probability of the top event being triggered from the input probabilities of the base events that lead up to it.^[4] The quantitative analysis is calculated through Boolean algebra.^[5]

History

FTA was first developed in 1962 at Bell Laboratories by H. A. Watson^[6]^[7] with the intent to apply it to the Minuteman Missile launch control system.^[6]^[8] The method was later adopted by Boeing^[8]^[9] and has since been used by many organizations such as NASA for their aerospace technologies.^[10] FTA is today a widely used tool in many engineering fields.

Methodology

Figure 1: An example of a fault tree analysis where exceeding the budget limits is the top event.

FTA is based on the analysis of a top event. This is an event that is believed to be of great importance to the project, or an event that has not been given enough attention. A top event needs to be defined as a failure, as it is an undesired event that is to be avoided.^[4] An example is a delay of the project if time is considered a constraint.^[3]^[5] Other projects may be funded by a tight budget where it is crucial to stay within the budget's limits, so a top event could be failure to meet the expected budget.^[3] Only one top event can be chosen for each fault tree. However, it is often recommended to develop several fault trees,^[4] each with its own top event, if the project is large or if several top events are of great importance, such as safety in a power plant.

Once a top event has been chosen, the causal factors for that event need to be identified. Figure 1 shows a basic fault tree with "Exceeded budget limits" as an example of a top event. All the causal factors that may trigger the top event are located beneath it.^[4] The graphical illustration of the fault tree uses symbols to clarify the relations between the different causal factors.^[4]^[5]

Symbols

Symbol	Description
Event	An event is one of the causal factors that can occur during the progress of a project.^[1] It fully or partially triggers the event located at the level above.^[4] In the example in Figure 1, it can be seen on the far left that an increase in demand results in the primary supplier being short on stock, which in turn results in increasing prices on materials. The increase in prices then leads to the top event, which is a failure to complete the project within the budget.

Base event	A base event is an event that is not analysed further because further analysis has been deemed not useful.^[1]^[4] Whether to investigate an event further is up to the project members; however, it is important to indicate all base events as they are later used in the quantitative analysis of the fault tree. For instance, in Figure 1 it was deemed unnecessary to find further causal factors for a market crash, as the project members may have considered those causal factors to be outside the scope of the project, or impossible to mitigate or treat.

Unfinished event	Some events may not be of interest to the project at the current time, but may be of importance later in the project. These events are marked with this symbol so that the causal factors may be developed at a later stage in the project.^[1]^[4] An example is given in Figure 1 on the far right, where the project members were not currently interested in the causal factors for penalties but may develop these later in the project.

Transfer	Many projects will have multiple fault trees, each with its own top event. Some of these fault trees may share large identical portions. To keep the fault trees as short and readable as possible, transfer symbols may be used to indicate that the identical causal factors can be seen in another fault tree.^[1] Figure 1 shows a transfer symbol under the event for governmental regulations, indicating that the causal factors for this event can be found in another fault tree and are not repeated here. The transfer symbol in this example is marked with an "A" to distinguish it from other transfer symbols. If different events in the same tree have exactly the same causal factors, the transfer symbol can also be used within that tree to prevent it from growing too big with repeated causal factors.

OR gate	OR gates describe the relations between causal factors on the same level.^[6] The OR gate indicates that any of the events beneath the gate can trigger the event above.^[1]^[4] For example, Figure 1 indicates that there are 3 causal factors that can trigger the event where prices on materials increase. Any one of the 3 causal factors can trigger the upper event on its own, so it is sufficient that just one of these events occurs for the prices of materials to increase. OR gates are represented by a sum in the quantitative analysis using Boolean algebra.

AND gate	AND gates are similar to OR gates in that they show the relations between the causal factors; however, the upper event now requires all of its causal factors together. This means that an upper event will not occur until all of the causal factors under the AND gate occur.^[1]^[4] Figure 1 shows that for a budget estimation error to happen, both department A and department B must make the error. If only one of the departments makes the error, the upper event will not occur as the AND gate has not been satisfied. AND gates are represented by a product in the quantitative analysis using Boolean algebra.

Boolean algebra

Figure 2: An example of a fault tree analysis.

Boolean algebra can be used for a quantitative analysis of a fault tree. With it, the probability of the intermediate events, as well as of the top event, can be calculated. Figure 2 shows the same fault tree as Figure 1, with each base event assigned a letter to identify it, under the assumption that the events governmental regulations and penalties are also base events. The top event is assigned the letter Q. For the quantitative analysis only the base events are of interest because they are the lowest causal factors; the intermediate causal factors are therefore left blank.

OR gates denote a Boolean sum because any one of the events beneath the gate is sufficient, while AND gates denote a Boolean product because all of the events beneath the gate are required.^[6] The algebraic representation of the fault tree thus becomes:

<math>Q=(A+B+C)+(D\bullet E)+F</math>
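As an illustrative sketch only, the expression for Q can be written as a Boolean function in Python. The function name `top_event`, the keyword arguments, and the two intermediate variable names are assumptions made for this example; each argument is True when the corresponding base event has occurred.

```python
def top_event(A=False, B=False, C=False, D=False, E=False, F=False):
    """Return True if the top event Q is triggered.

    Mirrors Q = (A + B + C) + (D . E) + F, where + is OR and . is AND.
    """
    any_of_abc = A or B or C      # OR gate: any one input suffices
    both_d_and_e = D and E        # AND gate: both inputs are required
    return any_of_abc or both_d_and_e or F


print(top_event(D=True))           # False: D alone does not satisfy the AND gate
print(top_event(D=True, E=True))   # True: both inputs of the AND gate occurred
print(top_event(A=True))           # True: A passes straight through the OR gates
```

Writing the tree this way makes it easy to check by hand which combinations of base events reach the top event.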

Figure 3: A minimal cut set DE depicted on a fault tree. The path from the base events to the top event is shown in orange.

If the probabilities of the base events are known, it is then possible to calculate the probability of the top event, Q, occurring. From the Boolean expression, it is possible to find the different combinations of base events that cause the top event to occur. These combinations are referred to as cut sets.^[11] For instance, if base event A occurs, then the top event will be triggered. The same goes for base events B, C, and F. However, base event D alone does not trigger the top event because of the AND gate; a possible cut set with base event D is therefore DE, as it requires base event E to occur as well. In a fault tree analysis it is especially interesting to look at minimal cut sets, which are defined as the smallest combinations of base events that can cause the top event.^[2]^[11]
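For a tree this small, the minimal cut sets can be found by brute force. The following is a minimal sketch, assuming the expression Q = (A + B + C) + (D · E) + F; the names `triggers_top_event` and `minimal_cut_sets` are hypothetical, and dedicated FTA software uses more efficient algorithms than exhaustive enumeration.

```python
from itertools import combinations

BASE_EVENTS = "ABCDEF"


def triggers_top_event(occurred):
    """Evaluate Q = (A + B + C) + (D . E) + F for a set of occurred base events."""
    return (
        "A" in occurred
        or "B" in occurred
        or "C" in occurred
        or ("D" in occurred and "E" in occurred)
        or "F" in occurred
    )


def minimal_cut_sets():
    """Enumerate candidate sets by increasing size, keeping only minimal cut sets."""
    minimal = []
    for size in range(1, len(BASE_EVENTS) + 1):
        for combo in combinations(BASE_EVENTS, size):
            candidate = frozenset(combo)
            # A set that contains a smaller cut set is not minimal.
            if any(found <= candidate for found in minimal):
                continue
            if triggers_top_event(candidate):
                minimal.append(candidate)
    return minimal


print([sorted(cut_set) for cut_set in minimal_cut_sets()])
# [['A'], ['B'], ['C'], ['F'], ['D', 'E']]
```

The result matches the reasoning in the text: A, B, C, and F each trigger the top event alone, while D and E only do so together.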

Example

In order to use the quantitative analysis for a fault tree, it is necessary to find data for the probabilities of the different base events. If an internal study done for the project shows that department A (base event D) has a 1.5% probability of failure and department B (base event E) has a 1.25% probability of failure, then it is possible to look at the minimal cut set DE (as depicted graphically in Figure 3) and calculate the probability that this cut set triggers the top event Q, assuming the two failures are independent:

<math>Q_{DE}=D\bullet E=0.015\bullet 0.0125=0.0001875\approx 0.019\%</math>
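The arithmetic for the cut set DE can be checked with a few lines of Python. This is a sketch under the assumption that the two department errors are independent; the helper name `cut_set_probability` is hypothetical. Note that the percentages must be converted to fractions before multiplying: 0.015 × 0.0125 = 0.0001875, which is about 0.019%.

```python
import math

# Base event probabilities from the example, expressed as fractions.
probabilities = {"D": 0.015, "E": 0.0125}


def cut_set_probability(cut_set, probabilities):
    """Probability that every base event in the cut set occurs (independence assumed)."""
    return math.prod(probabilities[event] for event in cut_set)


p_de = cut_set_probability({"D", "E"}, probabilities)
print(f"P(DE) = {p_de:.7f} ({p_de:.5%})")
# P(DE) = 0.0001875 (0.01875%)
```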

Interpreting the results of the analysis depends on the context. Whether or not action should be taken to reduce the probability of failure depends on whether there are bigger threats to the project (other minimal cut sets that lead to higher probabilities of the top event) and on whether the project team has the means necessary to act on the different base events.

Application

In general, FTA can be broken down into 3 steps when applying it to a project:^[3]

1. Defining the top event that is to be analysed.
2. Constructing the fault tree with all the associated events that can lead up to the top event, along with the appropriate gates to describe their relations. It is important that the lowest level of events (base events) have been identified as single tasks^[3] for which further analysis would be deemed unnecessary for the overall analysis of the fault tree.
3. Calculating the probability of the top event occurring by using Boolean algebra, in order to assess the reliability of the project.

For some projects it may not be necessary to execute step 3, if the focus is only on identifying the risks and their dependencies or if the team is unable to obtain probability data for the base events. Calculating the probability of the top event is, however, crucial in many projects and will indicate whether a project is unreliable and has a high likelihood of failing. If the fault tree analysis indicates a high probability of failure, the team may want to reorganize the project’s logic or structure in order to make it more reliable.^[3]
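The three steps can be sketched in Python as follows. This is one possible representation, not a standard one: the nested-tuple layout, the names `FAULT_TREE`, `PROBABILITIES`, and `probability`, and all probability values except those for D and E are assumptions made for illustration. The calculation also assumes that base events are independent and that each appears only once in the tree. Under that assumption an AND gate multiplies its input probabilities, while an OR gate evaluates to one minus the product of the complements, which is not the same as an arithmetic sum of the probabilities.

```python
import math

# Steps 1 and 2: the top event and its tree, written as nested tuples.
# A node is either a base event name or a (gate, children) pair.
FAULT_TREE = ("OR", [
    ("OR", ["A", "B", "C"]),
    ("AND", ["D", "E"]),
    "F",
])

# Hypothetical probabilities, except D and E which come from the example.
PROBABILITIES = {"A": 0.0001, "B": 0.002, "C": 0.001,
                 "D": 0.015, "E": 0.0125, "F": 0.0005}


def probability(node):
    """Step 3: probability of a node, assuming independent base events."""
    if isinstance(node, str):
        return PROBABILITIES[node]
    gate, children = node
    child_probs = [probability(child) for child in children]
    if gate == "AND":
        # All inputs must occur.
        return math.prod(child_probs)
    if gate == "OR":
        # At least one input occurs: complement of "none occur".
        return 1 - math.prod(1 - p for p in child_probs)
    raise ValueError(f"unknown gate: {gate}")


print(f"P(Q) = {probability(FAULT_TREE):.4%}")
```

For small probabilities the OR result is close to the plain sum of its inputs, which is why the sum is often used as a quick approximation.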

Selecting top events

In order to successfully apply FTA to a project, the project needs to be clearly defined in terms of its goals and success criteria.^[3] A project can be described by three variables that are to be measured: budget (money spent), timeframe (time used), and quality, which assesses the different goals in the project.^[3] Top events that are used in fault trees during project management will always relate to one of these three variables. While time and budget related top events are easy to measure, quality or performance related top events can be much more difficult. Performance related top events can be any type of event that doesn’t relate to time or budget, and are a way to measure the success or failure of the project based on bad performance or bad quality of deliverables.^[3] Since it is difficult to measure performance, it is very important that the success criteria have been defined from the beginning and that there is a clear definition of what performance and/or quality is deemed acceptable within the project.^[3]

Identifying risks

Identifying the risks related to the project is of crucial importance, and the more thorough the risk identification is, the more detailed and insightful the fault tree becomes. Risk identification includes many different methods, such as evidence based methods (e.g. historical data) or inductive reasoning techniques such as Hazard and operability study (HAZOP).^[1]^[2] Field observations, for instance through interviews or analysis of project documents, are other ways of obtaining data.^[5] For the quantitative analysis, the probability of each of the identified risks needs to be estimated in order to apply Boolean algebra and calculate the overall risk of the top event occurring. Some estimates can be based on evidence and other data, such as market analyses that estimate the probability of a shortage of a certain material. Other events are based on human activities, such as budget estimation errors, which are much more difficult to measure. Various methods exist that can be used to estimate human activity errors, such as Technique for human error rate prediction (THERP) or Performance shaping factors (PSFs).^[3] At other times it can be useful to draw on expert judgments in order to determine the probabilities of base events.^[5]

Minimal cut sets

Minimal cut sets are a part of the qualitative analysis of the fault tree, which seeks to find the smallest combinations of base events that will trigger the top event.^[5]^[11]^[12] Minimal cut sets are important to identify because they show which events need to happen for the top event to occur, which helps the team determine which actions to take to improve reliability.^[12] By looking at the minimal cut sets of a fault tree, it can be seen which base events are critical to the top event. Extra focus should be put on the minimal cut sets that can trigger the top event on their own.^[4] For instance, base event A in Figure 2 is a minimal cut set because it is able to trigger the top event by itself, so additional risk treatment should be applied to this base event in order to reduce its probability of occurring. Base events D and E, by contrast, must both occur for the top event to be triggered, so less focus should be put on these. It should also be noted that while minimal cut sets are important and reveal critical reliability problems, judgement is still needed when treating the risks. For instance, if base event A has a significantly lower probability of occurring than the cut set DE, then it may be in the project team's best interest to focus on the latter.
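The trade-off between cut set size and probability can be illustrated by ranking the minimal cut sets. The sketch below assumes independent base events; the probabilities for A, B, C, and F are hypothetical values chosen for illustration (only D and E come from the example), and the names `PROBABILITIES`, `MINIMAL_CUT_SETS`, and `cut_set_probability` are likewise assumptions.

```python
import math

# Hypothetical probabilities for illustration; only D and E come from the example.
PROBABILITIES = {"A": 0.0001, "B": 0.002, "C": 0.001,
                 "D": 0.015, "E": 0.0125, "F": 0.0005}

MINIMAL_CUT_SETS = [{"A"}, {"B"}, {"C"}, {"F"}, {"D", "E"}]


def cut_set_probability(cut_set):
    """Probability that all events in the cut set occur (independence assumed)."""
    return math.prod(PROBABILITIES[event] for event in cut_set)


# Most probable cut sets first.
ranked = sorted(MINIMAL_CUT_SETS, key=cut_set_probability, reverse=True)
for cut_set in ranked:
    name = "".join(sorted(cut_set))
    print(f"{name}: {cut_set_probability(cut_set):.5%}")
```

With these assumed numbers the two-event cut set DE (about 0.019%) ranks above the single event A (0.01%), so the smallest cut set is not automatically the most urgent one to treat.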

Strengths and weaknesses

FTA offers many benefits that help a project team complete the project successfully. The aim is to identify the critical parts of a project that compromise its reliability, so that the project team can reduce the overall risk of a specific event occurring. FTA is able to cover many, if not all, aspects of a project in terms of failure and success rate. Because it offers both a qualitative and a quantitative analysis of the project risks, team members and project managers can create a visual representation of the project and then use it to calculate the probabilities.

In particular, there are many strengths and benefits to be achieved from the FTA:

A graphical overview of the risks involved that could contribute to the failure of the project.
The systematic approach of an FTA forces the analyst to understand the system (and thereby the project) thoroughly.^[2]
Additional fault trees can be created for the same project, taking several top events into consideration.
The method offers a systematic approach that is still flexible enough to incorporate different types of causal factors such as human behavior and physical phenomena.^[1]
Logical analysis of events/risks and their interdependencies. By using logical gates it is possible to see the relations between the different events and get a better understanding of the project itself.^[3]
The possibility to reduce the risk of certain events by studying their probabilities. Furthermore, calculating the probability of the top event provides grounds for minimizing the risk of project failure.
Expanding the fault tree through the different AND/OR gates helps find additional events that can cause the project to fail, which otherwise may not have been found due to uncertainty.^[3]
The choice of a top event allows for multiple different perspectives of a project to be analysed.^[3] While time and budget events are the easiest to measure in terms of failure and success, it is still possible to analyse practically any other aspect of a project with adequately defined project goals or success criteria.^[3]
Minimal cut sets help the project team prioritize the critical events that are more likely to trigger the top event. This can also help the project manager save both time and money if they are able to correctly identify the risks that are most critical to the project and prioritize accordingly.^[1]^[12]
FTA can be used with big and complex^[2] projects, where it clarifies the interdependencies of the events, and it can be easily analysed using computer software.^[1]

Limitations

As with many other tools, FTA has certain weaknesses and limitations that need to be addressed as well:

Probabilities for certain events are often unknown due to uncertainty or lack of data. This can lead to inaccurate calculations of the probabilities of the top events, which can have an adverse effect on the project.^[1]^[2] The human element in itself is difficult to measure and quantify,^[2] which is especially important to consider when applying FTA to projects, as the human element plays a large role in project management.
It is not always possible to know whether all causal factors have been found for a particular top event or if more remain undiscovered.^[1]^[2]
As FTA is a static model, time-dimensions are not included which for many projects can be crucial.^[1]
As pointed out earlier, FTA is a systematic approach that allows flexible causal factors. However, because FTA only deals with binary states (fail/success), it may not be flexible enough for certain projects.^[1]
It is rather difficult to include conditional factors into the fault tree.^[1]
Failures of degree are often difficult to implement in FTA. For instance, when dealing with top events that are either time delays or budget overruns, an increase in time may not necessarily result in the project being delayed, and similarly an increase in project spending may not necessarily result in the budget being exceeded.^[1]^[2]
FTA can often be costly and time-consuming to apply.^[2]

References

[1] IEC/ISO 31010, "Risk management - Risk assessment techniques", 2009
[2] Richard E. Barlow, Jerry B. Fussell, Nozer D. Singpurwalla, "Reliability and Fault Tree Analysis", Society for Industrial and Applied Mathematics, 1975, p. 7-32
[3] Marcin Krysinski and George Anders, "Fault Tree Analysis in a Project Context", 2005
[4] Patrick D.T. O'Connor, "Practical Reliability Engineering", 3rd Edition, John Wiley & Sons, 1992, p. 152-156
[5] Silvianita, Dirgha S Mahandeka, and Daniel M Rosyid, "Fault Tree Analysis for Investigation on the Causes of Project Problems", 2015
[6] Ernest J. Henley, Hiromitsu Kumamoto, "Probabilistic Risk Assessment: Reliability Engineering, Design, and Analysis", IEEE Press, 1991, p. 44-48
[7] iprr.org - The Investigation Process Research Resource Site. Retrieved: September 17, 2016
[8] The Office of Scientific and Technical Information. Retrieved: September 17, 2016
[9] weibull.com - Reliability Engineering Resource Website (FTA history). Retrieved: September 16, 2016
[10] NASA - Fault Tree Handbook with Aerospace Applications. Retrieved: September 17, 2016
[11] weibull.com - Reliability Engineering Resource Website (FTA theory). Retrieved: September 16, 2016
[12] Robert Beresh, John Ciufo, and George Anders, "Basic Fault Tree Analysis for use in Protection Reliability", 2009

Annotated bibliography

IEC/ISO 31010, "Risk management - Risk assessment techniques", 2009

Annotation: A comprehensive collection of risk management theory, terminology, techniques and methods. The text has been created by two prominent organisations upholding the international standard. This text is good for getting a broader and general understanding of risk management. The theory and various risk management methods are described in a basic and accessible language for all types of practitioners, and many of the risk management methods are very relevant for the FTA. In particular methods like the Event Tree Analysis (ETA) and failure mode and effects analysis (FMEA) are described in detail which are closely related to FTA.

Marcin Krysinski, and George Anders, "Fault Tree Analysis in a Project Context", 2005

Annotation: This text offers a great in-depth analysis of how to apply FTA to projects. The authors use "delay of the project completion" as an example of how risk assessment can be used on projects. The article offers an in-depth analysis of the theory behind risk assessment in projects, as well as the different criteria that need to be upheld. The study also shows more elaborate equations and calculations that can be used when calculating the probabilities of failures, gives many examples of and criteria for top events from a project management perspective, and outlines how an FTA can relate to projects and be applied at a project management level.

Silvianita, Dirgha S Mahandeka, and Daniel M Rosyid, "Fault Tree Analysis for Investigation on the Causes of Project Problems", 2015

Annotation: The authors of this article focus on analysing project delays by using an FTA and on the different ways that an FTA has been applied. The authors used methods such as expert judgments to assess the risks and determine the probabilities of the different base events at a project management level, which has been used to describe the FTA from a project management perspective for this Wiki article.

Robert Beresh, John Ciufo, and George Anders, "Basic Fault Tree Analysis for use in Protection Reliability", 2009

Annotation: This article offers a brief description of the FTA and how it can be applied in various situations. It explains the construction of an FTA on a basic level that is very accessible and easy to read for practitioners who may have very limited knowledge of the subject. It furthermore includes very good descriptions of minimal cut sets, for which it has been used as a source in this article. This article is recommended for further reading on minimal cut set analysis.

Patrick D.T. O'Connor, "Practical Reliability Engineering", 3rd Edition, John Wiley & Sons, 1992

Annotation: This book offers a lot of technical analysis regarding the FTA. This source does not relate much to project management; however, it goes in-depth with many of the reliability techniques. If a practitioner needs more information regarding some of the more analytical aspects of the FTA, such as comprehensive calculations of cut sets and minimal cut sets, then this book is recommended. Furthermore, it includes many examples of technical types of top events.

Richard E. Barlow, Jerry B. Fussell, Nozer D. Singpurwalla, "Reliability and Fault Tree Analysis", Society for Industrial and Applied Mathematics, 1975

Annotation: The authors of this book did a great job reflecting on the FTA and discuss the relevant pros and cons of applying an FTA, which has been used for this Wiki article under the section "Strengths and weaknesses". The discussion of the FTA in this book further extends to its application by relating it to other risk management tools that can aid the analysis, as well as describing the reliability aspects of FTA by discussing subjects like minimal cut sets.

Ernest J. Henley, Hiromitsu Kumamoto, "Probabilistic Risk Assessment: Reliability Engineering, Design, and Analysis", IEEE Press, 1991

Annotation: This source offers a technical perspective on FTA, as well as on many other related risk analysis tools. It is very heavy on probabilistic and statistical calculations, which can be relevant to the quantitative analysis of project management events in an FTA. Any practitioner who needs elaborate calculation methods for the probabilities in an FTA can find the information in this book.
