Developed by Jacob Gjerstrup

Fault Tree Analysis

Fault tree analysis (hereafter FTA) is a technique primarily used within risk analysis. It has been around for about 50 years and has changed very little in that time, because what it does is basic yet powerful: FTA provides a visual representation of an undesired event (the top event) and of the events it depends on, allowing one to identify and analyse the factors that can contribute to it. The lowest-level contributing factors are called base events. Finally, it allows one to calculate the probability of the top event.

All in all, FTA is a very powerful tool for managing risks: it gives a good visualization of events and supports a disciplined, highly systematic, yet flexible approach to analysing these risks.

Big Idea

History

Fault tree diagrams were invented in 1962 at Bell Telephone Laboratories, on behalf of the US Air Force, in connection with the Minuteman ICBM launch control system. The technique was very successful and was subsequently adopted by the Boeing Company, then by the US Army and the US government. Today it is used widely in System Safety and Reliability Engineering, as well as in many other major fields of engineering, and it can be applied to almost any project that needs to know the effect of various events and how they connect with other events. FTA has not changed much in recent times. The biggest differences seem to be in which gates are included: for instance, the Danish standard's guide to FTA^[1] uses fewer gates and events than NASA's handbook^[2].

Concept and purpose

FTA is a top-down analysis: one identifies the undesired state, obtains information about the rest of the system and about how various events are related to the undesired state, and from this knowledge produces a fault tree. The tree consists of events and gates, with the undesired event at the top (hence also called the top event), base events at the bottom, and various other events distributed in between. All of these events are connected by gates, each with a specific purpose based in logic. To make this concrete, an example fault tree is shown in Figure 1.

Figure 1: A fault tree from a Shale Gas case depicting the risks associated with worker injury.

Figure 1 shows a fully developed fault tree. It is a rather large tree, but it shows how one can choose a top event and then, by developing each intermediate event, reach the bottom of the tree. The triangles in this particular case refer to other fault trees developed for the same case; including them would have made the tree even larger and would have been counterproductive.

Because the events in a fault tree are connected by logical gates, Boolean algebra can be applied to the tree once it has been fully developed. This allows for further analysis of the fault tree, possibly reducing its size through a method known as the "minimal cut set", which is explained later in this article. Boolean algebra is part of the purpose of FTA: it imposes well-defined rules while still giving the author of the fault tree a great degree of flexibility in depicting the top event and the entire system. The result is a good overview of all relations within the system that are connected to the top event, together with methods to reduce the complexity to the bare minimum. It also allows the probability of the top event to be calculated if the probabilities of the base events are known.^[3]
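As a minimal sketch of how the top event probability follows from the base event probabilities, the Python below applies the usual gate formulas. It assumes that all base events are statistically independent, which the text above does not state, and the function names and numbers are hypothetical illustrations rather than values from any of the figures.

```python
from math import prod

def and_gate(probabilities):
    """Probability that all independent input events occur."""
    return prod(probabilities)

def or_gate(probabilities):
    """Probability that at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)

# Hypothetical example: the top event occurs if base event A occurs
# together with at least one of base events B and C.
p_a, p_b, p_c = 0.1, 0.2, 0.3
p_top = and_gate([p_a, or_gate([p_b, p_c])])
print(round(p_top, 4))
```

The Or gate is computed through its complement: the output fails to occur only when every input fails to occur.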

Applications

FTA is a widely applicable concept that can be used in many different cases. Currently, it is primarily used within System Safety and Reliability Engineering, where it is applied by many organisations, including NASA^[2] and the US nuclear energy sector.^[4] Typically, there are two types of fault trees: "simple" fault trees that involve basic symbols, and advanced fault trees that involve more complicated symbols. Both are explained in the sections below.

Basic fault trees

Figure 2: The basic figures of a fault tree

A basic fault tree consists of six different symbols. Of these, two are gates and four are events, as seen in Figure 2.^[5]^[1]

And gate: An And gate has two or more inputs and one output. If all inputs are true, the output is true as well, causing the event above the gate. If just one input is false, the event above will not happen.
Or gate: An Or gate has the same structure as an And gate, but only one input has to be true to cause the event above. All inputs have to be false for the event above not to happen.
Base event: An event that is not analysed further, either because it cannot be broken down into further detail or because doing so would be counterproductive. In the example of Figure 1, the base events are the leaves at the bottom of the tree, denoted BE1, BE2, ..., BE13 for the 13 different base events in that specific case.
Event that is not analysed further: Usually an event for which data is lacking, so that further analysis would be meaningless.
Event that is analysed further: An intermediate event that is broken down further in the tree.
Event analysed on a different page: Used as a link to split a huge fault tree into smaller trees, allowing for a better overview.

These six symbols are used by first defining the top event (also known as the undesired event) and then breaking down which events could cause it. Once these events have been identified and connected to the top event through And or Or gates, each intermediate event is in turn broken down into new intermediate events, which are connected through gates in the same way. This process continues until further analysis is unproductive, at which point the remaining events are base events.
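The result of this process can be sketched as a small data structure. The Python below is an illustration only: the nested-tuple representation, the function name `occurs` and the three-event tree are assumptions made for this example, and are not taken from Figure 1 or Figure 2.

```python
def occurs(node, base_states):
    """Return True if the event described by `node` occurs.

    A node is either the name of a base event (a string) or a tuple
    (gate, [child, ...]) where gate is "AND" or "OR".
    """
    if isinstance(node, str):
        return base_states[node]
    gate, children = node
    results = [occurs(child, base_states) for child in children]
    if gate == "AND":
        return all(results)
    if gate == "OR":
        return any(results)
    raise ValueError(f"unknown gate: {gate}")

# Hypothetical tree: the top event needs BE1 and one of BE2 or BE3.
top_event = ("AND", ["BE1", ("OR", ["BE2", "BE3"])])

print(occurs(top_event, {"BE1": True, "BE2": False, "BE3": True}))   # True
print(occurs(top_event, {"BE1": False, "BE2": True, "BE3": True}))   # False
```

Each tuple plays the role of an intermediate event with its gate, and each string plays the role of a base event at which the breakdown stopped.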

Figure 3: Minimal cut set example

Once a fault tree has been finished, its minimal cut sets can be calculated. A minimal cut set is a smallest set of base events that, occurring together, will cause the top event. In the example of Figure 3, the minimal cut sets are {1, 2, 3} and {1, 2, 4}, since either of these sets of three events will cause the top event. What makes cut sets especially useful is that they give an easy overview of the shortest paths to the top event: each minimal cut set is sufficient to produce the top event, and every event in it is necessary for that set to do so. The minimal cut sets can then be used to generate a new, reduced fault tree and, provided that the probabilities of the base events are known, to calculate the probability of the top event. The details of these calculations are given in ^[6].
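One possible way to find minimal cut sets is sketched below in Python: expand the gates from the bottom up, then discard every cut set that contains a smaller one. Figure 3 is not reproduced here, so the tree used is an assumption, chosen so that it yields the two cut sets quoted above. The function names are likewise hypothetical.

```python
from itertools import product

def cut_sets(node):
    """Return the cut sets of `node` as a set of frozensets of base events."""
    if not isinstance(node, tuple):
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any cut set of any child is a cut set of the gate.
        return set().union(*child_sets)
    if gate == "AND":
        # Combine one cut set from every child.
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unknown gate: {gate}")

def minimal_cut_sets(node):
    """Drop every cut set that strictly contains another cut set."""
    sets = cut_sets(node)
    return {s for s in sets if not any(other < s for other in sets)}

# Assumed tree: top event = 1 AND 2 AND (3 OR 4).
tree = ("AND", [1, 2, ("OR", [3, 4])])
for cut in sorted(minimal_cut_sets(tree), key=sorted):
    print(sorted(cut))
```

This brute-force expansion is fine for small trees, but the number of intermediate cut sets can grow quickly with many And gates over Or gates, which is one reason dedicated tools are used for large trees.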

Advanced Fault trees

Once a user has understood how to make basic fault trees, there are several additional types of gates, as well as two additional events.^[5] These are as follows:

Gates:

Voting Or: Acts as a normal Or gate, except that "k" or more input events must be true before the output occurs. Symbol: like an Or gate with "k" in the middle, where k is an integer giving the number of inputs that must occur for the output to happen.
XOR, also known as Exclusive Or: A gate whose output happens if and only if exactly one input is true and all other inputs are false. Symbol: an Or gate drawn inside an And gate.
Priority And: The output occurs only if the inputs happen in a specific sequence. Depiction varies, but it is typically drawn like an And gate with an extra flat line at the bottom.
Inhibit: The output happens if the input occurs while an enabling condition is also true. Depiction: a hexagon with inputs from below, the output on top, and conditioning events from the side.

Events:

External Event: An event that is assumed either always to occur or never to occur, and that therefore traditionally has a fixed probability of 1 or 0. Depiction: a square with a triangle on top.
Conditioning Event: This type of event typically occurs in combination with Inhibit gates, but it can be used with any other gate as well, setting a specific condition for the gate it is applied to. Depiction: an ellipse lying on its side.
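The logic of the four advanced gates listed above can be sketched as plain Boolean functions. The Python below is an illustration under assumptions: the function names are hypothetical, and Priority And is modelled by giving each input an occurrence time and requiring strictly increasing times, which is only one way to represent "a specific sequence".

```python
def voting_or(inputs, k):
    """True when at least k of the inputs are true."""
    return sum(bool(x) for x in inputs) >= k

def exclusive_or(inputs):
    """True when exactly one input is true."""
    return sum(bool(x) for x in inputs) == 1

def priority_and(occurrence_times):
    """True when every input occurred and did so in the listed order.

    Each entry is the time at which the input occurred, or None if it
    never occurred.
    """
    if any(t is None for t in occurrence_times):
        return False
    return all(a < b for a, b in zip(occurrence_times, occurrence_times[1:]))

def inhibit(input_event, condition):
    """True when the input occurs while the enabling condition holds."""
    return input_event and condition

print(voting_or([True, False, True], k=2))   # True
print(exclusive_or([True, False, True]))     # False
print(priority_and([1.0, 2.5, 4.0]))         # True
print(priority_and([2.5, 1.0, 4.0]))         # False
print(inhibit(True, False))                  # False
```

The `condition` argument of `inhibit` corresponds to a conditioning event attached to the gate.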

Strengths

FTA possesses several strengths. These are as follows:

Highly systematic, disciplined, flexible approach
Attention on failures directly related to top event
Displays all interfaces and interactions in systems
Easy understanding of the cause and effect
Provides a method to do logic analysis on the top event

To elaborate, FTA takes a highly systematic, disciplined approach to modelling. Such models are typically inflexible when many different factors must be modelled, but FTA accommodates them and thus remains flexible as well. The structure of FTA brings several further strengths. First, because FTA is a top-down approach, the attention of the analysis is automatically focused on failures directly related to the top event. Second, since the structure displays all interfaces and interactions in the analysed system, it is very useful for systems that have many of them, simply because it gives a clear overview of how they interact. More generally, the structure of FTA gives the viewer an easy understanding of cause and effect in any system it is applied to, although larger systems may need to be split into several trees to preserve this property. Finally, because events have only binary states, logic analysis can be applied to the fault tree. This allows the minimal cut sets to be found, which is a simple way of finding failure pathways that might otherwise have been missed.^[1]

Limitations

Just as FTA has several strengths, it also has several limitations. Below follows a brief overview of each limitation, with an elaboration on each.

Uncertainties in the top event: The probability of the top event is calculated from the probabilities of the base events and the events connecting them. If the probabilities of the base events are not known accurately, this uncertainty carries through to the rest of the tree, including the top event.^[1]
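To illustrate how such uncertainty carries through, the Python sketch below propagates a lower and an upper bound for each base event probability up to the top event. It assumes independent base events and uses hypothetical function names and numbers. Evaluating the gates at the bounds is valid here because both gate formulas increase when any input probability increases.

```python
from math import prod

def and_bounds(intervals):
    """Bounds on P(all inputs occur) for independent inputs."""
    return (prod(lo for lo, _ in intervals), prod(hi for _, hi in intervals))

def or_bounds(intervals):
    """Bounds on P(at least one input occurs) for independent inputs."""
    low = 1.0 - prod(1.0 - lo for lo, _ in intervals)
    high = 1.0 - prod(1.0 - hi for _, hi in intervals)
    return (low, high)

# Hypothetical base events, each known only to within a range.
be1 = (0.05, 0.15)
be2 = (0.10, 0.30)
be3 = (0.20, 0.40)

low, high = and_bounds([be1, or_bounds([be2, be3])])
print(f"top event probability lies between {low:.3f} and {high:.3f}")
```

In this example the upper bound on the top event is roughly six times the lower bound, even though no single base event range spans more than a factor of three.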

The whole picture is not discovered: Sometimes causal events are not discovered, or intermediate events are missing, so that the fault tree does not cover the entire system. This prevents probability analysis until these events are discovered or, at the very least, forces one to recalculate the probabilities when more events show up.^[1]

FTA is a static model: A fault tree is a static model, so time is not taken into account.^[1]

Fault trees only possess binary states: Every event in a fault tree either occurs or does not occur, so partial failures cannot be represented. For example, a component that fails only partially, such as a tank in Figure 1 that ruptures (and thus fails) without any oil spilling out, cannot be depicted.^[1]

Human error is not easily included: Since human error varies greatly, and since fault trees only possess binary states, one either has to include many different events to account for possible human failure, which clutters the diagram, or simplify it to a single "Human error" event, which does not show the complete picture. As such, showing human error in fault trees is not easily done.^[1]

Large: As systems grow more complex, so do their fault trees. In modern times, where many systems are interlinked, fault trees can easily become very large and complex to generate, understand and work with. Various computer tools can reduce this effect, but such tools must first be obtained and understood, which detracts from the simplicity of FTA. Furthermore, a tree can be split into multiple trees through the "event analysed on a different page" symbol, as in Figure 1, but even with this, Figure 1 is still a large and complex tree.

FTAs do not easily depict domino effects: Domino effects, which are low-probability, high-consequence accidents, are not easily depicted in FTA. All symbols are the same size and no symbol shows how severe the consequences are; a fault tree only shows the probability, which by the very nature of these effects is low.^[1]

Conclusion

All in all, FTA is a very powerful tool that can be applied to many different areas. It does have some limitations, mainly that fault trees can easily become huge and confusing, thus defeating one of their prime purposes, which is to provide an overview. That said, the strengths more than make up for those limitations that cannot be worked around. Furthermore, FTA is widely used, meaning that there are many computer tools for assisting in developing fault trees, making it much easier to calculate the probability of the top event and the minimal cut sets. Finally, these tools can also check the consistency of the fault tree, making sure that the fault tree created actually makes sense.

Annotated Bibliography

http://asq.org/quality-progress/2002/03/problem-solving/what-is-a-fault-tree-analysis.html : Provides a brief overview of the fault tree analysis concept, as well as another example of FTA.

http://www.weibull.com/basics/fault-tree/ : Provides an intermediate overview of FTA, both in terms of concept and application. Also provides further references for more reading.

http://www.hq.nasa.gov/office/codeq/doctree/fthb.pdf : NASA's handbook on building fault trees, a very in-depth description of the concepts as well as how to apply them.

http://www.weibull.com/hotwire/issue63/relbasics63.htm : Provides a more in-depth description of what a minimal cut set is.

[1] Risk management - Risk Assessment Techniques, Dansk Standard, 2010
[2] NASA: Fault Tree Handbook with Aerospace Applications, accessed 21/9/2015
[3] QP: What is a Fault Tree Analysis?, accessed 21/9/2015
[4] US NRC: Fault Tree Handbook, accessed 28/9/2015
[5] Weibull: Fault Tree Analysis, accessed 13/9/2015
[6] Weibull: Minimal Cut Set, accessed 13/9/2015

