# Welcome To

Welcome to the R&S Training Course, CERN, February 2002

Reliability and Safety (R&S) Training Course

P. Kafka, ESRA, Reconsult

## Content

- Module 1: Basic Elements in Reliability Engineering
- Module 2: Interrelations of Reliability & Safety (R&S)
- Module 3: The ideal R&S Process for Large Scale Systems
- Module 4: Some Applications of R&S on LHC
- Module 5: Lessons Learned from R&S Applications in various Technologies

## Module 1: Basic Elements in Reliability Engineering

Content:

- Short R&S History
- Some Basic Terms
- A few Definitions and Formalisms
- From Components to Systems
- Important Methods
- Common Cause Failures
- Human Factor Issues
- Types of Uncertainties

## Module 1: Short History of R&S as Synonym of Risk

Risk is a very old term (Pericles, 430 BC): "the worst thing is to rush into actions before the consequences have been properly debated", and "the Athenians are capable at the same time of taking risk and estimating before-hand".

Time axis (two-digit years as given on the slide):

- Trial and Error Approach (00 - 40)
- Worst Case, Safety Case Studies (40 -)
- Recognition of Stochastic Events (40)
- Development of Reliability Theory (40 -)
- Reliability Studies for Complex Systems (50 -)
- Comprehensive Risk Studies (70 -)
- Global Risk Management, based on Goal, Assignment, Proof (90 -)
- Risk Informed Decision Making (95 -)
- Risk Studies for Large Scale Test Facilities, just in the beginning (00 -)

## Module 1: Some Basic Terms

- Reliability: The ability of an item to operate under designated operating conditions for a designated period of time or number of cycles. Remark: the ability of an item can be expressed as a probability, or it can be stated deterministically.
- Availability: The probability that an item will be operational at a given time. Remark: mathematically, the availability of an item is a measure of the fraction of time that the item is in operating condition in relation to total or calendar time.
- Maintainability: The probability that a given active maintenance action, for an item under given conditions of use, can be carried out within a stated time interval when the maintenance is performed under stated conditions and using stated procedures and resources (IEC 60050). Remark: probabilistic definition.
- Safety: Freedom from unacceptable risk of harm. Remark: very vague definition.
- RAMS: An acronym for the combination of Reliability, Availability, Maintainability and Safety.

Today's understanding for purists:

[Figure: diagram relating Reliability, Availability, Dependability, Maintainability and Safety.]

- Hazard: A physical situation with a potential for human injury, damage to property, damage to the environment or some combination of these.
- Individual Risk: The frequency at which an individual may be expected to sustain a given level of harm from the realisation of specified hazards.
- Social Risk: The frequency with which a specified number of people in a given population, or the population as a whole, sustain a specified level of harm from the realisation of specified hazards.

## Module 1: A few Definitions and Formalisms

For non-repaired items, the reliability function is

$$R(t) = \exp\left[-\int_0^t \lambda(x)\,dx\right] = \int_t^\infty f(x)\,dx$$

where $\lambda(x)$ is the instantaneous failure rate of the item and $f(x)$ is the probability density function of the time to failure of the item.

When $\lambda(t) = \lambda$ = constant, i.e. when the (operating) time to failure is exponentially distributed,

$$R(t) = \exp(-\lambda t)$$

Example: For an item with a constant failure rate of one occurrence per operating year and a required time of operation of six months, the reliability is given by

$$R(6\,\text{months}) = \exp(-1 \times 6/12) = 0.6065$$
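The constant-failure-rate case can be checked numerically. The following is a minimal sketch, assuming the failure rate and the time are given in the same time unit; the function name `reliability` is chosen here for illustration and does not come from the course material.

```python
import math


def reliability(failure_rate: float, time: float) -> float:
    """Survival probability R(t) = exp(-lambda * t) for a constant failure rate."""
    if failure_rate < 0 or time < 0:
        raise ValueError("failure_rate and time must be non-negative")
    return math.exp(-failure_rate * time)


# One failure per operating year, required operating time of six months (0.5 years).
r_six_months = reliability(failure_rate=1.0, time=6 / 12)
print(f"R(6 months) = {r_six_months:.4f}")
```

The same function can be reused for any mission time, as long as the failure rate is converted to the matching unit first.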

The failure rate normally follows the so-called bathtub curve.

[Figure: bathtub curve of failure rate over time, with the phases debugging, utilisation phase (constant failure rate) and wearout.]

Failure rates are often published in data books. A few examples:

- Offshore Reliability Data; OREDA Handbook; 2nd Edition; distributed by Det Norske Veritas Industri Norge AS; DNV Technica 1992; ISBN 82 515 0188 1
- Handbook of Reliability Data for Electronic Components; RDF 93, English Issue 1993; Copyright France Telecom CNET 1993
- Reliability Data of Components in Nordic Nuclear Power Plants; T-book, 3rd Edition; Vattenfall AB; ISBN 91-7186-294-3
- EUREDATA; published by the Joint Research Centre (JRC), Ispra, Italy

For links to data sources, see the ESRA homepage.

For non-repaired items: if observed failure data are available for $n$ non-repaired items with constant failure rate, then the estimated value of $\lambda$ is given by

$$\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} \mathrm{TTF}_i}$$

where $\mathrm{TTF}_i$ is the time to failure of item $i$.

Example: For 10 non-repaired items with a constant failure rate, the observed total operating time to failure of all the items is 2 years. Hence $\hat{\lambda} = 10/2 = 5$ failures per year.
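As an illustration of the estimator for non-repaired items, here is a short sketch. The ten individual times to failure are hypothetical values invented for this example; only their total of 2 years matches the slide.

```python
import math


def estimate_failure_rate(times_to_failure):
    """Point estimate lambda = n / sum(TTF_i), valid for a constant failure rate."""
    n = len(times_to_failure)
    total_time = math.fsum(times_to_failure)
    if n == 0 or total_time <= 0:
        raise ValueError("need at least one positive time to failure")
    return n / total_time


# Hypothetical times to failure in years for 10 items (they sum to 2 years).
observed_ttf_years = [0.05, 0.10, 0.10, 0.15, 0.20, 0.20, 0.25, 0.25, 0.30, 0.40]
lambda_hat = estimate_failure_rate(observed_ttf_years)
print(f"estimated failure rate: {lambda_hat:.2f} failures per year")
```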

For non-repaired items, the Mean Time To Failure is

$$\mathrm{MTTF} = \int_0^\infty R(t)\,dt$$

When the time to failure is exponentially distributed, $\mathrm{MTTF} = 1/\lambda$.

Example: For a non-repaired item with a constant failure rate of two failures per four years of operating time, $\mathrm{MTTF} = 1/(2/4) = 2$ years $= 17\,520$ h.

For repaired items with zero time to restoration, the reliability function is given by

$$R(t_1, t_2) = R(t_2) + \int_0^{t_1} R(t_2 - t)\,z(t)\,dt$$

where $R(t_2)$ represents the probability of survival to time $t_2$, and the second term represents the probability of failing at time $t$ ($t < t_1$) and, after immediate restoration, surviving to time $t_2$. $z(t)$ is the instantaneous failure intensity (renewal density) of the item, i.e. $z(t)\,dt$ is approximately the (unconditional) probability that a failure of the item occurs during $(t, t + dt)$.

Example: For a repaired item with a constant failure rate of one failure per operating year and a required time of operation without failure of six months, the reliability is given by

$$R(t, t + 6\,\text{months}) = \exp(-1 \times 6/12) = 0.6065$$

For repaired items with zero time to restoration, the Mean Time To Failure is given by

$$\mathrm{MTTF} = \int_0^\infty R(t)\,dt$$

When observed operating times to failure of $n$ items are available, an estimate of MTTF is given by

$$\widehat{\mathrm{MTTF}} = \frac{\text{total operating time}}{k_F}$$

where $k_F$ is the number of observed failures.

Example: For a repaired item with a constant failure rate of 0.5 failures per year, $\mathrm{MTTF} = 1/(0.5/1) = 2$ years $= 17\,520$ h.

Consider: if a repaired item with zero time to restoration operates continuously, and if the times to failure are exponentially distributed, three often used terms are equal:

$$\mathrm{MTTF} = \mathrm{MTBF} = \mathrm{MUT} = 1/\lambda$$

- MTTF: Mean Time To Failure
- MTBF: Mean Time Between Failures
- MUT: Mean Up Time
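The relation between the integral definition of MTTF and the shortcut $1/\lambda$ can be verified numerically. This is a sketch only: the trapezoidal integration, the upper limit of 60 years and the 8760 hours per year are assumptions made for this example.

```python
import math

HOURS_PER_YEAR = 8760  # assumption: non-leap year, continuous operation


def mttf_from_rate(failures: float, operating_time: float) -> float:
    """MTTF = 1 / lambda, with lambda = failures / operating_time."""
    return operating_time / failures


def mttf_numeric(reliability, upper_limit: float, steps: int = 200_000) -> float:
    """Trapezoidal approximation of the integral of R(t) from 0 to upper_limit."""
    h = upper_limit / steps
    total = 0.5 * (reliability(0.0) + reliability(upper_limit))
    for k in range(1, steps):
        total += reliability(k * h)
    return total * h


# Two failures per four years of operating time, as in the example.
mttf_years = mttf_from_rate(failures=2, operating_time=4)
failure_rate = 1.0 / mttf_years
mttf_check = mttf_numeric(lambda t: math.exp(-failure_rate * t), upper_limit=60.0)

print(f"MTTF = {mttf_years:.1f} years = {mttf_years * HOURS_PER_YEAR:.0f} h")
print(f"numerical integral of R(t): {mttf_check:.4f} years")
```

The upper limit only needs to be large compared with the MTTF, so that the neglected tail of $R(t)$ is negligible.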

Repaired items with non-zero time to restoration

The reliability of a repaired item with non-zero time to restoration for the time interval $(t_1, t_2)$ may be written as

$$R(t_1, t_2) = R(t_2) + \int_0^{t_1} R(t_2 - t)\,\nu(t)\,dt$$

where the first term $R(t_2)$ represents the probability of survival to time $t_2$, and the second term represents the probability of restoration (after a failure) at time $t$ ($t < t_1$) and surviving to time $t_2$. $\nu(t)$ is the instantaneous restoration intensity of the item.

When the times to failure are exponentially distributed, then

$$R(t_1, t_2) = A(t_1)\,\exp[-\lambda (t_2 - t_1)]$$

where $A(t_1)$ is the instantaneous availability at time $t_1$, and

$$\lim_{t \to \infty} R(t, t + x) = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}}\,\exp(-\lambda x)$$

When times to failure and times to restoration are exponentially distributed, then, using either Markov techniques or the Laplace transformation, the following expression is obtained:

$$R(t_1, t_2) = \left(\frac{\mu_R}{\lambda + \mu_R} + \frac{\lambda}{\lambda + \mu_R}\,\exp[-(\lambda + \mu_R)\,t_1]\right)\exp[-\lambda (t_2 - t_1)]$$

and

$$\lim_{t \to \infty} R(t, t + x) = \frac{\mu_R}{\lambda + \mu_R}\,\exp(-\lambda x)$$

Example: For an item with $\lambda = 2$ failures per operating year, a restoration rate of $\mu_R = 10$ restorations per (restoration) year, and $x = 1/4$:

$$\lim_{t \to \infty} R(t, t + 1/4) = \frac{10}{12}\,\exp(-2 \times 1/4) = 0.505$$

We can define an asymptotic mean availability $\bar{A}$ of an item:

$$\bar{A} = \lim_{t_2 \to \infty} \bar{A}(t_1, t_2) = A = \frac{\mathrm{MUT}}{\mathrm{MUT} + \mathrm{MTTR}}$$

where MTTR is the Mean Time To Repair.

Example: For a continuously operating item with a failure rate of $\lambda = 2$ failures per operating year and a restoration rate of $\mu_R = 10$ restorations per year, the mean availability over the first quarter of a year is

$$\bar{A}(0, \tfrac{1}{4}) = \frac{10}{12} + \frac{2}{144} \cdot \frac{\exp(-12 \times 0) - \exp(-12 \times \tfrac{1}{4})}{\tfrac{1}{4} - 0} = 0.886$$

and, for $t_2 \to \infty$, the asymptotic value is $\bar{A} = 10/12 = 0.833$.

For additional formulas see, for example, the following textbooks (a random sample of useful books):

- Birolini, A., Quality and Reliability of Technical Systems; Springer, 2nd Edition, 1997; ISBN 3-540-63310-3
- Hoyland, A., & Rausand, M., System Reliability Theory; John Wiley & Sons, 1994; ISBN 0-471-59397-4
- Modarres, M., Reliability and Risk Analysis; Marcel Dekker, Inc., NY, 1993; ISBN 0-8247-8958-X
- Schrüfer, E., Zuverlässigkeit von Meß- und Automatisierungseinrichtungen; Hanser Verlag, 1984; ISBN 3-446-14190-1
- Knezevic, J., Systems Maintainability; Chapman & Hall, 1997; ISBN 0 412 80270 8
- Lipschutz, S., Probability; Schaum's Outline Series, McGraw-Hill Book Company, 1965; ISBN 07-037982-3
- IEC 61703, Ed 1: Mathematical Expressions for Reliability, Availability, Maintainability and Maintenance Support Terms, 1999; http://www.dke.de

Types of Maintenance

[Figure: classification of maintenance into preventive maintenance (time dependent, condition dependent) and corrective maintenance; reliability centred maintenance is shown as the intersection of theory and practice.]

Maintainability Measures

Probability of Task Completion:

$$PTC_{DMT} = P(DMT \le T_{st}) = \int_0^{T_{st}} m(t)\,dt$$

- $T_{st}$: stated time for task completion
- $m(t)$: probability density function of DMT

Mean Duration of Maintenance Task:

$$MDMT = E(DMT) = \int_0^\infty t\,m(t)\,dt$$

where $E(DMT)$ is the expectation of the random variable DMT.

Maintenance and the Exponential Distribution

$$m(t) = \frac{1}{A_m}\,\exp\left(-\frac{t}{A_m}\right), \quad t > 0$$

In the case of an exponential probability distribution, the distribution function is

$$M(t) = P(DMT \le t) = 1 - \exp\left(-\frac{t}{A_m}\right)$$

- DMT: Duration of Maintenance Task
- $A_m$: scale parameter of the exponential distribution, equal to MDMT

Example: On average it takes 10 days to restore a specific machine. Find the chance that less than 5 days will be enough to successfully complete the restoration.

Solution: $m(t) = (1/10)\,\exp(-t/10)$ and

$$P(DMT \le 5) = M(5) = 1 - \exp(-5/10) = 1 - 0.61 = 0.39$$
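The restoration example can be reproduced with a few lines of code. This is a minimal sketch assuming exponentially distributed maintenance durations; the function name is illustrative only.

```python
import math


def prob_task_completed(t: float, mean_duration: float) -> float:
    """M(t) = P(DMT <= t) = 1 - exp(-t / A_m) for an exponential DMT, A_m = MDMT."""
    if t < 0 or mean_duration <= 0:
        raise ValueError("t must be >= 0 and mean_duration must be > 0")
    return 1.0 - math.exp(-t / mean_duration)


# Mean restoration time of 10 days; chance of finishing within 5 days.
p_within_5_days = prob_task_completed(t=5.0, mean_duration=10.0)
print(f"P(DMT <= 5 days) = {p_within_5_days:.2f}")
```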

## Module 1: From Components to Systems

We have to recall some basic laws of probability:

- If A and B are mutually exclusive events, then the probability that either of them occurs in a single trial is the sum of their probabilities: $\Pr\{A + B\} = \Pr\{A\} + \Pr\{B\}$
- If two events A and B are general, the probability that at least one of them occurs is: $\Pr\{A + B\} = \Pr\{A\} + \Pr\{B\} - \Pr\{AB\}$
- Two events A and B are statistically independent if and only if $\Pr\{AB\} = \Pr\{A\}\,\Pr\{B\}$
- Bayes' theorem: $\Pr\{A_i \mid B\} = \Pr\{A_i\}\,\Pr\{B \mid A_i\} \,/\, \sum_k \Pr\{B \mid A_k\}\,\Pr\{A_k\}$

For more, see e.g. Seymour Lipschutz, Theory and Problems of Probability, Schaum's Outline Series, McGraw-Hill Book Company.

Serial System

[Figure: two blocks in series, each with MTBF = 80 h, $\lambda$ = 1/80 = 0.0125 1/h, R = 0.9.]

$$R_S = \prod_{i=1}^{n} R_i$$

We know:

- $R(t) = e^{-\lambda t} = 1 - Q(t)$
- $Q(t) = 1 - R(t)$
- $Q_{av} \approx \lambda t / 2$
- $\lambda = 1/\mathrm{MTBF}$ [h$^{-1}$]
- MTBF = operational time / number of stops
- MTTR = sum of repair time / number of repairs

For the system we obtain:

- $\lambda_S = \sum \lambda_i = 0.0125 + 0.0125 = 0.025$ 1/h
- $\mathrm{MTBF}_S = 1/(1/\mathrm{MTBF} + 1/\mathrm{MTBF}) = 1/(1/80 + 1/80) = 40$ h
- $R_S = R \times R = 0.9 \times 0.9 = 0.81$
- $Q_S = Q + Q - (Q \times Q) = 0.1 + 0.1 - 0.01 = 0.19 = 1 - 0.81$

Parallel System

[Figure: two blocks in parallel, each with MTBF = 80 h, $\lambda$ = 1/80 = 0.0125 1/h, R = 0.9.]

$$R_S = 1 - \prod_{i=1}^{n} (1 - R_i)$$

The component relations listed for the serial system apply unchanged. For the system we obtain:

- $\lambda_S = 2\lambda/3 = 0.0083$ 1/h
- $\mathrm{MTBF}_S = 80 + 80 - 1/(1/80 + 1/80) = 120$ h
- $R_S = 1 - [(1 - R) \times (1 - R)] = 1 - (1 - 0.9) \times (1 - 0.9) = 0.99$
- $R_S = R + R - R \times R = 0.9 + 0.9 - 0.9 \times 0.9 = 0.99$
- $Q_S = Q \times Q = 0.1 \times 0.1 = 0.01$

Mixed System

[Figure: mixed system consisting of a parallel pair (0.95, 0.99), in series with a single block (0.98), in series with a parallel arrangement of a series pair (0.99, 0.97) and a single block (0.90).]

For the system we obtain:

$$R_S = [1 - (1 - 0.95)(1 - 0.99)] \times 0.98 \times \{1 - [(1 - 0.99 \times 0.97) \times (1 - 0.90)]\} = 0.9995 \times 0.98 \times 0.99603$$

$R_S = 0.97562 \approx 0.97$ (conservatively rounded down).

The unreliability is $Q_S = 1 - R_S = 0.02438 \approx 0.03$ (conservatively rounded up).
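The three block diagram calculations above can be reproduced with two small helper functions. This is a sketch assuming statistically independent blocks; the helper names `series` and `parallel` are chosen for this example.

```python
from math import prod


def series(*reliabilities: float) -> float:
    """Reliability of independent blocks in series: product of R_i."""
    return prod(reliabilities)


def parallel(*reliabilities: float) -> float:
    """Reliability of independent redundant blocks: 1 - product of (1 - R_i)."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


r_serial = series(0.9, 0.9)
r_parallel = parallel(0.9, 0.9)

# Mixed system: parallel pair, single block, then (series pair) parallel to one block.
r_mixed = series(
    parallel(0.95, 0.99),
    0.98,
    parallel(series(0.99, 0.97), 0.90),
)

print(f"serial:   {r_serial:.2f}")
print(f"parallel: {r_parallel:.2f}")
print(f"mixed:    {r_mixed:.5f}, unreliability {1.0 - r_mixed:.5f}")
```

Because the helpers nest, any series-parallel structure can be written down directly from its block diagram.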

Nowadays we calculate reliability characteristics by means of commercial PC programs such as:

- Cafta (USA)
- Care (Israel)
- Item Software (UK)
- Isograph (UK)
- Relex (USA)
- Risk Spectrum (S)
- Saphire (USA)

For further information, look for software presentations at ESREL conference sites, e.g. ESREL99, ESREL2001, ESREL2002.

## Module 1: Important Methods

FMEA principle: it represents a qualitative structure of what can happen and why.

[Figure: FMEA chain from cause to failure mode of an item (modes 1 to n) to effect.]

Nowadays FMEA is a semi-quantitative procedure using a three-parameter grading system, the RPZ (Risikoprioritätszahl, i.e. risk priority number).

Using failure rates we can perform the fault tree quantification.

Fault tree principle: a qualitative structure of how the system fails.

[Figure: fault tree with the top event "Cooling fails", built from OR gates over the events "No impulse", "Ventilator blocked", "Motor does not start" and "Switch does not close".]

Large fault trees consist of 5000 function elements.

Simplified cut set example

Top event: TE = A + BD + BE + CD + CE

[Figure: fault tree in which TE is an OR of A and (BD + BE + CD + CE); the latter is an AND of (B + C) and (D + E), each of which is an OR of its two basic events.]

If A, B, C, D, E = 0.1 and the basic events are independent, then TE = 1 - (1 - 0.1)(1 - 0.19 × 0.19) ≈ 0.13; the rare-event approximation (sum of the cut set probabilities) gives 0.14.

Using probabilities we can perform the event tree quantification.

Event tree principle: it represents a qualitative structure of what can happen.

[Figure: event tree starting from an initiating event IE_i (e.g. cooling pipe break), branching at the event sequence conditions ES_k1 and ES_k2 with the probabilities p(yes) and p(no), and ending in the plant damage states PDS_j.]

Large event trees consist of dozens of branches.

PSA model: an integrative model of event trees and fault trees.

[Figure: PSA model linking initiating events (IEs 1 to i), system functions (1 to j) and fault trees (1 to k, e.g. an OR gate over the basic events "pump does not start" and "valve does not open") to consequences described by type, amount and frequency.]

Large PSA models consist of about fifty event trees and a hundred fault trees.

Markov Modelling / Chains

Three types:

- Homogeneous Continuous Time Markov Chain
- Non-Homogeneous Continuous Time Markov Chain
- Semi-Markov Models

Pros:

- very flexible capability
- good for repair
- good for standby spares
- good for sequence dependencies
- good for different types of fault coverage, error handling and recovery

Cons:

- can require a large number of states
- modelling is relatively complex
- the model is often different from the physical or logical organisation of the system

Markov Modelling / Chains: Simple Example

- Control system
- Two processors: 1 active, 1 hot backup
- Fault coverage may be imperfect
- c = Pr{fault detected and recovery is successful, given that a processor fault occurs}
- 1 - c = Pr{fault is not detected or recovery is unsuccessful, given that a processor fault occurs}
- $\lambda$ = failure rate
- $\mu$ = repair rate

[Figure: state transition diagram with the states 2, 1 and F; state 2 is left with the rates $2c\lambda$ (covered fault) and $2(1 - c)\lambda$ (uncovered fault).]

Structural Reliability (simplified one-dimensional case)

[Figure: probability density functions (pdf) of stress and strength over an axis in N/mm²; the distance between the two distributions corresponds to the safety factor, and their overlap is a measure of the probability of failure.]

## Module 1: Common Cause Failures

Types of Failures of Items

[Figure: classification of the failures of items into random failures and dependent failures, with the labels: failure of a few items, consecutive failures, failures of identical items, functional failures, design failures, environmentally caused failures.]

One has to model these different types!

The Boolean representation of a three-component system considering Common Cause Failures (CCF) reads as follows:

$$A_T = A_i + C_{AB} + C_{AC} + C_{ABC}$$

- $A_T$: total failure of component A
- $A_i$: failure of component A from independent causes
- $C_{AB}$: failure of components A and B (and not C) from a common cause
- $C_{AC}$: equivalent

The simple single-parameter model, called the $\beta$-factor model, looks like

$$Q_m = \beta \cdot Q_t$$

with e.g. $\beta = 0.1$. In other words, 10% of the unavailability of a system would be caused by common cause failures.

Some other models are shown on the next slide.

[Figure: overview of other CCF models.]

## Module 1: Human Factor Issues

Human factor issues are massively involved in R&S technology:

- Human operator reliability in control rooms
- Human reliability in maintenance work
- Human reliability in abnormal, accidental and emergency conditions
- Man-machine effectiveness
- Human operators in control loop systems
- Ergonomics for control, supervision and maintenance of systems

HR models of the first generation:

- THERP (Technique for Human Error Rate Prediction)
- HCR (Human Cognitive Reliability Model)
- PHRA (Probabilistic Human Reliability Analysis)
- SLIM (Success Likelihood Index Method)

Within THERP, the so-called HRA action tree represents the procedure used for estimating probabilities.

[Figure: HRA action tree with success branches (lower-case letters) and failure branches (upper-case letters), ending in the outcomes "Success" and "Failure".]

$$P_{tot} = A + (aB) + (abCD) + (abCdE) + (abcE)$$

HR models of the second generation:

- ATHEANA (NRC)
- CREAM (Halden)
- MERMOS (EDF)
- FACE (VTT)
- and many others

These models are more cognitively oriented than the first-generation models.

The challenge nowadays is the estimation of HEPs for errors of commission. For errors of omission, a soundly based tool box and validated data are available.

## Module 1: Software Issues

Why is Software Reliability Prediction (SRP) needed?

- Amount and importance of software is increasing
- Software accounts for approximately 80 % of switch failures
- Software reliability is not improving fast
- Software is costly to fix
- Motivation, pressure and number of experts for doing SRP are limited

Basic questions in SRP:

- At what rate do failures occur?
- What is the impact of these failures?
- When will faults be corrected?

Important definitions:

- Failure: An event in which the execution of a software system produces behaviour which does not meet customer expectation (functional performance).
- Fault: The part of the software system which must be repaired to prevent a failure.

If we have observed data, we can calculate the failure intensity (failure rate) $\lambda(t)$, provided a logarithmic Poisson distribution is suitable:

$$\lambda(t) = \frac{a}{b\,t + 1}$$

The parameters to be estimated are $a$ and $b$. For that we need the likelihood function, i.e. the probability that the observed data occur:

$$L(\text{data}) = \prod_j \Pr\{y_j \text{ failures in period } j\}$$

Example:

| Period j | System month t_j | Number of failures y_j |
|---|---|---|
| 1 | 23 | 55 |
| 2 | 52 | 62 |
| 3 | 89 | 47 |
| 4 | 137 | 52 |
| 5 | 199 | 56 |
| 6 | 279 | 42 |
| 7 | 380 | 47 |
| 8 | 511 | 49 |

Parameter estimates: $a = 2.93$; $b = 0.016$, thus

$$\lambda(t) = \frac{2.93}{0.016\,t + 1}$$

Estimate of the failure intensity at 1000 system months:

$$\lambda(1000) = \frac{2.93}{0.016 \times 1000 + 1} = 0.17 \text{ failures per system month}$$

Estimate of the mean cumulative number of failures at 5000 system months:

$$\frac{2.93}{0.016}\,\ln(0.016\,t + 1) = \frac{2.93}{0.016}\,\ln(0.016 \times 5000 + 1) = 805 \text{ failures}$$

Today's references: IEC 61508; Bellcore publications; plus handout.
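The two estimates of the software example can be reproduced as follows. The sketch also evaluates the log-likelihood of the tabulated data. It assumes that the failure counts per period are Poisson distributed with mean $\mu(t_j) - \mu(t_{j-1})$, where $\mu(t) = (a/b)\ln(bt + 1)$, and that the system months in the table are cumulative. These assumptions are a common reading of the logarithmic Poisson model and are not spelled out in the course material.

```python
import math

A_HAT, B_HAT = 2.93, 0.016  # parameter estimates quoted in the example

# (cumulative system months t_j, number of failures y_j in period j)
DATA = [(23, 55), (52, 62), (89, 47), (137, 52),
        (199, 56), (279, 42), (380, 47), (511, 49)]


def failure_intensity(t, a=A_HAT, b=B_HAT):
    """lambda(t) = a / (b * t + 1)."""
    return a / (b * t + 1.0)


def mean_cumulative_failures(t, a=A_HAT, b=B_HAT):
    """mu(t) = (a / b) * ln(b * t + 1), the integral of lambda from 0 to t."""
    return (a / b) * math.log(b * t + 1.0)


def log_likelihood(a, b, data=DATA):
    """Log-likelihood under the assumed Poisson counts per period."""
    total = 0.0
    t_previous = 0.0
    for t_j, y_j in data:
        mean_j = mean_cumulative_failures(t_j, a, b) - mean_cumulative_failures(t_previous, a, b)
        total += y_j * math.log(mean_j) - mean_j - math.lgamma(y_j + 1)
        t_previous = t_j
    return total


print(f"lambda(1000) = {failure_intensity(1000):.2f} failures per system month")
print(f"mu(5000)     = {mean_cumulative_failures(5000):.0f} failures")
print(f"log-likelihood at (a, b) = ({A_HAT}, {B_HAT}): {log_likelihood(A_HAT, B_HAT):.2f}")
```

Maximising `log_likelihood` over `a` and `b` with a numerical optimiser would be the natural next step for fitting the model to new data.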

## Module 1: Types of Uncertainties

Within the R&S process we have to be aware of at least three types of uncertainties:

- Parameter uncertainties (aleatory uncertainties)
- Model uncertainties (epistemic uncertainties)
- Degree of completeness

The problems and unresolved issues in performing an uncertainty assessment increase along this sequence. But some information about uncertainties is better than nothing.

## Module 1: Some Standards

IEC standards listed:

- IEC 300
- IEC 605
- IEC 706
- IEC 50(191)
- IEC 1014
- IEC 1025
- IEC 1070
- IEC 1078
- IEC 1123
- IEC 1160
- IEC 1146
- IEC 1165
- IEC 61508

Titles listed:

- Dependability Management
- Equipment Reliability Testing
- Guide to the Maintainability of Equipment
- Procedure for Failure Mode and Effect Analysis (FMEA)
- Programmes for Reliability Growth
- Fault Tree Analysis (FTA)
- Compliance Test Procedure for Steady State Availability
- Reliability Block Diagrams
- Reliability Testing
- Formal Design Review
- Reliability Growth Models and Estimation Methods
- Application of Markov Methods
- Functional safety of electrical/electronic/programmable electronic safety related systems

Others for reliability issues: CENELEC, IEEE, ISO, MIL, ASME, etc.

## Module 2: Systems Reliability towards Risk Informed Approach

Content:

- Systems Reliability towards Risk Informed Approach
- Anatomy of Risk
- Some Definitions
- Living Models
- Reliability Growth Management
- Risk Monitoring
- How Safe is Safe Enough?

[Figure: two routes from historical experience to future behaviour: trial and error (past) and a plant model maintained in a living process (modern).]

Deterministic in System Reliability:

- Design process: based on pre-defined rules and criteria derived from experience
- Calculation process: based on determined laws and formulas, calculating point values
- Review process: check of the compliance with rules and standards
- Decision-making process: yes / no, go / no-go answers based on rule compliance

Probabilistic in System Reliability:

- Design process: based on pre-defined rules and criteria derived from experience, plus probabilistic goals and targets
- Calculation process: based on determined laws and formulas plus uncertainties and random variables, calculating distribution functions
- Review process: check of the compliance with rules and standards, plus check of the compliance with the goals and targets
- Decision-making process: yes / no, go / no-go answers based on risk insights

In the deterministic approach we use formalisms derived from best practice and fitted with single point values as a first guess. The real world does not follow formalisms based on single point values. Practically all required values show spreads (uncertainties) and/or stochastic behaviour. This is a challenge for modern analysis techniques and numerical solutions (e.g. simulations). I advocate extending the deterministic approach towards probabilistic models in order to consider the stochastic behaviour and the uncertainties.

Probabilistic towards Risk Informed Approach

Pros:

- it is an extension of the deterministic basis
- it is supported quantitatively by historical experience
- it models determined, random and uncertain elements
- it is quantitative and therefore appropriate for sensitivity, importance and optimisation studies
- it integrates design, manufacturing and operational aspects
- it integrates various safety issues and allows rankings
- it shows vagueness and uncertainties explicitly

Cons:

- relatively new, more complex, and not well understood
- larger projects, harder to get financial support
- harder transformation of results into yes-or-no decisions

[Figure: bow tie logic with basic events and fault trees on the side of the causes, the initiating event in the centre, and the event tree with its fault trees and basic events on the side of the consequences.]

[Figure: analysis steps leading to the LHC plant damage states and their frequencies: identification of IEs, event tree analysis, system response analysis, system and reliability analysis, fault tree analysis, human factor analysis, common cause analysis, systems reliability characteristics, database generation.]

## Module 2: Anatomy of Risk

[Figure: classical decomposition of the risk of a plant into type, amount and frequency of consequences; the consequences depend on release parameters and receptor parameters, the frequencies on IE frequencies and conditional probabilities.]

## Module 2: Some Definitions

Reliability insights generated by importance measures (A = 1: component A is failed; A = 0: component A is perfect):

- Fussell-Vesely = [Pr{top} - Pr{top | A = 0}] / Pr{top}. Weighted fraction of cut sets that contain the basic event.
- Birnbaum = Pr{top | A = 1} - Pr{top | A = 0}. Maximum increase in risk, comparing "component A is failed" with "component A is perfect".
- Risk Achievement Worth = Pr{top | A = 1} / Pr{top}. The factor by which the top probability (or risk) would increase if component A is not available (not installed).
- Risk Reduction Worth = Pr{top} / Pr{top | A = 0}. The factor by which the risk would be reduced if component A were made perfect.
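The four importance measures can be computed by re-evaluating the top event probability with the component set to failed (probability 1) or perfect (probability 0). The following sketch uses a hypothetical top event, A OR (B AND C), with independent basic events of probability 0.1 each; the structure and the numbers are assumptions made for illustration.

```python
import math


def top_probability(p):
    """Hypothetical top event: A OR (B AND C), independent basic events."""
    return 1.0 - (1.0 - p["A"]) * (1.0 - p["B"] * p["C"])


def importance_measures(top, p, component):
    """Fussell-Vesely, Birnbaum, RAW and RRW for one basic event."""
    base = top(p)
    failed = top({**p, component: 1.0})   # Pr{top | component failed}
    perfect = top({**p, component: 0.0})  # Pr{top | component perfect}
    return {
        "fussell_vesely": (base - perfect) / base,
        "birnbaum": failed - perfect,
        "risk_achievement_worth": failed / base,
        "risk_reduction_worth": base / perfect if perfect > 0 else math.inf,
    }


probabilities = {"A": 0.1, "B": 0.1, "C": 0.1}
measures_a = importance_measures(top_probability, probabilities, "A")
for name, value in measures_a.items():
    print(f"{name}: {value:.4f}")
```

Any function that returns the top event probability for a dictionary of basic event probabilities can be passed as `top`, so the same routine works for larger models.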

## Module 2: Living Models

In R&S we have to learn permanently from the past; it is an ongoing, never-ending process which we call the Living Process. It is strongly recommended to establish and store all models and data by means of computerised tools. This helps to manage three important issues more efficiently:

- System changes
- Personnel changes
- Increasing state of knowledge

## Module 2: Reliability Growth Management

Basic structure:

- Management
- Testing
- Failure Reporting, Analysis and Corrective Action System (FRACAS)

During test we observe:

- Type A modes (not fixed)
- Type B modes (fixed)

At the beginning of the test operation:

$$\lambda_i = \lambda_A + \lambda_B$$

With the effectiveness factor EF of the fixes:

$$\lambda_{inh} = \lambda_A + (1 - EF)\,\lambda_B$$

(For more details on growth models see MIL-HDBK-189.)

## Module 2: Risk Monitoring

[Figure: risk monitoring scheme in which statistical data, system information and reliability information about the current situation feed the PSA model, which answers "what happens if" questions.]

[Figure: risk profile of a plant, showing a reliability characteristic over operational time with the base line, an abnormal event, relining, and the unknown level without calculation.]

## Module 2: How Safe is Safe Enough?

Typical way of thinking:

- Intolerable: risk unjustifiable
- ALARP region: tolerable only if reduction is impracticable or the cost is grossly disproportionate; tolerable if the cost of reduction exceeds the improvement (benchmark)
- Broadly acceptable: maintain assurance that risk is at this level

List of important qualitative risk characteristics related to the tolerability of risk:

| Qualitative characteristic | Direction of influence |
|---|---|
| Personal control | Increases risk tolerance |
| Institutional control | Depends on confidence |
| Voluntariness | Increases risk tolerance |
| Familiarity | Increases risk tolerance |
| Dread | Decreases risk tolerance |
| Inequitable distribution | Depends on individual utility |
| Artificiality of risk source | Amplifies risk awareness |
| Blame | Increases quest for social and political response |

Risk contours in land use planning (Z = zone) [Okstad; ESREL01]:

| Country | Zone | Criterion | Land use |
|---|---|---|---|
| The Netherlands | 1 | IR < 10⁻⁶ | Housing, schools, hospitals allowed |
| The Netherlands | 2 | 10⁻⁶ < IR < 10⁻⁵ | Offices, stores, restaurants allowed |
| The Netherlands | 3 | IR > 10⁻⁵ | Only by exemption |
| Canada | i | IR < 10⁻⁶ | Every activity allowed |
| Canada | ii | 10⁻⁶ < IR < 10⁻⁵ | Commercial activity |
| Canada | iii | 10⁻⁵ < IR < 10⁻⁴ | Only adjacent activity (new establishments) |
| Canada | iv | IR > 10⁻⁴ | Forbidden area |
| UK | A | PED < 10⁻⁶ | Insignificant risk area |
| UK | B | 10⁻⁶ < PED < 10⁻⁵ | Risk assessment only required |
| UK | C | PED > 10⁻⁵ | High risk area |

[Figure: individual risk (D), in number per million persons and year on an axis from 0.1 to 1000, by type of exposure: cancer, pneumonia, mining, suicide, motorcar traffic, industrial work, chemical industry work, electrical current, lightning.]

## Module 3: The ideal R&S Process for Large Scale Systems

Content:

- The ideal Process
- Anatomy of Risk
- From R&S Goals via the Implementation into the System to the Proof of the Compliance
- Constraints and Problems Implementing an ideal Process

## Module 3: The ideal Process

The ideal R&S process consists (simplified) of four main elements:

- Establishment of the risk policy
- Evaluation and assessment of the risk concerns
- Performing risk control
- Decision making

The process is highly intermeshed, iterative and multi-disciplinary!

Module 3: The ideal Process R&S Training Course CERN, February 2002 To make this ideal process useful for application we need quantitative Safety Risk / Goal witch is tolerable by the society. There is trend to use as orientation for this Goal the so called Minimal Endogen Mortality (MEM Value) which is the individual risk for young people to die per year This MEM value is given in most of the countries at al level of 2 x 10-4 per person year Based on this number some experts advocate for a Global Individual Risk Goal for Hazardous Installations at a level of 10-5 per person year. Module 3: The ideal Process R&S Training Course CERN, February 2002 List of important qualitative Risk Characteristics related to the Tolerability of Risk Qualitative Characteristics

Direction of Influence Increase Risk Tolerance Depends on Confidence Increase Risk Tolerance Increase Risk Tolerance Decrease Risk Tolerance Depends on Individual Utility Amplifies Risk Awareness Increase Quest for Social and Political Response Personal Control Institutional Control Voluntaries Familiarity Dread

Inequitable Distribution Artificiality of Risk Source Blame Module 3: The ideal Process R&S Training Course CERN, February 2002 The ideal process integrates design, construction, and operational parameters from the system, the operator and the environment. The process is plant wide and comprehensive As a consequence we need for at least the analysis of hardware, software, paperware and the operator behavior The analysis of hardware is reasonably established The analysis of operator behavior is reasonably established The analysis of paperware is reasonably established The analysis of software is not well established Module 3:

Anatomy of Risk R&S Training Course CERN, February 2002 Three main Elements (Anatomy) of Risk: what can go wrong ? how frequent is it ? what are the consequences ? Consensus across Technologies these elements describe in a most complete form the real world

as larger the consequences as smaller the frequencies should be Unresolved issue across Technologies how safe is safe enough - tolerability of risk Module 3: From Goals towards Compliance R&S Training Course CERN, February 2002 Global Goal For the LHC Environment Level Plant Level System Level Specific Targets Component Level Top-Down for Targets

Bottom-up for Compliance

Module 3: From Goals towards Compliance

The allocation of local targets derived from a global goal for LSS is analytically not possible; it is a multi-parameter problem. Therefore some simplifications of the problem were developed. One of them is the so-called AGREE Allocation [US MIL HDBK338]. It works primarily for serial systems:

$$\lambda_j = \frac{n_j\,[-\ln R(T)]}{E_j\, t_j\, N}$$

$$R(t_j) = 1 - \frac{1 - [R(T)]^{n_j/N}}{E_j}$$

with

- $R(T)$: system reliability requirement
- $n_j$, $N$: number of modules in unit $j$ and in the system
- $T$: time that the system is required to operate
- $t_j$: time that unit $j$ is required to operate during $T$
- $E_j$: importance factor of unit $j$ (not in the slide's legend; in the AGREE method it is the probability that a failure of unit $j$ leads to system failure)
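The AGREE formulas can be tried out numerically. The following is a minimal sketch, not the handbook's reference implementation: the function name, the three units and all numbers are hypothetical, the logarithm is taken as the natural logarithm, and $E_j$ is treated as the importance factor of unit $j$.

```python
import math


def agree_allocation(system_reliability, units):
    """AGREE allocation for a series system.

    units maps a unit name to (n_j, t_j, E_j): number of modules,
    required operating time and importance factor of unit j.
    Returns {name: (allocated failure rate, allocated reliability R(t_j))}.
    """
    total_modules = sum(n_j for n_j, _, _ in units.values())  # N
    allocation = {}
    for name, (n_j, t_j, e_j) in units.items():
        failure_rate = (
            n_j * (-math.log(system_reliability)) / (e_j * t_j * total_modules)
        )
        reliability = 1.0 - (1.0 - system_reliability ** (n_j / total_modules)) / e_j
        allocation[name] = (failure_rate, reliability)
    return allocation


# Hypothetical example: system requirement R(T) = 0.95, three units.
units = {
    "unit_1": (20, 100.0, 1.0),
    "unit_2": (30, 100.0, 1.0),
    "unit_3": (50, 50.0, 0.8),
}
allocation = agree_allocation(0.95, units)
for name, (failure_rate, reliability) in allocation.items():
    print(f"{name}: lambda = {failure_rate:.3e} per hour, R(t_j) = {reliability:.5f}")
```

For a unit with $E_j = 1$ the two formulas agree exactly, since $\exp(-\lambda_j t_j) = [R(T)]^{n_j/N}$; for $E_j < 1$ the allocated reliability is lower because not every unit failure brings the system down.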

Figure: Bow Tie Logic. Labels: Basic Events, Fault Tree, Causes, Initiating Events, Event Tree, Consequences, Fault Tree, Basic Events; Allocation; Proof / Review.

Module 3: From Goals towards Compliance

For the allocation of local targets, a linear partition over all the considered initiating events (IEs) should be used as a first approximation. The targets allocated to the IEs should also be subdivided linearly over all the system function modules relevant for that IE. This linear allocation could be realised by spreadsheet programming. Commercial programs use a simulation procedure applied to the system topology.

Module 3: From Goals towards Compliance

For the proof of the global target, all the frequencies calculated for similar consequences have to be summed up. Commercial programs realise fault tree linking, based on the identified event trees, to computerise this summation process.
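The linear allocation and the summation for the proof can be sketched in a few lines instead of a spreadsheet. This is only an illustration under stated assumptions: "linear partition" is read here as equal shares, the IE and module names are hypothetical, and the calculated frequencies are invented numbers.

```python
import math


def allocate_linear(global_target, ie_modules):
    """Split a global frequency target equally over the IEs, then split each
    IE target equally over the system function modules relevant for that IE."""
    ie_target = global_target / len(ie_modules)
    allocation = {}
    for ie, modules in ie_modules.items():
        module_target = ie_target / len(modules)
        allocation[ie] = {module: module_target for module in modules}
    return allocation


def proof_of_target(global_target, calculated_frequencies):
    """Sum the frequencies calculated for similar consequences and compare
    the sum with the global target."""
    total = sum(calculated_frequencies)
    return total, total <= global_target


# Hypothetical initiating events and their relevant modules.
ie_modules = {
    "IE1": ["module_a", "module_b"],
    "IE2": ["module_c"],
    "IE3": ["module_a", "module_c", "module_d", "module_e"],
}
targets = allocate_linear(1e-5, ie_modules)  # global target per year (assumed)
total, compliant = proof_of_target(1e-5, [2e-6, 3e-6, 1e-6])
print(targets)
print(f"sum of calculated frequencies = {total:.1e}, compliant: {compliant}")
```

Top-down, the module targets add up to the global target again; bottom-up, compliance is shown when the summed calculated frequencies stay below it.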

Module 3: From Goals towards Compliance

Allocation of MTTR [British Standard 6548] for New Designs

$$MTTR_i = \frac{MTTR_s \sum_{j=1}^{k} n_j \lambda_j}{k\, n_i\, \lambda_i}$$

where $MTTR_i$ is the target mean active corrective maintenance time (or mean time to repair) for item $i$ of a system consisting of $k$ items, $MTTR_s$ is the corresponding value for the system, $n_i$ is the number of items of type $i$ and $\lambda_i$ their failure rate.

The Linear Programming Method proposed by Hunt (92, 93), using different constraints, produces more realistic MTTRs. The method permits better system modelling, different repair scenarios, trade-offs, data updating, etc.

Module 3: From Goals towards Compliance

Allocation of MTTR [British Standard 6548] for New Designs

Example: MTTR based on BS 6548 versus LP (MTTRs 30 min; MTTRmin 5 min; MTTRmax 120 min on average)

| Item | n | λ (10^-3) | n x λ | MTTR, BS 6548 [min] | MTTR, LP [min] |
| --- | --- | --- | --- | --- | --- |
| A | 1 | 0,3430 | 0,3430 | 10,93 | 17,63 |
| B | 1 | 0,2032 | 0,2032 | 18,45 | 29,76 |
| C | 1 | 0,1112 | 0,1112 | 33,72 | 54,38 |
| D | 1 | 0,2956 | 0,2956 | 12,69 | 20,46 |
| E | 1 | 0,0439 | 0,0439 | 85,42 | 123,95 |
| F | 1 | 0,0014 | 0,0014 | 2.678,57 | 120,00 |
| G | 1 | 0,0001 | 0,0001 | 37.500,00 | 120,00 |
| H | 1 | 0,0016 | 0,0016 | 2.343,75 | 120,00 |

Module 3: P&Cs Performing the ideal Process
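As a numerical check of the BS 6548 allocation rule from the MTTR example (items A to H, MTTRs = 30 min), the following sketch recomputes the BS 6548 column. The function name is hypothetical; the failure rates are the table values, and n and λ are read as item count and failure rate, which is an interpretation of the table columns. The LP column is not reproduced, because the constraints of that method are not given here.

```python
def allocate_mttr(mttr_system, items):
    """BS 6548-style allocation: MTTR_i = MTTR_s * sum(n_j * lambda_j) / (k * n_i * lambda_i).

    items is a list of (name, n_i, lambda_i); k is the number of items.
    """
    k = len(items)
    weighted_sum = sum(n * lam for _, n, lam in items)
    return {name: mttr_system * weighted_sum / (k * n * lam) for name, n, lam in items}


# Failure rates from the example table (given there in units of 10^-3).
items = [
    ("A", 1, 0.3430e-3),
    ("B", 1, 0.2032e-3),
    ("C", 1, 0.1112e-3),
    ("D", 1, 0.2956e-3),
    ("E", 1, 0.0439e-3),
    ("F", 1, 0.0014e-3),
    ("G", 1, 0.0001e-3),
    ("H", 1, 0.0016e-3),
]
mttr_targets = allocate_mttr(30.0, items)  # system MTTR of 30 min
for name, value in mttr_targets.items():
    print(f"{name}: {value:.2f} min")

# The failure-rate-weighted mean of the allocated values gives back MTTR_s.
weighted_mean = sum(n * lam * mttr_targets[name] for name, n, lam in items) / sum(
    n * lam for _, n, lam in items
)
```

The very large values for the rarely failing items F, G and H show why the slide contrasts this rule with the LP method, which caps the allocation at MTTRmax.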

Figure: The Place of Various Modelling Techniques for LSSs. Techniques shown against the increasing complexity of LSS: Fault Trees, Hybrid Hierarchical, Simulation, Digraphs, Dynamic Fault Trees, Markov Models. Trade-off for selecting methods: simplicity versus flexibility.

Module 4: Some Applications of R&S on LHC

Content:

- Where We Are
- Similarities and Differences in R&S
- Master Logic
- Anatomy of Risk
- Decomposition and Aggregation of the System
- Cause - Consequence Diagram

Module 4: Where We Are

ACCELERATOR SYSTEMS RELIABILITY ISSUES

Burgazzi Luciano, ENEA, Bologna, Via Martiri di Monte Sole, 4, 40129 Bologna. Tel. 051 6098556, Fax 051 6098279, Email: burgazzi@bologna.enea.it

ABSTRACT

In the last years it has been recognized the need for investigation into the reliability of accelerator systems. This requirement results from new applications of accelerators (e.g. High Power Proton Accelerators for Accelerator Production of Tritium and Accelerator Transmutation of Wastes, International Fusion Materials Irradiation Facility) demanding high availability and reliability. At present, although a significant history of accelerator operation has been accumulated over the past 50 or so years, there is a deficiency in reliability estimates of accelerator systems due to the fact that the reliability is not a major topic as far as most existing accelerators for scientific experiments, in the field of high energy physics, are concerned. At the moment, despite the fact that standard reliability tools are suitable for accelerator reliability models, no formal reliability database for major accelerator components (such as Ion source, RF systems, etc.) is available, being evident that the only available data (in terms of mean time between failures and mean time between repairs) may be inferred by the analysis of existing facilities operational experience information, leading consequently to a large uncertainty in the results (i.e. high EF, if lognormal distributions are assumed). Therefore an activity aimed at continued data collection, continued statistical inference analysis and development of modeling approaches for accelerators is envisaged in the next future. The present paper intends to highlight the main issues concerning the reliability assessment of accelerator machines, focusing on the state of the art in this area and suggesting future directions for addressing the issues. In particular the topic is discussed referring mainly to the Accelerator-Driven Reactor System concept, on which the effort of several research organizations is focused, aiming at its development.

Continued research and methodology development are necessary to achieve the future accelerator system design with characteristics satisfying the desired requirements, in terms of availability and safety.

REFERENCES

[1] F. E. Dunn, D. C. Wade, Estimation of thermal fatigue due to beam interruptions for an ALMR-type ATW, OECD-NEA Workshop on Utilization and Reliability of High Power Proton Accelerators, Aix-en-Provence, France, Nov. 22-24, 1999

[2] L. C. Cadwallader, T. Pinna, Progress Towards a Component Failure Rate Data Bank for Magnetic Fusion Safety, International Topical Meeting on Probabilistic Safety Assessment PSA 99, Washington DC (USA), August 22-26, 1999

[3] C. Piaszczyck, M. Rennich, Reliability Survey of Accelerator Facilities, Maintenance and Reliability Conference Proceedings, Knoxville (USA), May 12-14, 1998

[4] C. Piaszczyck, Operational Experience at Existing Accelerator Facilities, NEA Workshop on Utilization and Reliability of High Power Accelerator, Mito (Japan), October 1998

[5] M. Martone, IFMIF Conceptual Design Activity Final Report, Report ENEA RTERG-FUS-96-11 (1996)

[6] C. Piaszczyck, M. Rennich, Reliability Analysis of IFMIF, 2nd International Topical Meeting on Nuclear Applications of Accelerator Technology ACCAPP 98, Gatlinburg (USA), September 20-23, 1998

[7] L. Burgazzi, Safety Assessment of the IFMIF Facility, doc. ENEA-CT-SBA-00006 (1999)

[8] C. Piaszczyck, M. Eriksson, Reliability Assessment of the LANSCE Accelerator System, 2nd International Topical Meeting on Nuclear Applications of Accelerator Technology ACCAPP 98, Gatlinburg (USA), September 20-23, 1998

[9] L. Burgazzi, Uncertainty and Sensitivity Analysis on Probabilistic Safety Assessment of an Experimental Facility, 5th International Conference on Probabilistic Safety Assessment and Management, Osaka (Japan), Nov. 27-Dec. 1, 2000

Module 4: Where We Are

Component data [from Burgazzi, ESREL2001]

| Component | Value |
| --- | --- |
| Ion Source rf Antenna | 6,0 E-3 |
| Ion Source Extractor | 1,0 E-5 |
| Ion Source Turbomech Vac Pump | 5,0 E-5 |
| LEBT Focussing Magnet | 2,0 E-6 |
| LEBT Steering Magnet | 2,0 E-6 |
| DTL Quadrupole Magnet | 1,0 E-6 |
| DTL Support Structure | 2,0 E-7 |
| DTL Drive Loop | 5,0 E-5 |
| DTL Cavity Structure | 2,0 E-7 |
| High Power rf Tetrode | 1,0 E-4 |
| Circulator | 1,0 E-6 |
| Rf Transport | 1,0 E-6 |
| Directional Coupler | 1,0 E-6 |
| Reflectometer | 1,0 E-6 |
| Resonance Control | 1,0 E-5 |
| Solid State Driver Amplifier | 2,0 E-5 |

Module 4: Where We Are

Results of Reliability Studies at LANSCE Accelerator [from Burgazzi, ESREL2001]

| Main System | Subsystem | MDT [h:min] | MTBF [h:min] |
| --- | --- | --- | --- |
| 805 RF | Klystron Assembly | 0:44 | 11560 |
| | High Voltage System | 0:18 | 960 |
| | DC Magnet | 0:53 | 232280 |
| | Magnet | 0:50 | 8445 |
| | Harmonic Buncher | 0:09 | 44 |
| | Chopper Magnet | 0:08 | 291 |
| | Deflector Magnet | 0:10 | 684 |
| | Kicker Magnet | 1:58 | 557 |
| Water System | Water Pump | 0:29 | 29506 |
| Vacuum System | Ion Pump | 0:29 | 25308 |

Further main-system labels on the slide, whose row assignment cannot be recovered: Magnet Focusing, Supplies, Pulse Power.
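The MDT and MTBF figures of the LANSCE table can be turned into a steady-state availability, in line with the definition of availability as the fraction of time an item is in operating condition. This is a rough sketch under loudly stated assumptions: the MTBF values are taken as whole hours, MTBF is treated as mean up time, so that A = MTBF / (MTBF + MDT), and the helper names are hypothetical.

```python
def hmin_to_hours(text):
    """Convert an 'h:min' string such as '0:44' into hours."""
    hours, minutes = text.split(":")
    return int(hours) + int(minutes) / 60.0


def availability(mtbf_hours, mdt_hours):
    """Steady-state availability, assuming MTBF is the mean up time."""
    return mtbf_hours / (mtbf_hours + mdt_hours)


# (MDT as 'h:min', MTBF assumed to be in hours), selected rows of the table.
rows = {
    "Klystron Assembly": ("0:44", 11560),
    "High Voltage System": ("0:18", 960),
    "Water Pump": ("0:29", 29506),
    "Ion Pump": ("0:29", 25308),
}
availabilities = {
    name: availability(mtbf, hmin_to_hours(mdt)) for name, (mdt, mtbf) in rows.items()
}
for name, value in availabilities.items():
    print(f"{name}: A = {value:.6f}, Q = 1 - A = {1.0 - value:.2e}")
```

With down times of less than an hour and times between failures of hundreds to thousands of hours, the unavailability of each of these subsystems comes out in the range of 10^-5 to 10^-4.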

Module 4: Similarities and Differences in R&S

It makes a difference whether we analyse the LHC for Reliability or for Safety, but many elements and parts of the analysis are common.

- For R we look mainly for failures in operational systems.
- For S we look, after the occurrence of an initiating event, for failures in stand-by (safety) systems.

In other words:

- R: what is the probability of loss of function of the LHC?
- S: what is the probability of a given damage (consequence) at the LHC?

Module 4: Similarities and Differences in R&S

R for Reliability of the LHC is mainly involved in the system's Operational Effectiveness.

Figure: Operational Effectiveness. Labels: Function, Performance, Functionability, Attributes, Technical Effectiveness, Reliability, Maintainability, Supportability, Availability, Operation, Maintenance, Logistics, Operational Effectiveness.

Module 4: Similarities and Differences in R&S

Figure: Basic Events, Fault Tree, Initiating Events, Event Tree. Causes: Domain of R. Consequences: Domain of S.

Module 4: Master Logic

Analysing R, we first have to look at which system functions are needed for the function of the entire LHC. The opposite of the function R is the malfunction Q of the LHC (unavailability Q = 1 - R). To answer the question "which system functions are needed?", the so-called Master Logic is an appropriate tool and a way of thinking. The next slide shows a simplified example; for training we should expand it using an Excel sheet.

Module 4: Master Logic

Master Logic, simplified principle. Tree: LHC Function with the collapsed branches (+) Particle, Magnetic Field and Vacuum and the expanded branch (-) Cooling, which shows Medium, Pressure and Flow; a further node is Pump.

IT-type graphics (trees) are very convenient for designing and showing a large Master Logic.

Module 4: Anatomy of Risk

Analysing S, we first have to look at which types of risk we have to evaluate. This depends strongly on the so-called hazard potential. To answer the question "which types of risk do we have to evaluate?", the so-called Anatomy of Risk is an appropriate tool and a way of thinking. The next slide shows a simplified example; for training we should expand it using an Excel sheet.

Module 4: Anatomy of Risk

Figure: Risk of the LHC Plant: Type, Amount and Frequency of Consequences. Consequences: Release Parameter, Receptor Parameter. Frequencies: IE Frequencies, Conditional Probabilities. Classical Decomposition.

Module 4: Decomposition and Aggregation of the System

Figure: Decomposition (down to component level) and Aggregation (up to system function level) as different ways of thinking. Decomposition: No Flow, Loss of Medium, Pump does not run, Power Supply failure, Mechanical failure. Aggregation: Cooling successful, Medium available, Flow functioning, Pressure functioning, Temperature functioning.

Module 4: Cause Consequence Diagram

Figure: Bow Tie Logic. Labels: Basic Events, Fault Tree, Causes, Initiating Events, Event Tree, Consequences, Fault Tree, Basic Events. Bow Tie Logic is appropriate for S.

Module 4: Identification of IEs (Bast Seminar, 23.10.2002, Bergisch Gladbach)

Task: Identification of Initiating Events (IEs) which can lead, at the end of an event sequence, to a plant damage state.

Method: Master Logic Diagram, Operational Experience

Analysis Logic: IE1, IE2 → Master Logic Diagram → Plant Damage State (PDS)

Module 4: Event Sequence Analysis

Task: Identification of event scenarios and the related technical/physical parameters which can lead, at the end of the event sequences, to a plant damage state.

Method: Event Tree, System Response Analysis, Operational Experience

Analysis Logic: IEi → Event Tree Diagram → PDS1, PDS2

Module 4: PDS Frequencies

Task: Evaluation of the plant damage state frequencies at the end of all the different event sequences.

Method: Event Sequence Analysis, Fault Tree Analysis, Operational Experience and Data Generation

Formalism: $f(PDS) = f(IE) \cdot p(IE \to PDS)$

Analysis Logic: IEi → Event Tree → PDS1, PDS2; Fault Tree with basic events BE1, BE2

Module 4: Source Term Analysis

Task: Evaluation of type, amount and frequency of possible releases of harmful material and classification into release categories (STGs).

Method: Event Sequence Analysis, Fault Tree Analysis, System Response Analysis, Operational Experience and Data Generation

Formalism: $f(STG) = f(PDS) \cdot p(PDS \to STG)$

Analysis Logic: IEi → PDS1, PDS2 → STG1, STG2

Module 4: Consequence Model

Task: Evaluation of type, amount and frequency of the various possible consequences around a plant.

Method: Event Sequence Analysis, Fault Tree Analysis, Source Term Analysis, Dispersion Modelling, Operational Experience and Data Generation

Formalism: $f(C) = f(STG) \cdot p(STG \to C)$

Analysis Logic: IEi → STG1, STGi → C1, C2

Module 4: Risk Model

Task: Evaluation of the risk parameters for the persons involved.

Method: Dose Response Modelling, Population Modelling, Data Analysis

Formalism: $R(C) = f(STG) \cdot C(STG)$

- $R(C)$: vector of the risk parameters per year
- $f(STG)$: vector of the frequency of a source term
- $C(STG)$: matrix of consequence parameters under the condition of a release category

Analysis Logic: IEi → STG1, STG2 → C1, C2 → Rj

Module 5: Lessons Learned from Various Technologies

Content:

- Success Stories and Pitfalls
- Constraints in Data and Methods
- Limitations per se

Technologies such as Aviation, Space, Process, Nuclear, Offshore, Transport

Module 5: Success Stories and Pitfalls

There is a consensus across technologies that we should know the main elements (anatomy) of risk: what can go wrong? how frequent is it? what are the consequences? And we should consider: the larger the consequences, the smaller the frequencies should be. These elements describe the real world in a most complete form.
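The three questions of the anatomy of risk map onto the Module 4 formalism, where frequencies are passed along the chain IE, PDS, STG and finally weighted with consequences: f(PDS) = f(IE) · p(IE → PDS), f(STG) = f(PDS) · p(PDS → STG), R(C) = f(STG) · C(STG). A minimal sketch with purely hypothetical numbers, state names and consequence parameters (none of them from the course) could look like this:

```python
def propagate(frequencies, conditional):
    """Pass frequencies one step along the chain.

    frequencies: {state: frequency per year}
    conditional: {state: {next_state: conditional probability}}
    Contributions that lead to the same next state are summed up.
    """
    result = {}
    for state, frequency in frequencies.items():
        for next_state, probability in conditional[state].items():
            result[next_state] = result.get(next_state, 0.0) + frequency * probability
    return result


def risk_vector(f_stg, consequences):
    """R(C) = f(STG) . C(STG): frequency-weighted sum of consequence parameters."""
    risk = {}
    for stg, frequency in f_stg.items():
        for parameter, value in consequences[stg].items():
            risk[parameter] = risk.get(parameter, 0.0) + frequency * value
    return risk


# Hypothetical input data (frequencies per year, conditional probabilities).
f_ie = {"IE1": 1e-2, "IE2": 1e-3}
p_ie_pds = {"IE1": {"PDS1": 1e-3, "PDS2": 1e-4}, "IE2": {"PDS1": 1e-2, "PDS2": 1e-3}}
p_pds_stg = {"PDS1": {"STG1": 0.9, "STG2": 0.1}, "PDS2": {"STG1": 0.5, "STG2": 0.5}}
c_stg = {"STG1": {"dose": 0.01, "cost": 1.0}, "STG2": {"dose": 1.0, "cost": 50.0}}

f_pds = propagate(f_ie, p_ie_pds)    # f(PDS) = f(IE) * p(IE -> PDS)
f_stg = propagate(f_pds, p_pds_stg)  # f(STG) = f(PDS) * p(PDS -> STG)
risk = risk_vector(f_stg, c_stg)     # R(C) = f(STG) * C(STG)
print(f_pds, f_stg, risk)
```

In a real study the conditional probabilities would come from event tree and fault tree quantification rather than from assumed values.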

There remains the unresolved issue across technologies: how safe is safe enough? (the tolerability of risk)

Module 5: Constraints in Data and Methods

- To model the Real World, we have to transform the historical experience, via methods and data, into a prognosis for the future.
- The database is often sparse and limited.
- We have to start with generic data, statistically improved by Bayesian techniques as more and more plant-specific data become available.
- Methods should be tested by benchmarks between independent expert teams.
- Formal Expert Judgement procedures should be used if the evidence from the past related to the methods and the data is very limited.

Remember: the longer you search in potential databases, the more reliable data you will identify.

Module 5: Limitations per se
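The Bayesian improvement of generic data mentioned under Constraints in Data and Methods can be sketched with a conjugate gamma-Poisson update. This is one common choice for failure rates, not necessarily the procedure meant in the course; the prior parameters and the plant-specific evidence are hypothetical numbers.

```python
def update_failure_rate(prior_alpha, prior_beta, failures, exposure_hours):
    """Gamma-Poisson update of a failure rate.

    Prior: Gamma(alpha, beta) with mean alpha / beta (from generic data).
    Evidence: number of failures observed in exposure_hours of operation.
    Posterior: Gamma(alpha + failures, beta + exposure_hours).
    """
    return prior_alpha + failures, prior_beta + exposure_hours


def gamma_mean(alpha, beta):
    """Mean of a gamma distribution with shape alpha and rate beta."""
    return alpha / beta


# Hypothetical generic prior with mean 1e-5 per hour.
prior = (0.5, 5.0e4)
# Hypothetical plant-specific evidence: 2 failures in 100 000 operating hours.
posterior = update_failure_rate(*prior, failures=2, exposure_hours=1.0e5)

prior_mean = gamma_mean(*prior)
posterior_mean = gamma_mean(*posterior)
print(f"prior mean = {prior_mean:.2e} per hour")
print(f"posterior mean = {posterior_mean:.2e} per hour")
```

The posterior mean lies between the generic value and the purely plant-specific estimate (failures divided by exposure time) and moves towards the latter as more operating experience accumulates.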

Within the R&S process we have to be aware of at least three types of uncertainties:

- Parameter uncertainties (aleatory uncertainties)
- Model uncertainties (epistemic uncertainties)
- Degree of completeness

Problems and unresolved issues in performing an uncertainty assessment increase with this sequence. But some information about uncertainties is better than nothing. Remember: in the Deterministic Approach we generate point values only.

Module 5: Situation in different Technologies

- Process Industry: large differences; from yes or no to risk-based
- Offshore Industry: small differences; primarily risk-based
- Marine Structures: small differences; primarily risk-based
- Aviation: small differences; primarily risk-based
- Civil Engineering: differences; for specific structures risk-based
- Nuclear Industry: differences; tendency towards risk-based
- Transport: differences; tendency towards risk-based
- Motor Car Industry: differences; tendency towards risk-based
- Space Industry: strong tendency towards risk-based

Module 5: Examples from different Technologies

Why Events Occur (in 352 LERs, NPP; USA)

| Cause | Share [%] |
| --- | --- |
| Human Variability | 50 |
| Work Place Ergonomics | 25 |
| Procedures not Followed | 28 |
| Training | 10 |
| Task Complexity | 5 |
| Procedures | 7 |
| Communication | 5 |
| Changed Organisation | 8 |
| Work Organisation | 28 |
| Work Schedule | 10 |
| Work Environment | 8 |

Module 5: Examples from different Technologies

System Model versus Historical Experience (INEL; USA)

Figure: Outage Frequency per Year for different Grid Systems (INEL, SS#2, SS#3, SS#4, SS#5, SS#11, SS#10); vertical axis from 0 to 3,5.

Module 5: Examples from different Technologies

Information on the split between different causes of failures, and on their identification, is useful.

Figure: Causes of Abnormal Occurrences (243 LERs, FRG, 1991). Categories: During Test/Maint, Spontaneous, Unknown, Operating Conditions, Manufacturing, Design/Construction, Op/Test/Rep/Maint, Component/Part Defect; horizontal axis: Percentage, 0 to 70.

Module 5: Examples from different Technologies

Hazard Rate from Test Runs [Campean, ESREL01]

$$h_j = \frac{\text{number of failures in current mileage band}}{\text{mileage accumulated by all vehicles in current mileage band}}$$

Figure: Hazard and Cumulative Hazard Rate Plot for Automotive Engine Sealing; vertical axis: Rate [x 10^-4], 0 to 35; horizontal axis: Mileage [10 km] in the bands 0 to 5, 5 to 10, 10 to 15, 15 to 20, 20 to 25, 25 to 30, 30 to 40, 40 to 50, 50 to 60.

Module 5: Examples from different Technologies

The Volume and Importance of Maintenance in the Life Cycle of a System, e.g. Boeing 747; N747PA [Knezevic: Systems Maintainability, ISBN 0 412 80270 8; 1997]

- Been airborne 80.000 hours
- Flown 60.000.000 km
- Carried 4.000.000 passengers
- Made 40.000 take-offs and landings
- Consumed 1.220.000.000 litres of fuel
- Gone through 2.100 tyres
- Used 350 brake systems
- Been fitted with 125 engines
- Had the passenger compartment replaced 4 times
- Had structural inspections: 9.800 X-ray frames of films
- Had the metal skin replaced 5 times
- Total maintenance tasks during 22 years: 806.000 man-hours

Module 5: Examples from different Technologies

The Volume and Importance of Maintenance in the Life Cycle of a System, e.g. Civil Aviation [Knezevic: Systems Maintainability, ISBN 0 412 80270 8; 1997]

- Between 1981 and 1985, 19 maintenance-related failures claimed 923 lives
- Between 1986 and 1990, 27 maintenance-related failures claimed 190 lives

Module 5: Examples from different Technologies

Example Civil Aviation [Knezevic: Systems Maintainability, ISBN 0 412 80270 8; 1997]

Safety demands expressed through the achieved hazard rates (1982-1991) for propulsion systems required by CAAM:

| Hazard | Hazard Rate |
| --- | --- |
| High energy non-containment | 3,6 x 10^-8 per engine hour |
| Uncontrolled fire | 0,3 x 10^-8 per engine hour |
| Engine separation | 0,2 x 10^-8 per engine hour |
| Major loss of thrust control | 5,6 x 10^-8 per engine hour |

Module 5: Examples from different Technologies

If we have good (hard) statistical data, then we should use them; for traffic accidents, for example, good statistics normally exist. Thus, for RIDM we should use this database [bast Heft M95; Risikoanalyse des GGT für den Zeitraum 87-91 für den Straßengüternahverkehr (GVK) und für den Benzintransport, D].

| Quantity | Unit | Value |
| --- | --- | --- |
| Accidents (GVK) | Number | 89 |
| Driving Performance (GVK) | mio. vehicle-km | 416,2 |
| Accident Rate (GVK) | Accidents / mio. vehicle-km | 0,214 |
| Accident Rate (GVK) | Accidents / vehicle-km | 214 x 10^-9 |
| Accident Rate, 0 - 100 l | Accidents / vehicle-km | 72,76 x 10^-9 |
| Accident Rate, 110 - 10.000 l | Accidents / vehicle-km | 109,14 x 10^-9 |
| Accident Rate, > 10.000 l | Accidents / vehicle-km | 32,10 x 10^-9 |

The last three rows refer to Gasoline Transport; together they add up to the overall rate of 214 x 10^-9, which is 89 accidents divided by 416,2 mio. vehicle-km.

Module 5: Examples from different Technologies

If we have good (hard) statistical data in handbooks, then we should use them (see also [Birolini; Springer 1997, ISBN 3-540-63310-3]):

- MIL-HDBK-217F, USA
- CNET RDF93, F
- SN 29500, DIN 40039 (Siemens, D)
- IEC 1709, International
- EUREDA Handbook, JRC Ispra, I
- Bellcore TR-332, International
- RAC, NONOP, NPRD; USA
- NTT Nippon Telephone, Tokyo, JP
- T-Book (NPP Sweden)
- OREDA Data Book (Offshore Industry)
- ZEDB (NPP Germany)

Module 5: Examples from different Technologies

Figure: Societal Risk of reference tunnel; RT, RT no ref. doors, RT ref. doors 50 m [from D. de Weger, et al., ESREL2002, Turin]

Module 5: Examples from different Technologies

There is a consensus across technologies that we should know the main elements (anatomy) of Reliability: what can go wrong? how frequent is it? what are the consequences? And we should consider: the larger the consequences (e.g. costs), the smaller the frequencies should be. These elements describe the real world in a most complete form. There remains the unresolved issue across technologies: how reliable is reliable enough? What is the most beneficial plant over time?

Module 5: Examples from different Technologies

- R&S has a long and successful history in industrial application.
- The Deterministic Approach is a good basis for Safety Cases.
- Nowadays we need an extension towards the Probabilistic Approach to model the Real World in a more realistic manner.
- A Risk Informed Decision Making Process (RIDM) should take place for all safety concerns in society.
- Mature methods, tools and experienced experts, who have worked in this field for years, are available and willing to help bring this RIDM process into practice.
- The RIDM process can be used for all types of facilities.

Module 5: Examples from different Technologies

Examples from different technologies are published in the following periodicals:

- Reliability Engineering & System Safety (RESS), Elsevier; http://www.elsevier.com/locate/ress
- IEEE Transactions on Reliability, published by IEEE Reliability Society, ISSN 0018-9528
- Qualität und Zuverlässigkeit; published by DGQ, Germany, Carl Hanser Verlag; http://hanser.de

Module 5: Examples from different Technologies

At the following conference series you will find plenty of R&S information:

- ESREL Annual Conference Series
- PSAM Conference Series (every two years)
- RAMS Annual Conference Series
- SRA Annual Conference Series
- ICOSSAR Conference Series (every 4 years)
- OMEA Conference Series
- NASA & ESA Conferences on Risk and Reliability
- Plus specific Human Factor and Software Reliability Conferences, e.g. IFAC and ENCRESS

Some Key Words

| English | German |
| --- | --- |
| Availability | Verfügbarkeit |
| Case | Fall |
| Cause | Ursache |
| Consequence | Auswirkung |
| Event | Ereignis |
| Event Tree | Ereignisbaum |
| Example | Beispiel |
| Failure Mode | Fehlerart |
| Failure Rate | Ausfallrate |
| Fault Tree | Fehlerbaum |
| FMEA | Fehler-Möglichkeits- und Auswirkungsanalyse |
| Initiating Event | Auslösendes Ereignis |
| Maintainability | Instandhaltbarkeit |
| Maintenance | Instandhaltung |
| Minimal Cut Set | Minimale Schnittmenge |
| Probability | Wahrscheinlichkeit |
| Reliability | Zuverlässigkeit |
| Result | Ergebnis |
| Risk | Risiko |
| Safety | Sicherheit |
| Solution | Lösung |
| Time | Zeit |

That's All

Thank you very much for your attention and for your patience in following all the issues I presented.
