MIDEX/MAP Project Reliability Program Overview and ...


NASA
MAP Project Reliability Program Overview
Michael Bay, MAP System Engineer
Jackson and Tull Aerospace Engineering Division
How FMEA, FTA, RBD and PRAs were used on MAP, and how they fit into overall mission success in the Faster Better Cheaper environment.
9 August 2000 w/ updates September 2000

Current Environment
- AO Requirements and Project Constraints: define technical, cost and schedule boundaries
- Faster Better Cheaper: encourages efficiency and increased productivity
- New Technology Infusion: microelectronics and new materials enable smaller, lighter, lower power and more capable missions
- Project Implementation: how to build the appropriate end product, and how to execute the program within constraints and achieve Mission Success?

How to Do It?
Understand: What makes a Mission Successful?
- Proper Execution of the Basics in Engineering and Project Management
- Attention to Detail
- Appropriate Discipline and Rigor
- Test, Test, and Retest. Retest after changes.
What makes a Mission Unsuccessful?
- Recent failures have not been due to the unexpected behavior of a breakthrough new technology
- Recent failures have been due to missing an important detail at more than one level of assembly or test
- Most of the recent failures would not have been caught by classical FMEAs, FTAs or PRAs alone. However, it is the mind set, the systems thinking, and all the questions asked in the process that would surface the issues.
The Devil is in the Details
- Expect the Unexpected
- Do not become complacent
- Beware of inappropriate application of Heritage and assumptions that do not fit

How to do More with Less
- Do not forget the basics of what goes into a Successful Mission
- Understand the Risks in both the Flight/Ground Segments and Project Execution
Important Distinction for Risk Management
- Flight/Ground Segments are the end products performing their desired function in their operational environment
  - FMEA, FTA, RBD and PRAs are good tools here
- Project Execution is the ability to deliver the desired product meeting requirements, on time and within cost
  - Classical FMEA, FTA and RBD do not apply here, although the techniques could be applied
  - PRAs are appropriate here

Orchestrating a Balance
Engineering Efficiency = (Performance * Reliability) / (Cost * Time) = Better / (Cheaper * Faster)

Risk Management
- Risk is the Uncertainty of Performance, Reliability, Cost or Schedule
- To quantify risk, look at the likelihood and consequence of an event
Risk Management
- What Can Go Wrong?
- How Will We Know Something Has Gone Wrong?
- When Will We Know that Something Has Gone Wrong?
- What Will We Do About It?
Expect the Unexpected so that The Unexpected Becomes Expected
- These questions, asked globally every day from Design through Manufacturing, Test and Operations, will do the most to assure Mission Success
- Recent Red Team Activities ask these types of questions
  - Try to quantify Risk
  - Attempt to identify if anything was missed in the basics

MAP Observatory
Microwave Anisotropy Probe
- Map the Cosmic Microwave Background Radiation
- Follow on to COBE with 50 times the resolution
- Medium Size Explorer, MIDEX
- Operate at L2, store and forward data every day
- 3 Axis, scan sky at 1 revolution every 2 minutes
- 840 kg estimate, approx. 3.6 meters tall, 5.1 meters across
- 400 Watts, 72 kg fuel

MAP Reliability Philosophy
- Maximize Science Return for given Cost and Schedule
- AO Direction
  - Due to cost limits, systems predominantly nonredundant or single string
  - Selective redundancy encouraged where resources allow
  - Redundancy up to each mission and the PI
- Reliability Designed in from the Beginning
- MAP Assurance Requirements Cover Total Program:
  - Design with proper parts application
  - Grade 3 Parts program selected as best value for the MAP Program
  - Workmanship Inspection program to NHB5300 or equivalent
  - A Peer Reviewed, Simple and Robust Design providing graceful degradation
  - Failure Modes and Effects Analysis, Fault Tree Analysis, Reliability Predictions and Probabilistic Risk Assessment used to identify mission ending failures; designs adjusted where possible to shift the effect from mission ending to degraded mission
  - Test program accumulating significant mission specific test time
  - Constant drive to identify and strengthen weak links to mission success

Identify Weak Links
[Figure: chain of links leading to Mission Success, showing the forces that threaten the links and the mitigators that strengthen them. M. Bay]
Murphy's Law: Anything that can go wrong, will.
Threats labeled on the chain:
- Moving Surfaces
- Design
- Software and Operations
- Parts Application
- Random Failure
- Understanding of Environment
Mitigators labeled on the chain:
- Workmanship Inspection to NHB5300 or equivalent, Material and Process Control
- Good Parts, Simplicity, Robust Design with Graceful Degradation, Redundancy
- Design Margin, Redundancy
- Integrated Software and Operational Plan
- Parts Stress Analysis to PPL21
- Analysis with Single Common Environmental Spec
- Peer Review and Test, repeated along the chain
Questions:
- Where is the weakest link?
- What will cause a link to break?
- Will the System hold with a broken link?

The Basics, Launch Readiness Flow
[Figure: Flight Readiness Flow, 12/20/99. Flow diagram leading from the Mission Requirements Document Tree to "Mission Ready for Launch". M. Bay]
Box labels recoverable from the diagram:
- Mission Requirements Document Tree: Ground System Requirements; Spacecraft & Instrument Requirements; Performance Assurance Requirements; Science Data Processing Requirements; Launch Vehicle Requirements; System Verification Specification; Operations Concept; Subsystem & Component Requirements; Environmental Verification Matrix; Electrical Systems Requirements; Contract / Procurement; Delta Mission Specification; Constraints Document
- Identification of "What is not tested in Flight Configuration"
- Requirements Verification; Trajectory and Maneuvers Planning and Analysis; Flight Operations Plan; Reliability, Failure Modes and Effects Analysis; Mission Planning, Scheduling, Trajectory and Maneuver Planning Requirements Verification; Flight Dynamics Activities and Interfaces Verified (Real time Attitude, Calibration Tools, Maneuver Modeling); Data Archive System; Science Planning, Scheduling
- Mission Unique Changes and First Flight Items Reviewed; Component Test; Build Test; Surface and Deep Dielectric Charging; System Environmental Test (Vibration, EMI, Thermal Vac); Acceptance Test; Electrical Parts Radiation Analysis; CPT & Functionals Signed Off (ACS End to End Phasing, S/A to S/C Electrical, RF Airlink, Mechanisms & Deployments)
- Waivers and Deviations Closed; Trend Data Compiled and Analyzed; PFRs Closed Out; Red Book Approved; Mission Simulations & CTV; Acceptable Trouble Free and Total Run Time
- Performance Assurance; Observatory Test Program; Power Budget, Solar Array, Load and Battery Balance; Wiring and Fusing Analysis; Component Acceptance Review Closed Out; Tables, TSM, RTS Verification; Work Orders Closed Out; Mission Simulations Complete; As Built vs As Designed Configuration Verification; Flight Software DRs Closed
- Observatory Assembly; Flight Software Test Program; Electrical Parts Stress, Thermal Analysis; Structural Analysis; Acceptance, Delivery, Preship, and Readiness Reviews (PC Board, Box, Subsystem/System); Trend Analysis System; Command, Telemetry, Procedures, Page Display, Limits, Systems; Definition and Testing of Early Orbit Trend and Performance Parameters; Tools to Build and Test Onboard Computer TSM, ATS, RTS, Tables and Patches
- Contingency Procedures Verified; On Orbit & Launch Site Mission Simulations Complete; Test Readiness Reviews; Subsystem Engineering; Constraint, Restriction, Haz Cmds, Limits; Operations Scripts and Procedures Verified; Design Reviews; Thermal Analysis (PC Board, Box, Subsystem/System)
- Peer Reviews: 1. Requirements, 2. Design, 3. Ops Concept
- Special One Time Tests; Materials and Processes List Approved; Observatory Operations Manual; Interfaces & End To End Testing Complete; Database Verified; Interfaces & Data Throughput Testing Complete; Observatory to Launch Vehicle Interface Test; Operations Personnel Certified, Sufficient Time with Spacecraft
- Subsystems: C&DH, RF, ACS, Power, Propulsion, Instrument, Deployment, Mechanical, Electrical
- Systems Analysis; Requirements Verification Reviews; Staffing & Training Complete; Networks and External Organizations; Observatory Mission Simulations; Science Data Processing; Simulated Flight; Mission Readiness Testing; Control Center Acceptance
- Readiness outputs: Operations & Ground System Ready for Launch; Observatory Ready for Launch; Launch Vehicle Ready for Launch; Mission Ready for Launch

Reliability Analysis Flow
[Figure: flow diagram of the reliability analysis process.]
- Main flow blocks: Mission Requirements and Success Criteria; Mission Design and Operations Concept; Design, Manufacture, and Test; Launch and Operations
- Analyses: Fault Tree Analysis (Top Down); FMEA (Bottom Up); Probabilistic Risk Assessment; Reliability Block Diagram (Predictions); Problem Reporting (Risk Rated)
- Feedback labels: Mitigate Risk, Change Design or Operations Concept; Changes to Requirements; Requirements Waivers, Deviations; Update Analysis on Changes; Changes to Design or Operations Concept; Development Activities Risks

Reliability Improvement Approach
- Identify Weak Links
- Estimate failure rates for each subsystem element (component or card)
  - Compute failure rates using MIL-HDBK-217 techniques
  - Collect measured failure rates from flight, life test, or vendor data
  - Average Failure Rates where multiple sources exist
- Evaluate effects/consequence of failure, revisit Failure Modes and Effects Analysis
- Evaluate possible mitigation approaches
  - Total Redundancy
  - Minimal point design hardware to augment existing system
  - Back door paths to allow backup functions
- Compute System Reliability improvement for each mitigation approach
- Study resources necessary to implement each mitigation approach
  - Mass, Power, Cost, Parts Availability, Schedule, ability to descope, Manpower
- Select mitigation approaches to maximize efficiency of total program
  - Total System Reliability Improvement vs Required Resources

Probabilistic Risk Assessment
[Figure: risk matrix of Probability against Criticality.]
Probability:
  1  >10% (<90%)
  2  <10% >1% (>90% <99%)
  3  <1% (>99%)
  4  Failure Unlikely - No known failure, large margins
  5  Failure Not Credible - Specific mitigation action
Criticality - Consequence of Failure:
  5  No Effect
  4  Loss of non critical function (a. not needed, b. backup available)
  3  Degraded Mission (meets minimum science mission)
  2  Loss of Science (does not meet minimum science mission)
  1  Total Loss of Mission
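A minimal sketch of how the probability and criticality scales above could be encoded. The scale texts are taken from the slide; the function names and, in particular, the rule that turns a (criticality, probability) pair into High, Medium or Low are assumptions made for illustration, since the cell shading of the original matrix did not survive. The handling of values exactly at 10% and 1% is also a choice made here.

```python
# Scales transcribed from the PRA slide (1 is the worst case on both axes).
PROBABILITY_SCALE = {
    1: ">10% chance of failure",
    2: "between 1% and 10% chance of failure",
    3: "<1% chance of failure",
    4: "Failure unlikely: no known failure, large margins",
    5: "Failure not credible: specific mitigation action",
}
CRITICALITY_SCALE = {
    1: "Total loss of mission",
    2: "Loss of science (does not meet minimum science mission)",
    3: "Degraded mission (meets minimum science mission)",
    4: "Loss of non-critical function",
    5: "No effect",
}


def probability_level(p_fail: float) -> int:
    """Map a numeric failure probability onto levels 1 to 3.

    Levels 4 and 5 are engineering judgements rather than numeric bands,
    so they are never returned here.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    if p_fail > 0.10:
        return 1
    if p_fail > 0.01:
        return 2
    return 3


def risk_level(criticality: int, probability: int) -> str:
    """Rate one failure mode as High, Medium or Low.

    ASSUMPTION: the thresholds below are hypothetical, not MAP's matrix.
    """
    if criticality not in CRITICALITY_SCALE or probability not in PROBABILITY_SCALE:
        raise ValueError("levels must be integers from 1 to 5")
    score = criticality + probability  # a lower score means a worse case
    if score <= 4:
        return "High"
    if score <= 6:
        return "Medium"
    return "Low"


if __name__ == "__main__":
    print(probability_level(0.05))
    print(risk_level(criticality=1, probability=4))
```

A real project would replace `risk_level` with a lookup table that mirrors its own approved matrix cell by cell.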

Matrix cells are rated by Risk: High, Medium, Low.

Reliability Process Overview
Design
- FMEA to distinguish mission threatening failures from mission degradation
- Revise designs to convert loss of mission failures into loss of function or mission degradation
- Reliability failure rate analysis used to weigh the relative benefit of one design implementation versus another
- Verification of proper parts application in design
- Peer Review Process
Manufacturing
- Workmanship Inspection to verify as built hardware meets designer's intent
- Materials and Process Control
Testing
- Verify as built hardware meets designer's requirements in the intended application
- Sufficient test time to find infant mortality failures
Operations
- Onboard Fault Detection and Correction to safe the spacecraft, giving the ground time to react and potentially recover from an anomaly
- Operational Contingency Procedures and Backup Plans for mission critical and recoverable failures
Reliability Philosophy and MAP Mission Assurance Requirements communicated to MAP Hardware Suppliers (very important to assure a supplier is not a weak link)

Reliability Process
Design and Analysis Phase
1. Perform System Level FMEA and FTA to determine failures that result in mission loss versus mission degradation.
2. Adjust design or implementation such that failures categorized as mission loss are moved to the degraded mission category. The overall goal is to reduce the number of potential mission failures.
3. Reliability failure rate analysis and Reliability Block Diagrams are used to weigh the relative benefit of one design implementation versus another.
4. Where failures result in graceful degradation and require rapid ground intervention or changes in operational plans to save the spacecraft, prepare contingency procedures or software loads to implement them.
5. Critically review the design of the spacecraft power bus. A short on the primary power bus can take out the whole spacecraft. The power bus is designed so that shorts are considered not credible.
6. Peer Review process for both Hardware and Software to identify potential design and/or implementation problems.

Design and Analysis Phase
Failure Modes and Effects Analysis
Reliability and Failure Modes and Effects Analysis have different goals for redundant and single string spacecraft. As a single string spacecraft, MAP strives to minimize the effects of a failure, whereas a redundant spacecraft strives to avoid single point failures.
For a single string mission, a large number of faults can result in mission loss. However, there are also many failures that may result in partial loss of function or in a reduction in performance. These types of failures result in graceful degradation. The analysis looks at interfaces and down to the circuit level.
A redundant spacecraft design focuses primarily on preventing single point failures and focuses less on designing in graceful degradation. The analysis usually stops at interfaces to assure faults do not propagate to the redundant unit.
For MAP, designing in graceful degradation is much more important since there are minimal redundant units available for backup.
The FMEA is synchronized with the Fault Tree at the major component functional level (e.g. Transponder Receiver, ACE Safehold, PSE Load Switching).

Integrated Mission Fault Tree
[Figure: Integrated Mission Fault Tree, 3/28/00. Top event: Loss of Mission.]
Top-level branches:
- Loss of Data / Communications
- Loss of Power
- 1. Inability to Point to Science Target; 2. Large Surface ESD causes Catastrophic Failure; 3. Pointing of Science Axis to Detector Threat
- Spacecraft in Improper Orbit
- Over / Under Temperature
- Loss of Instrument
Intermediate events: Battery Undercharge Low Voltage; Loss of Battery due to Severe Overcharge; Instruments / Spacecraft: Violation of Sun Constraint for TBD minutes, TBD components; Loss of Power: Power Feed or Converters; Loss of Data: Detector to C&DH Data Flow; Maneuver Execution Error; Solar Array Deployment Failure; PSE Control Circuit Failure; Component Failure (Full Passive Redundant); Loss of Attitude Control; Solar Array Failure; Loss of Temperature Control; Thruster Stuck On/Off; Missed, Aborted, Overburn Maneuver; PSE DC/DC Failure; Ground Based Maneuver Planning or Modeling Error; Loss of Power to Mission Critical Components; Mechanical Hangup; Tracking or Navigation Modeling Error; Power Distribution Fault
Events annotated with their redundancy:
- Solar Array Cells & Strings (Full Active Redundant)
- Deployment Electronics (Full Passive Redundant)
- Attitude Control Electronics (Full Passive Redundant)
- Propulsion Control (Full Passive Redundant)
- Power Converters (Single String)
- AEU - DEU - 1773 (Single String)
- PSE Circuit Failure (Graceful Degradation)
- Amps to AEU Cards (Graceful Degradation)
- Thermal Control (Graceful Degradation)
Ground contingency procedures:
- CDH Contingency: RF-DNLK, RF-UPLK, MV Reset, MV Failure, HRSN Reset
- ACS Contingency: ACS-TIPOFF, ACS-ACE, ACS-MANEUVER, ACS-EPH, ACS-KF, ACS-AST, ACS-RWA, ACS-CSS, ACS-IRU
- Power Contingency: PSE-BAT-1, PSE-OM-1, PSE-LVPC-1, PSE-SA-1, PSE-RSN-1
- Deploy: DEPLY-SEP, DEPLY-ACE, DEPLY-SAD
- Propulsion: PROP-TMP-1, PROP-TMP-2, PROP-TMP-3, PROP-THR-1, PROP-PRES-1, PROP-ISO-1
- Instrument Contingency: Instrument Bias, Instrument Science
- Contingency Maneuver Planning: Overall Flow; Maneuver during station contacts; Thruster Failure (planning); Missing all perigee burns; Split perigee maneuvers; LV injection orbit error >3s
Onboard Fault Detection and Correction (FDC):
- PWR-A1 Battery Outside Limits: A1.1 Low Battery Capacity & Pres; A1.2 Low Battery Voltage; A1.3 High Differential Voltage; A1.4 High Battery Temperature
- PWR-A2 Charging System Faults: A2.1 Undercharge - Amp Hr Control; A2.2 Overcharge - PSE Control Loop / Shunt; A2.3 ESN Watchdog; A2.4 System Configuration; A2.5 PSE RT Failure
- PWR-A4 Power Distribution: A4.1 ACE Powered Off; A4.2 Wheels Powered Off; A4.3 Mongoose Powered Off; A4.4 EVD Off During Maneuver; A4.5 Transmitter Powered Off; A4.6 Survival Heater Powered Off
- ACS-A1 ACS Sensor / Actuator Failure: A1.1 Inertial Reference Unit; A1.2 Reaction Wheels; A1.3 DSS; A1.4 Solar Array Deploy Pots; A1.5 AST 1; A1.6 AST 2; A1.7 CSS
- ACS-A2 ACE or Safehold Failure: A2.1 A/D not Ready; A2.2 ACE in Safehold; A2.3 Invalid Data Packet; A2.4 LVPC Config, On/Off; A2.5 RWA Powered Off
- ACS-A3 ACS Controller Problem: A3.1 Sun Acquisition; A3.2 Inertial (includes Slew); A3.3 Observing; A3.4 Delta V (Control & Time); A3.5 Delta H Unload (Control & Time)
- ACS-A4 System Level ACS FDC: A4.1 System Momentum Check (Delta V, Delta H, RWA); A4.2 Sun on Array Check; A4.3 Sun Constraint Check; A4.4 Kalman Filter; A4.5 S/C Rates (Gyro, DSS, AST); A4.6 S/C Position (Gyro, DSS, CSS); A4.7 Ephemeris
- ACS-A6 S/A Deployment Failure: A6.1 Pots do not indicate Deployed
- CDH-A1 Computer Failure: A1.1 1773 Bus Data Not Received; A1.2 Memory Checksum Error; A1.3 Lost Contact with RSN
- CDH-A2 Backup Deployment: A1.1 Wheels, Gyro, Transmitter On; A1.2 S/A Deployment
- Therm-A1 Temperature Limit: A1.1 Battery; A1.2 Propulsion Components; A1.3 Heater On/Off; A1.4 Instrument Overtemp
- Prop-A1 Configuration Problem: A1.1 Heater out of Configuration; A1.2 Pressure out of Limits; A1.3 Latch Valve Closed; A1.4 EVD out of Configuration; A1.5 Thruster Current; A1.6 Catbed Temperatures
Legend:
- White Box - Failure Propagation
- Red Colored Box - Single Point Failures
- Yellow Colored Box - Graceful Failures
- Green Colored Box - Redundancy Failures
- Yellow Outline Box - Ground Contingency Procedure
- Blue Outline Box - Onboard FDC

Design and Analysis Phase
Fault Tree Analysis
- The Fault Tree starts with Loss of Mission as the top block. Key to this top block is understanding what defines a mission loss, i.e. the Mission Success Criteria.
- Knowing the design of the system and how it will be operated, postulate the faults that could result in loss of mission.
- Faults are logically combined and further decomposed until the lowest desired level is reached.
- The lowest level should overlap and be consistent with the FMEA. This is typically the component major function level (e.g. Transponder Receiver, ACE Safehold, PSE Load Switching).
- Requirements for Contingency Procedures and Onboard Autonomous Switching (Fault Detection and Correction) should be included to show where action is required.
- The Fault Tree provides a graphical format for organizing postulated failures, understanding their consequence on the system, and understanding their relationship to other systems and subsystems.
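The logical combination of faults described for the Fault Tree can be sketched in a few lines. This is an illustrative model only, assuming independent basic events, an OR gate where any child failure propagates upward, and an AND gate for a redundant pair where both children must fail. The class names, the small example tree and every probability in it are hypothetical and are not MAP values.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Union


@dataclass
class Event:
    """A basic event at the lowest level of the tree."""
    name: str
    probability: float  # hypothetical probability of failure over the mission


@dataclass
class Gate:
    """A logical combination of lower-level faults."""
    name: str
    kind: str  # "OR": any child fails; "AND": all children must fail
    children: List[Union["Gate", Event]]


def failure_probability(node: Union[Gate, Event]) -> float:
    """Probability of the fault at `node`, assuming independent events."""
    if isinstance(node, Event):
        return node.probability
    child_p = [failure_probability(child) for child in node.children]
    if node.kind == "AND":
        return prod(child_p)
    if node.kind == "OR":
        return 1.0 - prod(1.0 - p for p in child_p)
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical example: a redundant pair (AND) next to a single string item.
redundant_pair = Gate("Redundant electronics fail", "AND", [
    Event("Side A fails", 0.10),
    Event("Side B fails", 0.10),
])
top = Gate("Loss of Mission", "OR", [
    redundant_pair,
    Event("Single string converter fails", 0.02),
])

if __name__ == "__main__":
    print(f"{top.name}: {failure_probability(top):.4f}")
```

In this toy tree the single string item dominates the top event even though each redundant side is individually less reliable, which is the kind of weak link the analysis is meant to expose.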

MAP Reliability Block Diagram
[Figure: MAP Reliability Block Diagram.]

Design and Analysis Phase
Reliability Block Diagram
- There is uncertainty in the absolute number of a total mission reliability prediction. Large error bars.
- Relative comparisons between approaches or implementations are fairly good
  - Indicates the relative improvement of redundancy
  - Comparisons allow selection of more reliable solutions
- Computations based on Schematics and MIL-HDBK-217
- Some historical data available from operations database
- Averaging of computations and historical data possible

Reliability Improvement Study Results
Each option lists: Location; Performance; Future Descope (Mass, Pwr); System Reliability Improvement; Mass; Power; Cost; Schedule; Comments.

1. Flight Software to allow 2 out of 3 Wheels
   Location: Mongoose ACS Software
   Performance: Minimum Science, Reduced Control & Acquisition
   Future Descope: N/A
   System Reliability Improvement: 70%
   Mass: 0; Power: 0; Cost: Med; Schedule: Low
2. 2 More Thrusters, Flight Software to allow Thruster Backup
   Location: Same bracket as Radials
   Performance: Minimum Science, Additional Fuel Usage
   Future Descope: No
   System Reliability Improvement: 30%
   Mass: 1.2; Power: 4 Watts Heater; Cost: Low; Schedule: Low
3. 2nd Transponder & XRSN
   Location: XRSN in MAC, Xpndr beside existing on bottom deck
   Performance: 100%
   Future Descope: Yes
   System Reliability Improvement: 11%
   Mass: 6.5; Power: 6 Watts, 0 Heater; Cost: Med; Schedule: Low
4. 2nd Star Tracker
   Location: Under Top Deck near PSE
   Performance: 100%
   Future Descope: Yes
   System Reliability Improvement: 13%
   Mass: 7; Power: 8-12 Watts; Cost: Med; Schedule: Low
5. "Little MAC" - 2nd Mongoose, LVPC & Minimal ACE RSN, 2nd set of 6 CSS Eyes
   Location: Beside MAC on same panel
   Performance: Minimal Function, Minimum Science
   Future Descope: Yes
   System Reliability Improvement: 14%
   Mass: 6.5; Power: 0 Watts; Cost: Med; Schedule: Low
6. Flight S/W to allow Star Tracker to backup Gyros
   Location: Mongoose ACS Software
   Performance: No Backup for Acquisition, 100% Mission
   Future Descope: N/A
   System Reliability Improvement: 9%
   Mass: 0; Power: 0; Cost: Low; Schedule: Low
   Comments: S/W change can be delayed until failure
Subtotal, reliability improvement of selected options: 249%*; Mass: 21.2; Power: 22 Watts

7. 2nd AEU & PDU
   Location: Bottom Deck, PDU Under Top Deck w/4th Wheel
   Performance: Loss of 1/2 of Instrument
   Future Descope: No
   System Reliability Improvement: 7%
   Mass: 10.3; Power: 8.5 Watts; Cost: Med; Schedule: High
   Comments: 3 mos for windings, No Actels
8. 4th Wheel
   Location: Bottom Deck equally spaced, or on Bottom Deck TBD
   Performance: 100%
   Future Descope: Yes
   System Reliability Improvement: 13% 4th wheel w/ S/W; 83% 4th wheel inc S/W (2 of 4); 57% w/o S/W (3 of 4)
   Mass: 17; Power: 10-15 Watts Heater; Cost: Med; Schedule: Med
   Comments: Large Mechanical System Impact
9. PSE Linear Regulators and Assurance Output Modules Remain On
   Location: 2nd side of existing boards
   Performance: 100%
   Future Descope: No
   System Reliability Improvement: 2%
   Mass: 0.2; Power: 0 Watts; Cost: Med; Schedule: Med
   Comments: Long Lead Parts
10. DEU with 20/20 split
   Location: 20/20 Configuration, DEU Single String
   Performance: Loss of 1/2 of Instrument
   Future Descope: No
   System Reliability Improvement: 11%
   Cost: High; Schedule: High
   Comments: Parts Availability, Actels
Subtotal: 108% or 78%*; Mass: 27.5; Power: 23

Reliability improvement potential of all options: 326%*
* The percentage improvement is not a summation of the individual option improvement percentages.

Representative FMEA/PRA Summary
Each entry lists: Subsystem/Component; Failure Modes/Description; Mitigation/Rationales for Acceptance; Crit.; Prob.

- ACS: Propulsion tank, fuel line, F/D valve
  Failure mode: Fuel leak due to fuel line connection or tank failure
  Mitigation: 1) Inspection for welded joint during manufacturing and proof pressure test, 2) Prop system leak check during I&T, 3) Control fuel leak by closing isolation valve if leak is present downstream of the isolation valve
  Crit.: 1; Prob.: 4 or 5
- PWR: Battery, and PSE internal Power Bus
  Failure mode: Unprotected Power bus short to ground or chassis - loss of mission
  Mitigation: Sound design, inspection, component screening during hardware development
  Crit.: 1; Prob.: 4 or 5
- PWR: PSE LVPC
  Failure mode: Power System converter failure or filter short to ground - loss of PSE functionality and mission
  Mitigation: Low probability failure mode. Precap visual inspection & sufficient ground testing
  Crit.: 1; Prob.: 3
- PWR: Battery Current Sensor Open
  Failure mode: Loss of Battery return reference due to open current sensor - loss of mission
  Mitigation: Battery current sensor open is an unlikely failure mode due to large design margin of shunt
  Crit.: 1; Prob.: 4
- PWR: Bus capacitor
  Failure mode: Bus capacitor short may result in unregulated power bus short to chassis if the short is not cleared
  Mitigation: Low probability failure mode (short will most likely be fused). Sufficient ground testing.
  Crit.: 1; Prob.: 4
- PWR: Output Module #1 Instrument SSPC
  Failure mode: Loss of instrument power due to switch failure - loss of total science
  Mitigation: Low probability failure mode. Precap visual inspection & sufficient ground testing
  Crit.: 2; Prob.: 3
- C&DH: Deploy Mechanism
  Failure mode: Deployment actuator failure results in premature solar array deployment - loss of mission
  Mitigation: Large design margin, environmental test, and inspection program
  Crit.: 1; Prob.: 5
- Instrument: AEU
  Failure mode: Instrument DC/DC converter failure results in loss of power to instrument - loss of total science
  Mitigation: Design heritage of converter and ground test program
  Crit.: 2; Prob.: 2
Legend, Probability:
  1  >10% (<90%)
  2  <10% >1% (>90% <99%)
  3  <1% (>99%)
  4  Failure Unlikely - No known failure, large margins
  5  Failure Not Credible - Specific mitigation action
Legend, Criticality - Consequence of Failure:
  5  No Effect
  4  Loss of non critical function (a. not needed, b. backup available)
  3  Degraded Mission (meets minimum science mission)
  2  Loss of Science (does not meet minimum science mission)
  1  Total Loss of Mission
Legend, Risk: High, Medium, Low

PRA, Graphical Tree Format
[Figure: PRA in graphical tree format, 3/28/00. The same tree as the Integrated Mission Fault Tree (top event Loss of Mission, with the same branches, ground contingency procedures and onboard FDC blocks), with the boxes colored by risk.]
Legend:
- White Box - Failure Propagation
- Red Colored Box - High Risk Failure
- Yellow Colored Box - Medium Risk Failure
- Green Colored Box - Low Risk Failures
- Yellow Outline Box - Ground Contingency Procedure
- Blue Outline Box - Onboard FDC

Design and Analysis Phase
New Technology and Mission Success
- Select Approach to Mitigate New Technology Risks
  - Risks to Technical Performance in End Item Application
  - Risks to Project Execution
- Use Risk Management Techniques to weigh the benefit of new technology versus the consequence of it not being ready or not working
Mitigation Steps
- Test working hardware/software as soon as possible
- Early verification through Engineering Test Units (ETUs)

Define Alternate or Backup Sources Descope Plan - Prepare to scale back to minimum mission requirements Page 24 8/9/00 NASA Reliability Reliability Process (cont.) Manufacturing and Inspection Phase 1. Failures are viewed as mechanical. Whenever an item fails it usually means that something moved, whether internal to a chip, on a circuit card or in harness. If it worked once and then does not, something moved. (EMI is the exception.) 2. Stress relief against vibration, mechanical motion, and thermal expansion. 3. Clearance to protect against shorts. Close inspection as lower level sub assemblies are assembled. 4. The power system electronics are carefully inspected during assembly to screen for potential shorts. Shorts on the power bus are considered not credible following inspection. 5. Eliminate sources and provide barriers to contamination that could cause shorts or degrade the surface properties of instruments or thermal control surfaces 6. Walkdowns and Inspections for critical items dependant on workmanship, RF Shields and grounding for ESD protection are examples. 7. Manufacturing process control and inspection are as important as they are on a redundant spacecraft. Manufacturing process control may even be more important because there is only one chance to get it right. Michael Bay Page 25 8/9/00 NASA Reliability Reliability Process (cont.) Test Phase 1. Accumulate sufficient test time to gain confidence infant mortality failure period has passed. Goal is on the order of 1000 hours total with last 100 failure free. 2. Test and or execute the sequences planned for the mission. Perform steps and send commands in the expected sequence with the expected timing 3. Command sequences are verified prior to first time execution onorbit. If a sequence is performed onorbit for the first time, analysis should exist that indicates the item will work. Items are tested in pieces or in steps instead of relying on analysis alone. 4. 
Critically test flight and ground software against requirements as well as the intended end item function independent of the requirements. 5. Exercise the hardware and software together during environmental test in the modes they are operated during the mission. 6. Specifically seek out What is not Tested in Flight Configuration. Review assumptions made in verification program especially where verification is accomplished in pieces or by simulation. Michael Bay Page 26 8/9/00 NASA Basics, What is not Tested? Reliability Identify items that can not be test in the flight configuration and environment Assure that simulations and assumptions are appropriate Typical areas applicable to most projects

- End-to-end instrument optical or RF check at flight temperature (tested in pieces)
- Loaded propulsion system and thruster firings (component test)
- Solar array deployment in zero G, vacuum, and temperature with gradients (tested in pieces)
- Power system working with illuminated solar arrays (verified against simulator with sufficient margin)
- ACS operating in closed loop with flight hardware and flight software (hardware tested with stimulators open loop, software verified with HDS closed loop, sensor-actuator end-to-end phasing verification)
- Launch, ascent, separation, and acquisition sequence with the correct timing of external environmental events: vibration, thermal, vacuum, solar, etc. (verified in pieces)
- Inability to test radiation (SEU in particular) environment (parts testing based on design & engineering judgement)
- Inability to test surface or internal charging environment (materials usage and testing based on design & engineering judgement)

Reliability Process (cont.)
Operations Phase
1. Utilize a simple subset of the total spacecraft electronics suite to provide an ACS Safehold that allows additional time for the ground to recover from an anomaly
2. Onboard failure detection to minimize the impact of mission-threatening anomalies
3. Spacecraft informs ground of serious off-nominal situations
4. Contingency procedures prepared for critical subsystems and mission events
5. Ground system capable of identifying adverse trends and/or off-nominal performance
6. Training and exercising of the flight and ground systems during prelaunch mission simulations
7. Separation/Deployment and Propulsion Maneuvers performed within ground contact

Summary
- Overall Reliability process addresses total program lifecycle including Design, Manufacturing, Test and Operations phases
- Reliability built in from the beginning
- FMEA, FTA, RBD, and PRA used as tools in an overall Reliability Assurance Program to optimize the architecture and design
- The PRA is maintained and updated with test results and other changes throughout the project life cycle
- Failure mitigators address Moving Parts, Parts Application, Environments, Software/Operations, Workmanship, and Random failure causes
- As part of the total reliability program, MAP has implemented designs that provide graceful degradation and backups in selective areas
- MAP has achieved a balance of Performance, Reliability, Cost and Schedule within the available resources

Acronym List
ACS    Attitude Control System
ACE    Attitude Control Electronics
AEU    Analog Electronics Unit (Part of Instrument Electronics)
CSS    Coarse Sun Sensors
DEU    Digital Electronics Unit (Part of Instrument Electronics)
FBC    Faster, Better, Cheaper
FMEA   Failure Modes and Effects Analysis
FTA    Fault Tree Analysis
LVPC   Low Voltage Power Converter
MAC    MIDEX Attitude and C&DH
MAP    Microwave Anisotropy Probe
PDU    Power Distribution Unit (Part of Instrument Electronics)
PRA    Probabilistic Risk Assessment
PSE    Power System Electronics
RBD    Reliability Block Diagram
XRSN   Transponder Remote Services Node
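
Illustrative sketch (not part of the original presentation): the Test Phase goal of roughly 1000 hours of accumulated test time with the last 100 hours failure free can be written as a simple bookkeeping check. The function name, the data layout (failure times logged as accumulated test hours), and the exact thresholds used as defaults are assumptions made for illustration; the slides state the goal only as "on the order of" these values.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BurnInStatus:
    total_hours: float         # accumulated powered test time
    failure_free_hours: float  # time since the most recent failure
    meets_goal: bool           # True if both thresholds are satisfied


def burn_in_status(
    total_hours: float,
    failure_times: Sequence[float],
    goal_total: float = 1000.0,        # assumed reading of "on the order of 1000 hours"
    goal_failure_free: float = 100.0,  # assumed reading of "last 100 failure free"
) -> BurnInStatus:
    """Check accumulated test time against a burn-in goal.

    failure_times holds the accumulated-hours value at which each
    failure was logged (hypothetical log format).
    """
    if total_hours < 0:
        raise ValueError("total_hours must be non-negative")
    if any(t < 0 or t > total_hours for t in failure_times):
        raise ValueError("failure times must lie within the accumulated test time")

    # With no failures logged, the whole test time counts as failure free.
    last_failure = max(failure_times, default=0.0)
    failure_free = total_hours - last_failure
    meets = total_hours >= goal_total and failure_free >= goal_failure_free
    return BurnInStatus(total_hours, failure_free, meets)


if __name__ == "__main__":
    # 1050 h accumulated, last failure at 930 h: 120 h failure free, goal met.
    print(burn_in_status(1050.0, [120.0, 930.0]))
    # Same total, but a failure at 980 h leaves only 70 h failure free.
    print(burn_in_status(1050.0, [120.0, 980.0]))
```

A failure late in the test program resets the failure-free clock in this sketch, which is one plausible reading of the goal; a project could equally choose to count only failures attributed to the flight hardware.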
