Fault, error, failure of functional safety
2022-07-06 06:03:00 【Zhan Miao】
Some concepts in functional safety are easy to confuse, for example fault, error, and failure. This article discusses these three concepts.
1. Fault
In functional safety, a fault is defined as an abnormal condition that can cause an element or an item to fail.
Faults can be divided into permanent faults and non-permanent faults; the classification is shown in the figure below.
Permanent fault: a fault that occurs and persists until it is removed or repaired. In other words, once a permanent fault occurs, corrective measures must be taken to restore normal operation. Systematic faults are generally permanent faults.
Non-permanent fault: these can be divided into intermittent faults and transient faults. An intermittent fault occurs repeatedly and then disappears again. Intermittent faults may occur when a component is on the verge of breakdown, or because of switching surges (abrupt transient voltage changes). Some systematic faults (for example timing problems) may also cause intermittent faults.
Transient fault: a fault that occurs once and then disappears. Transient faults can be caused by electromagnetic interference, which can flip bits. Soft errors caused by a single event upset (SEU) or a single event transient (SET) are transient faults. (A single event upset (SEU) occurs when a single high-energy particle from space strikes the sensitive region of a semiconductor device and inverts the logic state of the device.)
2. Error
ISO 26262 defines an error as a discrepancy between a computed, observed or measured value or condition and the true, specified or theoretically correct value or condition. An error can arise from unforeseen operating conditions or from a fault inside the system, subsystem or component under consideration. A fault can manifest itself as an error within the element considered, and that error can ultimately cause a failure.
For example, when a single high-energy particle from space strikes the sensitive region of a semiconductor device, the resulting single event upset (SEU) inverts the logic state of a memory cell, so that one bit used by the software changes from 0 to 1 or from 1 to 0. This is a soft error (the hardware is not damaged).
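The bit flip described above can be sketched in a few lines. This is only an illustration: the helper name `flip_bit`, the 8-bit word width and the stored value are assumptions chosen for the example, not something the article specifies.

```python
def flip_bit(value: int, bit: int, width: int = 8) -> int:
    """Return `value` with one bit inverted, as a single event upset would leave it."""
    if not 0 <= bit < width:
        raise ValueError("bit index is outside the word")
    mask = (1 << width) - 1
    return (value ^ (1 << bit)) & mask


stored = 0b0001_0100             # value the software wrote (hypothetical)
read_back = flip_bit(stored, 6)  # bit 6 changes from 0 to 1
error = read_back - stored       # discrepancy between observed and correct value
```

Here the fault is the upset in the memory cell, and the error is the difference between `read_back` and `stored`. Whether it becomes a failure depends on how the value is used afterwards.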
From the above, the general relationship between fault, error and failure is that a fault can cause an error, and an error can lead to a failure. More details are given below.
3. Failure
ISO 26262 defines a failure as the termination of the ability of an element to perform a function as required.
(English: termination of the ability of an element to perform a function as required)
Note: an incorrect specification is one source of failure.
Failure here means the loss or termination of a function. For example, one of the main functions of a motor controller is to control the torque and speed of the motor according to the torque request from the vehicle control unit (VCU), so an output torque that is unexpectedly too high or too low is a failure.
3.1. Systematic failure and random hardware failure
In functional safety, failures are divided into two kinds according to their cause: systematic failures and random hardware failures. The main purpose of ISO 26262 is to eliminate these two kinds of failure as far as possible.
(1) Systematic failure (systematic failure)
A failure related in a deterministic way to a certain cause, which can only be eliminated by changing the design, the production process, the operating procedures, the documentation or other relevant factors.
Systematic failures have three characteristics:
A- Correct maintenance alone, without a modification, cannot eliminate the failure.
B- The failure can be reproduced by recreating its cause.
C- It results from human error. Examples of causes: errors in the safety requirements specification; errors in the design, manufacture, installation or operation of the hardware; errors in software design and implementation.
Software failures and some hardware failures are systematic failures. For example, suppose the wrong data type is chosen during coding: a variable (say with a resolution of 1 and an offset of 0) should have been declared as U16 but was declared as U8, so the calculated value can never exceed 255. This software bug is a systematic failure.
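The U8/U16 bug can be emulated as follows. This is a minimal sketch: it assumes the stored value wraps modulo 2^n, as an unsigned conversion does in C. The article only says that the value cannot exceed 255, so the actual code may just as well saturate. The helper `store_unsigned` and the raw value 300 are made up for illustration.

```python
U8_MAX = 0xFF
U16_MAX = 0xFFFF


def store_unsigned(raw: int, max_value: int) -> int:
    """Emulate writing `raw` into a fixed-width unsigned variable (wraps around)."""
    return raw & max_value


raw_value = 300  # hypothetical physical value, resolution 1 and offset 0
as_u8 = store_unsigned(raw_value, U8_MAX)    # wrong type: value is corrupted
as_u16 = store_unsigned(raw_value, U16_MAX)  # intended type: value is preserved
```

The same input always produces the same wrong result, which is what makes the failure systematic: it is reproducible and disappears only when the code is changed.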
(2) Random hardware failure (random hardware failure)
According to ISO 26262, a random hardware failure is a failure that can occur unpredictably during the lifetime of a hardware element and that follows a probability distribution. Its rate can be predicted with reasonable accuracy.
Unpredictable occurrence means that even though the hardware design is correct (the component selection, the resistance and capacitance values and the circuit design are all correct) and the devices meet the quality standard, it cannot be predicted where and in what form a failure will occur.
Following a probability distribution means that failures can be predicted with reasonable accuracy, for example by obtaining the failure rate from reliability data or analysis.
Random hardware failures are caused by physical processes such as fatigue, physical degradation or environmental stress. Examples are the bit flip mentioned above, an open circuit or short circuit of a resistor, and resistance drift.
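As an illustration of "follows a probability distribution", the sketch below assumes a constant failure rate (an exponential model), which is a common simplification and not something the article prescribes. The rate is given in FIT (failures per 10^9 device hours); the 100 FIT and 10,000 hour figures are invented for the example.

```python
import math


def failure_probability(fit: float, hours: float) -> float:
    """Probability of at least one failure within `hours`, for a constant rate in FIT."""
    rate_per_hour = fit * 1e-9
    return 1.0 - math.exp(-rate_per_hour * hours)


# Hypothetical component: 100 FIT, observed over 10,000 operating hours.
p = failure_probability(fit=100.0, hours=10_000.0)
```

The model cannot say which part will fail or when, only how likely a failure is over a given operating time.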
3.2. Dependent failure and independent failure
Functional safety also defines dependent failures and independent failures.
Dependent failures are failures whose probability of occurring simultaneously or successively cannot be expressed as the simple product of the unconditional probabilities of each failure. For example, if the probability that failure A and failure B occur together is not equal to the product of the two individual failure probabilities, that is P_AB ≠ P_A × P_B, then failures A and B are dependent failures. Conversely, for independent failures the joint probability is the simple product of the unconditional probabilities, P_AB = P_A × P_B.
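The product rule can be checked numerically. The sketch below is illustrative only: the function name `is_dependent` and all probability values are assumptions, with the dependent case imagined as two elements that share a common cause.

```python
import math


def is_dependent(p_a: float, p_b: float, p_ab: float) -> bool:
    """True when the joint probability differs from the product P_A * P_B."""
    return not math.isclose(p_ab, p_a * p_b, rel_tol=1e-9)


p_a, p_b = 1e-3, 2e-3          # hypothetical unconditional failure probabilities
joint_independent = p_a * p_b  # what independence would predict
joint_observed = 5e-4          # hypothetical joint probability with a shared cause

independent_case = is_dependent(p_a, p_b, joint_independent)
dependent_case = is_dependent(p_a, p_b, joint_observed)
```

With these numbers the observed joint probability is far larger than the product, so A and B would be treated as dependent failures.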
Dependent failures can be divided into common cause failures and cascading failures.
A common cause failure (CCF) is the failure of two or more elements of an item resulting from a single specific event or root cause, as shown in the figure below.
CCF can be avoided through diverse software and hardware design.
A cascading failure is the failure of one element that causes one or more other elements to fail.
Software partitioning, for example, can avoid cascading failures. In practice, storing the level 1 and level 2 variables in different RAM or NVRAM areas is one way of partitioning.
4. Hardware fault types
Hardware faults can be divided into the following types, as shown in the figure below:
(1) Safe fault:
A safe fault is a fault whose occurrence does not significantly increase the probability of violating a safety goal (ISO 26262). Safe faults fall into two categories: a) faults unrelated to the violation of a safety goal; b) all n-point faults with n > 2 (unless the safety concept shows that they are relevant to the violation of a safety goal).
Example 1: a flash memory protected by EDC and a cyclic redundancy check (CRC). A single-bit fault corrected by the EDC is not signalled. The EDC prevents the fault from violating the safety goal, but the fault is not signalled. If the EDC logic fails, the fault is detected by the CRC and the system is shut down. The safety goal is violated only when a single-bit fault in the flash, a failure of the EDC logic and a failure of the CRC checksum monitoring are all present (n=3).
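The CRC layer in this example can be sketched as below. It covers only the checksum comparison and does not model the EDC. Using CRC-32 from Python's `zlib` module is an assumption made for illustration; the article does not say which polynomial or width a real flash monitor uses, and the image contents are invented.

```python
import zlib

# Hypothetical flash contents and the reference checksum stored alongside them.
flash_image = bytes(range(16))
reference_crc = zlib.crc32(flash_image)

# Emulate a single-bit fault that the EDC failed to correct.
corrupted = bytearray(flash_image)
corrupted[3] ^= 0x01


def crc_ok(data: bytes, expected: int) -> bool:
    """Return True when the checksum of `data` matches the stored reference."""
    return zlib.crc32(data) == expected


intact_passes = crc_ok(flash_image, reference_crc)
fault_detected = not crc_ok(bytes(corrupted), reference_crc)
```

In the scenario from the article, a failed check would lead to the system being shut down; the reaction itself is outside this sketch.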
(2) Single-point fault:
A single-point fault is a fault that is not covered by any safety mechanism and leads directly to the violation of a safety goal (ISO 26262).
An example is a single-point fault in the insulation resistance of an electric vehicle's REESS (rechargeable energy storage system). Insulation resistance is the resistance between the terminals of voltage class B live parts (generally high voltage above 60 V) and the electrical chassis. Ageing or damage of the insulation material, water entering the battery system in rain or during car washing, a vehicle collision and similar events reduce the insulation resistance and can cause electric shock. The normal value is Ri > 100 Ω/V. A reduced insulation resistance leads directly to a risk of electric shock, so this is a single-point fault.
(3) Residual fault:
A residual fault is the portion of a fault occurring in a hardware element that is not covered by a safety mechanism (ISO 26262). The occurrence of a residual fault leads directly to the violation of a safety goal. For example, if the diagnostic coverage claimed for a failure mode is low, say 60%, then the remaining 40% is residual fault.
ISO 26262 Part 10 gives an example: if a random access memory (RAM) module is checked only by a checkerboard RAM test as the safety mechanism, certain kinds of bridging faults are not detected. The safety mechanism cannot prevent the violation of the safety goal caused by these faults. These faults are residual faults.
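The 60% / 40% split can be written as simple arithmetic: the residual portion of a failure rate is the rate multiplied by (1 minus the diagnostic coverage). The sketch below assumes a failure mode rate of 100 FIT, which is invented for illustration; only the 60% coverage comes from the text.

```python
def residual_failure_rate(failure_rate_fit: float, coverage: float) -> float:
    """Portion of a failure rate left uncovered by the safety mechanism."""
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0 and 1")
    return failure_rate_fit * (1.0 - coverage)


# Hypothetical failure mode of 100 FIT with 60% diagnostic coverage.
residual = residual_failure_rate(100.0, 0.60)
```

With these inputs about 40 FIT remain as residual faults.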
(4) Multiple-point fault:
A multiple-point fault is an individual fault that, in combination with other independent faults, leads to a multiple-point failure (ISO 26262).
Note: a multiple-point fault can only be identified after the multiple-point failure has been identified, for example through fault tree analysis (FTA) (ISO 26262). A dual-point fault is a case where two independent faults occurring together lead to a failure.
(5) Latent fault:
A latent fault is a multiple-point fault that is neither detected by a safety mechanism nor perceived by the driver within the multiple-point fault detection interval (ISO 26262).
In other words, a multiple-point fault that is neither detected by a safety mechanism nor noticed by the driver within a certain period of time is called a latent fault. Examples:
A- A fault in the monitoring chip.
B- A fault in the safety mechanism itself while the monitored function still works normally.
A latent fault is a multiple-point fault; combined with other independent faults, it leads to the violation of a safety goal.
(6) Detected fault (detected fault)
A detected fault is a fault that is detected within a specified time by a safety mechanism, which prevents the fault from becoming a latent fault.
Example: a fault detected by a dedicated safety mechanism defined in the functional safety concept (for example, one that detects the error and informs the driver through a warning device on the instrument panel).
(7) Perceived fault (perceived fault)
A perceived fault is a fault whose presence is deduced by the driver within a specified time interval. Example: a fault that can be perceived directly through obvious system behaviour or a limitation of performance.
Perceived means perceived by the driver: whether or not a safety mechanism detects the fault, its occurrence noticeably affects the function.
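The definitions in this section can be condensed into a small decision function. This is a simplified sketch built only from the wording above, not the classification flow of the standard; the names `FaultClass` and `classify` and all parameters are hypothetical.

```python
from enum import Enum
from typing import Optional


class FaultClass(Enum):
    SAFE = "safe fault"
    SINGLE_POINT = "single-point fault"
    RESIDUAL = "residual fault"
    DETECTED = "detected multiple-point fault"
    PERCEIVED = "perceived multiple-point fault"
    LATENT = "latent multiple-point fault"


def classify(order: Optional[int], mechanism_present: bool = False,
             prevented: bool = False, detected: bool = False,
             perceived: bool = False) -> FaultClass:
    """Classify a hardware fault.

    `order` is the number of independent faults needed to violate the safety
    goal when no safety mechanism is considered (None if unrelated to it).
    """
    if order is None or order > 2:
        return FaultClass.SAFE
    if order == 1 and not mechanism_present:
        return FaultClass.SINGLE_POINT
    if order == 1 and not prevented:
        return FaultClass.RESIDUAL  # portion the mechanism does not cover
    # Remaining cases need a second fault to violate the safety goal.
    if detected:
        return FaultClass.DETECTED
    if perceived:
        return FaultClass.PERCEIVED
    return FaultClass.LATENT
```

For instance, `classify(1)` yields a single-point fault, while `classify(2)` with nothing detected or perceived yields a latent fault.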
5. The relationship between faults, errors and failures
The relationship between faults, errors and failures is shown in the following figure. For three different types of cause (systematic software issues, random hardware issues and systematic hardware issues), the figure describes the progression from fault to error and from error to failure.
Systematic faults are caused by design and specification issues; software faults and some hardware faults are systematic.
Random hardware faults are caused by physical processes such as fatigue, physical degradation or environmental stress.
At the component level, each different type of fault leads to a different failure. However, a failure at the component level is a fault at the item level.
Note that in this example, faults from different causes can lead to the same failure at the vehicle level. A subset of the failures at the item level will be hazards if additional environmental factors allow the failure to contribute to an accident scenario.