1. Introduction
Renewable energy systems such as solar and wind have been expanding rapidly in both industrial and residential applications worldwide. Among these, PV-based generation systems stand out for being clean, quiet, free of rotating parts, and reliable in operation. The International Energy Agency-Photovoltaic Power Systems Program (IEA-PVPS) 2021 report shows that global PV capacity reached a milestone of 200 GW in 2015. Approximately 140 GW of PV systems were installed in 2020, and the total worldwide PV capacity reached well above 750 GW by the end of 2020 [1]. These records indicate that solar energy exploitation has increased remarkably in recent years.
PV systems require inverters to interface the PV panels to the loads and/or the grid. PV inverters are categorized into isolated and non-isolated types. Isolated inverters use transformers on the dc and/or ac side to provide galvanic isolation, so they block the leakage current in the panel, prevent electric shock, and ensure safety. However, non-isolated transformerless inverters are at the center of attention in residential and medium-power applications as they offer higher power density and efficiency as well as lower weight and cost [2,3,4,5,6,7]. Voltage Source Inverters (VSIs) are more attractive than their current-source counterparts due to better power density, higher efficiency, and lower cost [8]. Numerous single-phase transformerless inverters for PV applications have been introduced in the literature, and efforts have been made to catalog and review these topologies [8,9,10,11,12].
The improvements in size, weight, and cost that transformerless PV inverters provide come at the price of losing isolation and suffering from higher common-mode currents. A group of transformerless PV inverters, categorized as H-bridge-based in [8], imitate isolated inverters to some degree and provide partial isolation by using switches to decouple the dc side from the ac side. The HERIC and H5 inverters are two examples of this family and have been successfully commercialized [13,14]. The HERIC uses two switches on the ac side, and the H5 uses one switch on the dc side, to decouple the dc side from the ac side during a zero-voltage state. This decoupling prevents current flow to the dc side during the current freewheeling period and effectively reduces the common-mode current, electromagnetic interference (EMI), total harmonic distortion (THD), and the output filter size.
High reliability and fault-tolerant capability are important for mission-critical PV systems, such as those on a space station, and for applications without access to affordable maintenance. For example, the need for highly reliable power converters in remote areas is pressing, since a fault in these systems may result in highly expensive or catastrophic malfunctions. Therefore, reliability in such applications is given a much higher priority than the overall cost of the system [15], as maintenance might prove difficult, if not impossible, to carry out. Even though faults might occur in both PV arrays and inverters, it is reported in the literature that inverters are the most failure-prone parts of PV systems [16,17]. For instance, the Mean Time Between Repairs (MTBR) is guaranteed to be more than 20 years for modern PV arrays, while this parameter can drop as low as 2 years for an inverter [18,19]. Consequently, a number of fault-tolerant PV inverter topologies have been developed to mitigate the reliability concerns associated with PV systems [18,20,21,22,23,24,25]. A spare-inverter approach was proposed for multi-inverter systems so that the additional inverter can replace the faulty inverter until it is repaired [18]. This method, while applicable to multi-inverter PV systems, may not be cost-effective when the PV system consists of only a single inverter, as it significantly increases the cost of the power converter implementation. The question of using a spare inverter versus a fault-tolerant inverter to enhance reliability needs further investigation that is beyond the scope of the present work. The method presented in [20] adds four TRIACs to an H-bridge to provide it with fault tolerance and reconfiguration capabilities. The use of TRIACs adds to the complexity of the driver circuits, and their low immunity to dv/dt calls for an additional dv/dt filter, which increases the cost and size of the circuit. There are also fault-tolerant methods for multilevel PV inverters that are not applicable to two- or three-level single-phase PV inverter systems [21,22,23,24,25]. Hence, there is still a great need to study fault-tolerant PV inverters that further enhance reliability and alleviate common-mode currents. The common-mode current in grid-tied PV systems is directly related to the stray capacitance of the PV panel and to the switching frequency [26,27].
To address this challenge, a new single-phase fault-tolerant PV inverter is proposed in this paper. The proposed inverter, called the Integrated Fault-tolerant PV Inverter (IFTPVI), uses a redundant switch-leg, a fault-managing unit (FMU), and three electromechanical relays. In normal operation, the IFTPVI takes the form of a HERIC inverter. Under a faulty condition, the redundant switch-leg replaces the faulty switch-leg. If the bidirectional switch fails, the IFTPVI is reconfigured and takes the form of an H5 inverter. All of these reconfiguration actions are carried out by the electromechanical relays.
The rest of this paper is organized as follows. The configuration of the proposed inverter is described in Section 2, where the healthy and faulty conditions are discussed in detail. The fault diagnosis and elimination algorithm is given in Section 3, which also illustrates the different parts of the proposed fault-tolerant structure. The reliability evaluation and Mean Time to Failure (MTTF) analysis are provided in Section 4, where the reliabilities and MTTFs of the HERIC, H5, and proposed inverters are thoroughly investigated. The numerical results and comparisons for the reliability and MTTF analysis are given in Section 5, and the simulation and experimental results are provided in Section 6. Finally, the overall work is concluded in Section 7.
4. Reliability and MTTF Analysis
One of the most prominent analytical techniques [28] for reliability analysis and evaluation is the Markov chain approach. The Markov method is applicable to memoryless, stationary systems.
As stated in [29,30,31], the lifetime of a power electronic element can be divided into three periods. The first period, often referred to as the debugging phase, is the period in which the device is most likely to fail; the failure rate is high due to manufacturing errors, improper design, or operator errors. The hazard rate in this phase decreases with time. The debugging phase may be short or non-existent for power electronic converters. The second period is referred to as the normal operating phase. The hazard rate in this period is almost constant, and failures occur by chance. This is the only phase in which the exponential distribution is valid. The third period is termed the fatigue phase, in which the hazard rate increases with time.
To determine the reliability of an inverter, a Markov chain model of the inverter is first developed. Then, the transition rates of the model are extracted from the failure rates of the components used. Next, the differential equations governing the probability of the system being in each operational state of the Markov chain model are formed. Finally, the Mean Time to Failure (MTTF) and the reliability of the inverter, which is the sum of the time-dependent probabilities of the operating states in the Markov chain, are calculated. In the following subsections, the reliability analysis of the conventional HERIC and H5 inverters and of the proposed fault-tolerant PV inverter topology is presented.
4.1. Reliability of the Conventional HERIC and H5 Inverter
The conventional HERIC and H5 inverters are both two-state systems [30]. The Markov chain model for these two inverters is shown in Figure 9, where State I is the Up state (operating state) and State II is the Down state (failed or absorbing state).
To obtain a more realistic model, the capacitor C in parallel with the PV panels is also taken into consideration, as shown in Figure 2. The transition rates of the Markov chain models of the HERIC (α) and H5 (β) inverters are, respectively, given as
where,
,
, and
represent the failure rates of the switch, diode, and dc-link capacitor, respectively. For simplicity, all devices are assumed to be in their normal operating phase, so their reliabilities can be described by exponential distributions [29].
Initially, both the HERIC and H5 converters are assumed to begin operation in a healthy condition. Thus, the initial condition matrix is
The time-dependent Up-state probabilities for the HERIC and H5 are given by Equations (4) and (5), respectively.
Referring to Equations (1)–(5), the reliabilities of the conventional HERIC and H5 are obtained as
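As a worked illustration of the two-state case, the derivation below assumes the notation P_1(t) for the Up-state probability (this symbol name is an assumption for illustration and is not taken from the original equations). With a constant transition rate α and a healthy start, the reliability reduces to a single exponential:

```latex
\begin{aligned}
\frac{dP_1(t)}{dt} &= -\alpha\,P_1(t), \qquad P_1(0) = 1,\\
P_1(t) &= e^{-\alpha t},\\
R_{\mathrm{HERIC}}(t) &= P_1(t) = e^{-\alpha t},\\
\mathrm{MTTF}_{\mathrm{HERIC}} &= \int_0^{\infty} e^{-\alpha t}\,dt = \frac{1}{\alpha}.
\end{aligned}
```

The H5 case follows in the same way with α replaced by β.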
4.2. Reliability of the Proposed Fault-Tolerant Topology
The proposed fault-tolerant PV inverter can keep operating regardless of the kind of fault (OC or SC) in any of its semiconductor switches. The Markov chain model for the IFTPVI is displayed in Figure 10. State I is the healthy state, in which all elements and switches of the converter are healthy and operational. In State II, one of the H-bridge switches in the initial HERIC configuration has failed and its corresponding leg has been replaced with the reserve leg. State III represents the situation in which one of the two switches in the bidirectional switch has failed and has therefore been replaced by Sr1. Finally, State IV is the absorbing state. Once the converter enters the absorbing state, it can no longer operate with its original characteristics or cannot operate at all. States I–III are referred to as Up states, as the converter maintains normal operation in those states. Conversely, State IV is known as the Down state, since the converter is considered failed in this state.
The transition rates for the Markov chain model given in Figure 10 are
where
and
are, respectively, the failure rates of the switch and diode for the IFTPVI after a fault has occurred and the converter has been reconfigured to operate as a HERIC or H5 inverter.
is the transition rate from state i to state j in the Markov chain model of the IFTPVI shown in Figure 10.
Pr is the probability that the relay successfully removes the faulty part and replaces it with the reserve one. The failure rate of a relay is a function of its number of on/off cycles. Thus, if a relay is switched only once in its lifespan, which is the case in the IFTPVI, its hazard rate can be assumed to be zero [32]. It is therefore assumed that once the relay is turned on, it continues to operate properly. In this reliability analysis, the value of Pr is assumed to be 0.98 instead of 1 in order to have a more realistic analysis [28,31]. The maximum contact resistance of Omron MM series electromechanical power relays is 25 mΩ. Such small contact resistances have a negligible effect on the current and losses and are therefore ignored in the failure rate analysis. It is also worth noting that the internal resistances of the reserve switches are assumed to be 5% higher than those of the main switches. This assumption accounts for component variations to a degree and ensures that State I and State II of the Markov chain model in Figure 10, which respectively correspond to the IFTPVI operating as a HERIC pre-fault and post-fault, are not rendered identical by the above simplifications.
Each state change corresponds to the occurrence of a fault in the PV inverter. The IFTPVI can tolerate an SC or OC fault in any of the semiconductor switches as long as the corresponding relay functions properly after the fault is detected by the FMU. Consequently, if a fault occurs while the IFTPVI is in State I and the relay responsible for bypassing and replacing the faulty switch fails to switch correctly, the inverter ceases to operate and moves to State IV, the absorbing state of the IFTPVI Markov chain model. After the first fault has occurred and the system has been reconfigured once, the two remaining scenarios for entering the absorbing state are a second fault in any of the switches and a dc-link capacitor failure. Therefore, the capacitor failure rate should be taken into account in the transition rates. The system is considered initially operational; thus, all the assumptions in Section 4.1 are valid here as well. The initial condition matrix is given as
The time-dependent probabilities for each of the states are determined using Equations (8)–(14).
As discussed, the (Up) state probabilities in the Markov model in
Figure 10 are
. Therefore, the reliability of the proposed fault-tolerant PV inverter is obtained as
Solving the differential equations in Equation (14) yields the time-dependent probabilities of the operating (Up) states as
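The state probabilities can also be obtained numerically. The sketch below is a minimal illustration, assuming a four-state chain with transitions I→II, I→III, I→IV, II→IV, and III→IV, as the description of Figure 10 suggests. The function names and all rate values are hypothetical placeholders, not the transition rates derived in this paper. It integrates the state equations with a fourth-order Runge-Kutta step and reports the reliability as the sum of the Up-state probabilities.

```python
# Sketch: integrate the Markov state equations for a four-state chain shaped
# like the IFTPVI model (States I-III are Up, State IV is Down).
# ASSUMPTION: the transition structure and every rate value are illustrative.

def derivative(p, rates):
    """Return dP/dt for the state probabilities p = [P1, P2, P3, P4]."""
    l12, l13, l14, l24, l34 = rates
    p1, p2, p3, _ = p
    return [
        -(l12 + l13 + l14) * p1,          # leaving the healthy state
        l12 * p1 - l24 * p2,              # leg replaced, then second fault
        l13 * p1 - l34 * p3,              # bidirectional switch replaced
        l14 * p1 + l24 * p2 + l34 * p3,   # inflow to the absorbing state
    ]

def rk4_step(p, rates, h):
    """Advance the probabilities by one Runge-Kutta step of size h hours."""
    k1 = derivative(p, rates)
    k2 = derivative([x + 0.5 * h * k for x, k in zip(p, k1)], rates)
    k3 = derivative([x + 0.5 * h * k for x, k in zip(p, k2)], rates)
    k4 = derivative([x + h * k for x, k in zip(p, k3)], rates)
    return [x + h / 6.0 * (a + 2 * b + 2 * c + d)
            for x, a, b, c, d in zip(p, k1, k2, k3, k4)]

def reliability_curve(rates, t_end, steps):
    """Return [(t, R(t))] samples and the final state probabilities."""
    h = t_end / steps
    p = [1.0, 0.0, 0.0, 0.0]              # system starts in the healthy state
    curve = [(0.0, 1.0)]
    for n in range(1, steps + 1):
        p = rk4_step(p, rates, h)
        curve.append((n * h, p[0] + p[1] + p[2]))  # reliability = sum of Up states
    return curve, p

# Hypothetical rates in failures per hour: (l12, l13, l14, l24, l34).
HYPOTHETICAL_RATES = (4e-6, 2e-6, 1e-6, 8e-6, 7e-6)
curve, final_p = reliability_curve(HYPOTHETICAL_RATES, t_end=200_000.0, steps=2000)
```

Because the four derivatives sum to zero, the probabilities remain normalized at every step, which is a convenient sanity check on any set of transition rates substituted into the sketch.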
4.3. MTTF Analysis of the IFTPVI, HERIC and H5 Inverters
Using the coefficient matrices in Equations (4), (5), and (14), the stochastic transitional probability matrix P can be defined [29] for the HERIC, H5, and IFTPVI. In matrix P, each element corresponds to the probability of transition to state j after being in state i for a non-zero time interval.
The truncated stochastic transitional probability matrix Q can be obtained by omitting the row and column associated with the absorbing state in the stochastic transitional probability matrices given in Equations (19)–(21). The absorbing state for the HERIC and H5 is State II, and the absorbing state for the IFTPVI is State IV. Therefore, their corresponding truncated stochastic transitional probability matrices are
The MTTF is defined as the average time before the converter enters the absorbing state. It can be calculated either by integrating the reliability over time from the start of converter operation to infinity or by using the stochastic transitional probability matrix method. The latter obtains the MTTF by inverting the matrix formed by subtracting the truncated matrix Q from the corresponding identity matrix [29]. The MTTF matrices (M) of the HERIC and H5 inverters are, accordingly, calculated as
For the sake of brevity, the stochastic transitional probability matrix method is used to calculate the MTTF matrix (M) for the IFTPVI.
In the matrix M, Mpq indicates the average number of hours spent in state q given that the system began operation in state p [29]. Consequently, the MTTF for the IFTPVI is obtained as
This is the sum of the average number of hours spent in each of the Up states (I–III), given that the converter begins operation in State I, the healthy state.
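The matrix method can be illustrated with a short sketch. It assumes a one-hour time step, so that each off-diagonal entry of Q equals a failure rate in failures per hour, and it reuses the assumed transition structure I→II, I→III, II→IV, III→IV. The function names and rate values are hypothetical and are not the values used in this paper. The code builds Q, inverts I − Q by Gauss-Jordan elimination, and sums the first row of M.

```python
# Sketch: MTTF from the matrix M = (I - Q)^-1 for a chain with three Up states.
# ASSUMPTION: one-hour time step and illustrative rates (failures per hour).

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def mttf_from_rates(l12, l13, l14, l24, l34):
    """Return (MTTF in hours, M) when the system starts in State I."""
    # Truncated matrix Q: only the Up states I, II, III are kept.
    q = [
        [1.0 - (l12 + l13 + l14), l12, l13],
        [0.0, 1.0 - l24, 0.0],
        [0.0, 0.0, 1.0 - l34],
    ]
    i_minus_q = [[(1.0 if i == j else 0.0) - q[i][j] for j in range(3)]
                 for i in range(3)]
    m = invert(i_minus_q)
    # First row of M: expected hours spent in each Up state, starting from State I.
    return sum(m[0]), m

mttf_hours, m_matrix = mttf_from_rates(4e-6, 2e-6, 1e-6, 8e-6, 7e-6)
```

For this upper-triangular structure the result can be checked by hand: the first row of M is 1/λ1, λ12/(λ1·λ24), and λ13/(λ1·λ34), where λ1 = λ12 + λ13 + λ14 is the total rate of leaving State I.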
4.4. Failure Rate Analysis
Failure rate analysis is necessary to obtain numerical results for the reliability and MTTF of the converter. The method reported in [30] is used to calculate the failure rates of the power electronic circuit components. Even though the approach in [30] has been criticized [31], it is still used to compare relative MTTF improvements [32]. According to [30], the failure rates of power electronic elements can be estimated using Equation (29).
where
,
, and
are, respectively, the failure rate of a component, the base failure rate of the component, and the factors that modify the failure rate of the component. These factors may vary between circuit components. The factors affecting the power electronic components considered in this work are given in Table 2 [30].
Equations (30)–(32) provide the formulas for calculating πT for diodes, MOSFETs, and capacitors, respectively [30].
where
Ta is the ambient temperature, which is assumed to be 25 °C. Moreover,
(°C) is the junction temperature of the MOSFET or diode and it is calculated as
where
,
, and
are the case-to-ambient thermal resistance (°C/W), the junction-to-case thermal resistance (°C/W), and the power losses (W), respectively. Other factors affecting the failure rates are given in Table 3.
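The two steps described above, estimating the junction temperature and scaling a base failure rate by its modifying factors, can be sketched as follows. This is a minimal illustration that assumes the usual series thermal model (ambient temperature plus the total thermal resistance multiplied by the power loss) and a simple product of factors. The function names, factor names, and numbers are hypothetical and are not taken from Table 2 or Table 3.

```python
# Sketch: junction temperature and component failure rate.
# ASSUMPTION: series thermal model and product-of-factors failure rate;
# all numeric values below are illustrative only.
from math import prod

def junction_temperature(t_ambient_c, r_case_ambient, r_junction_case, p_loss_w):
    """Junction temperature in deg C from thermal resistances (deg C/W) and losses (W)."""
    return t_ambient_c + (r_case_ambient + r_junction_case) * p_loss_w

def part_failure_rate(base_rate, pi_factors):
    """Scale the base failure rate by the product of all modifying factors."""
    return base_rate * prod(pi_factors.values())

# Hypothetical MOSFET: 25 deg C ambient, 1.0 + 0.5 deg C/W, 10 W of losses.
t_junction = junction_temperature(25.0, 1.0, 0.5, 10.0)

# Hypothetical base rate and factors (temperature, quality).
failure_rate = part_failure_rate(0.012, {"pi_T": 2.0, "pi_Q": 5.0})
```

In practice the temperature factor would itself be computed from the junction temperature, so the two functions are applied in sequence for each switch and diode.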
According to [34], the conduction power losses of a MOSFET and a diode can be calculated using the electrical models shown in Figure 11.
where VT and VF are the MOSFET threshold voltage and the diode forward voltage, and
and
are the MOSFET and diode on-state resistances, respectively. These parameters are extracted from the component datasheets. Equations (34) and (35) give the average power losses of the diode and MOSFET, respectively [30,31,32,33,34,35,36,37,38,39]:
where
,
,
,
,
and
are the instantaneous MOSFET or diode current, average conduction losses, average switching losses, MOSFET output capacitance, switching frequency, and the applied drain-source voltage, respectively. The conduction losses are calculated as the average losses in the MOSFET or diode caused by the on-state voltage and internal resistance. The MOSFET average switching loss is given by the second term in Equation (35) [30,31,32,33,34,35,36,37,38,39], and the diode switching loss is neglected because it is small.
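The loss terms described above can be sketched numerically. The example assumes a device modeled as a constant on-state voltage in series with an on-state resistance, so the instantaneous conduction loss is v_on·i + r_on·i², and it assumes a switching-loss approximation proportional to the output capacitance, the squared drain-source voltage, and the switching frequency. The prefactor of 0.5, the function names, and all numbers are assumptions for illustration and are not the exact form of Equations (34) and (35).

```python
# Sketch: average conduction loss and an approximate MOSFET switching loss.
# ASSUMPTION: device model, 0.5 prefactor, and all values are illustrative.
import math

def average_conduction_loss(v_on, r_on, current_samples):
    """Mean of v_on*i + r_on*i^2 over equally spaced samples of one period."""
    total = sum(v_on * i + r_on * i * i for i in current_samples)
    return total / len(current_samples)

def switching_loss(c_oss, v_ds, f_sw, prefactor=0.5):
    """Output-capacitance switching loss; the prefactor is an assumption."""
    return prefactor * c_oss * v_ds ** 2 * f_sw

def half_sine_samples(i_peak, n=1000):
    """Device conducts a half sine for half of the period and zero otherwise."""
    return [max(0.0, i_peak * math.sin(2.0 * math.pi * k / n)) for k in range(n)]

# Hypothetical device: 1 V on-state voltage, 0.1 ohm, 10 A peak current.
p_cond = average_conduction_loss(1.0, 0.1, half_sine_samples(10.0))

# Hypothetical MOSFET: 100 pF output capacitance, 400 V, 20 kHz.
p_sw = switching_loss(100e-12, 400.0, 20e3)
```

For the half-sine current the average is I_peak/π and the mean square is I_peak²/4, which gives a quick analytical check of the sampled result.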