1. Introduction
Monolithic power integrated circuits (ICs) are popular in small- and medium-power applications such as power modules for LED lighting and portable devices [1,2,3]. High-voltage (HV) switching metal-oxide-semiconductor field-effect transistors (MOSFETs) and low-voltage (LV) control circuits are integrated on a single chip to reduce both the circuit footprint and the IC package cost. Both the static power dissipation and the power efficiency can be improved accordingly. However, thermal management and reliability are significant challenges [4,5,6,7]. Most monolithic power ICs use lateral double-diffused MOSFETs (LDMOSs) as the switching devices. They are inferior to discrete switches, such as vertical double-diffused MOSFETs (DMOSs) or insulated-gate bipolar transistors (IGBTs), in that they have larger turn-on resistances Ron [8]. A larger Ron implies greater power dissipation, Pd = IL² × Ron, where IL is the load current. However, the temperature distribution is affected by many factors. Because a discrete power MOSFET IC contains more than 10,000 transistors, and the transistor lifetime decreases once the channel temperature exceeds 75 °C, on-chip channel temperature sensing is necessary to optimize the operation of the power MOSFET.
A high-voltage power MOSFET that is constructed as a parallel bank of multiple smaller transistors is called a large array device. The transistors are arranged so that the load current is distributed uniformly among them. Many layout styles, including multi-finger, waffle, overlapping circular gate, wave, and diamond styles, have been developed. However, a uniform distribution of the load current is still difficult to guarantee [9].
Figure 1 shows an infrared thermal photograph of a 45 V N-type power MOS with a power dissipation of 1 W. The temperature at the substrate ring is 53.3–55.2 °C, and that at the top metal above the transistor channel is 42.4–46.1 °C. However, neither is the exact temperature in the transistor channel. Back-end layers can cause the infrared thermal method to yield erroneous results owing to the die surface condition and radiative transfer [10]. As a rule of thumb, every 10 °C increase in the channel temperature halves the lifetime of a semiconductor device [11].
The thermal resistance model is a popular method to derive the channel temperature from the case or ambient temperature, but it yields only an average value. The finite element (FE) thermal model simulates the thermal characteristics of high-power modules by using an RC network for the thermal impedance [12]. It is effective for designing heat dissipation in advance, provided that the equivalent RC network is completely characterized. Some on-chip temperature sensors have been proposed to measure the exact temperature of solid-state devices. These sensors are inherently sensitive to temperature. Diodes have been used in analog sensors because their forward voltage depends on temperature through the reverse saturation current IS [13]. However, diodes with an analog-to-digital converter (ADC) are not suitable for high-resolution power IC applications owing to their large layout size, nonlinear characteristics, and large temperature variations in measurements. In [14], a diode is used and an ADC performs the readout. Moreover, the system incurs an area overhead and has to work at over 100 °C. In some cases, bandgap-based voltage reference circuits are used because their voltages are proportional to absolute temperature (PTAT) [15]. A differential temperature sensor is used to determine the efficiency of radio-frequency (RF) linear power amplifiers [16]. In addition, a voltage-to-frequency converter is used to increase the resolution and simplify the calibration method. Apart from the above, ring oscillators are used as temperature sensors in most digital ICs because the delay of their delay lines depends on the device temperature [17,18,19,20,21,22,23]. Because temperature sensors need to be embedded near the channel of the power MOS, ring oscillators and bandgap-based sensors are not suitable for a MOSFET bank: they violate latch-up rules and require too much layout area. Temperature balancing technologies are common in multi-core processors for controlling the performance of each core and for reducing and balancing the temperatures. In [24], a strategy for equalizing the temperature in integrated circuits has been reported for the optimum choice of the stabilized chip temperature. However, the technique is seldom used in power applications.
In this paper, a thermal balancing method with on-chip temperature monitoring is proposed for monolithic power ICs. The method identifies the hottest MOSFET bank and reduces its power dissipation and heat generation. The temperature-sensing device is an on-chip poly-silicon strip. A temperature-to-frequency converter (TFC) is developed to quantize the on-chip temperature as a digital code. To limit the chip area and the number of control bits, the power MOSFET is divided into eight banks so that the channel temperature is monitored at the bank level. An average temperature monitor is proposed because the temperature settles to a steady (DC) value after the chip has been working for a while. Based on the output of the temperature sensor, the hottest device is identified, and its equivalent duty cycle is adjusted accordingly.
The rest of the paper is organized as follows. Section 2 develops an on-chip temperature monitoring technique. Section 3 introduces the pulse-width-modulation (PWM) methodology for temperature balancing. Section 4 details the corresponding circuit implementation and measurements made on a 0.25 μm 1P3M test chip. Finally, Section 5 draws conclusions.
2. Proposed On-Chip Temperature Monitoring
Figure 2 presents the system block diagram of the proposed on-chip thermal balancing architecture, which is composed of two parts. The right part includes a temperature sensor with eight MOSFET banks in parallel, an 8-1 Mux, and an RC oscillator (OSC). Each MOSFET bank has an individual gate control signal and a temperature-sensing poly resistor. The 8-1 Mux selects one resistor for sensing the local temperature and connects it to the OSC. The left part is the thermal balancing controller, which is verified on a field-programmable gate array (FPGA); its principle is introduced in Section 3.
The sensing device in our temperature sensor is a poly-silicon strip, which can be placed next to the drain/source of the MOSFET bank under consideration. Figure 3 presents the eight parallel banks. A poly-silicon strip (thick gray line) is embedded in the middle of each MOSFET bank. Each bank has a total width of 5000 μm (50 μm × 50 fingers × 2 blocks), and the layout of the poly and oxide diffusion (OD) layers is shown in Figure 4. The embedded sensors measure the temperature of each MOSFET bank and reveal the temperature variation between neighboring cells.
The change in resistance of the integrated resistor with temperature is

$$R_T = R_O\left(1 + TC_1\,\Delta T + TC_2\,\Delta T^2\right), \tag{1}$$

where RO is the resistance at room temperature, TC1 and TC2 are the first- and second-order temperature coefficients, and ΔT is the change in temperature. The TSMC design guideline indicates that ΔR/RO equals −9% for a poly resistor as the temperature varies from 25 to 150 °C, whereas TC2 has a small effect on the resistance because TC1 >> TC2 (by more than two orders of magnitude). The simulated temperature coefficient is

$$TC_1 \approx \frac{\Delta R / R_O}{\Delta T} = \frac{-9\%}{(150 - 25)\ ^{\circ}\mathrm{C}} = -0.072\ \%/^{\circ}\mathrm{C}. \tag{2}$$
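Figure 5 shows the detailed circuitry of the proposed TFC, which is an astable multivibrator in which RT is the poly resistor of the poly-silicon temperature sensor and a capacitor (C) is added to adjust the frequency of the OSC. The OSC includes two rail-to-rail comparators and a set-reset (SR) latch. With comparator thresholds Vlow and Vhigh and supply voltage Vdd, the oscillation period is as follows.

$$\tau = R_T C\left[\ln\frac{V_{dd}-V_{low}}{V_{dd}-V_{high}} + \ln\frac{V_{high}}{V_{low}}\right] \tag{3}$$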
For example, with Vdd = 2.5 V, Vlow = 0.15 V, and Vhigh = 2.35 V, τ = 5.5 × RT × C and f = 1/τ. The design parameters are set as follows. The poly-silicon strip is 2 μm wide and 480 μm long with an equivalent resistance of 41 kΩ, and C is 8 pF. According to Equation (3), the oscillation period is 1.8 μs, and the oscillation frequency is 554 kHz.
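The numbers in this example can be checked with a short script. The sketch below is only illustrative: it assumes the textbook period of an RC relaxation oscillator that charges toward Vdd and discharges toward ground between two comparator thresholds, which reproduces the factor of 5.5 quoted above, and it models the resistor with the first-order coefficient of −0.072%/°C. The function names and the 100 °C operating point are hypothetical.

```python
import math

def poly_resistance(r0_ohm, delta_t_c, tc1_per_c=-0.00072, tc2_per_c2=0.0):
    """Second-order resistor model R = R0 * (1 + TC1*dT + TC2*dT^2)."""
    return r0_ohm * (1.0 + tc1_per_c * delta_t_c + tc2_per_c2 * delta_t_c ** 2)

def tfc_period(r_ohm, c_farad, vdd=2.5, v_low=0.15, v_high=2.35):
    """Period of a two-threshold RC relaxation oscillator (assumed form)."""
    charge = math.log((vdd - v_low) / (vdd - v_high))  # V_low -> V_high, toward Vdd
    discharge = math.log(v_high / v_low)               # V_high -> V_low, toward 0 V
    return r_ohm * c_farad * (charge + discharge)

# Design point from the text: 41 kOhm strip, 8 pF capacitor.
period_25c = tfc_period(41e3, 8e-12)   # about 1.8 us
freq_25c = 1.0 / period_25c            # about 554 kHz

# Hypothetical hot case: 75 degC above room temperature lowers R and raises f.
freq_100c = 1.0 / tfc_period(poly_resistance(41e3, 75.0), 8e-12)

print(f"period = {period_25c * 1e6:.3f} us, f(25C) = {freq_25c / 1e3:.1f} kHz")
print(f"f(100C) = {freq_100c / 1e3:.1f} kHz")
```

Because the resistor has a negative temperature coefficient, the computed frequency rises with temperature, which is the property the TFC relies on.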
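Based on Equation (1), the frequency or the period can be measured to determine RT if C is known. The measuring time determines the accuracy of the temperature measurement, which can be derived as follows. Suppose that τmin and τmax are the minimal and maximal periods, according to Equation (1), at the maximal and minimal operating temperatures, Tmax and Tmin. N is the resolution, i.e., the number of temperature levels between Tmax and Tmin. The counts accumulated at Tmax and Tmin must differ by at least N, so the measuring time τmeas must satisfy

$$\frac{\tau_{meas}}{\tau_{min}} - \frac{\tau_{meas}}{\tau_{max}} \ge N \quad\Longrightarrow\quad \tau_{meas} \ge \frac{N\,\tau_{min}\,\tau_{max}}{\tau_{max}-\tau_{min}}. \tag{4}$$

If τmax and τmin equal (1 ± x)·τo, where τo is the average oscillation period and x is the maximal deviation from the average, then

$$\tau_{meas} \ge \frac{N\,\tau_o\,(1-x^2)}{2x}. \tag{5}$$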
For example, if τo is 1 μs and x is 5%, then τmin = 0.95 μs and τmax = 1.05 μs. For a resolution of N = 1000 and a fixed τmeas = 10 ms, the maximal count τmeas/τmin is 10,526 and the minimal count τmeas/τmax is 9524. In this case, a 14-bit counter is required to hold the maximal and minimal counts of the measured frequency.
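The counting example can be reproduced numerically. The following sketch assumes that the resolution requirement means the counts at the two temperature extremes must differ by at least N; the helper names are hypothetical and are not part of the FPGA implementation.

```python
def min_measure_time(n_levels, tau_avg, x):
    """Shortest gate time for which the extreme counts differ by n_levels."""
    tau_min = (1 - x) * tau_avg
    tau_max = (1 + x) * tau_avg
    return n_levels * tau_min * tau_max / (tau_max - tau_min)

def count_range(tau_meas, tau_avg, x):
    """Minimal and maximal counts accumulated during tau_meas."""
    max_count = round(tau_meas / ((1 - x) * tau_avg))
    min_count = round(tau_meas / ((1 + x) * tau_avg))
    return min_count, max_count

def counter_bits(max_count):
    """Number of bits needed to hold max_count without overflow."""
    return max(1, int(max_count).bit_length())

t_required = min_measure_time(1000, 1e-6, 0.05)   # about 9.975 ms
lo_count, hi_count = count_range(10e-3, 1e-6, 0.05)

print(f"required gate time = {t_required * 1e3:.3f} ms")
print(f"counts: {lo_count} .. {hi_count}, bits = {counter_bits(hi_count)}")
```

Under this assumption, the 10 ms gate time used in the text sits just above the computed lower bound, and the largest count fits in 14 bits.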
Unfortunately, C, RT, and x are difficult to predict precisely. Process variation is another critical challenge in temperature sensing. As shown in Figure 5, all of the temperature-sensing resistors (RT0–7) share the same capacitor, comparators, SR latch, and inverter in the TFC circuit. Ideally, the oscillation frequency is determined only by RT. Suppose that MOSFET banks Mi and Mj have temperatures Ti and Tj, respectively. Because the resistor has a negative temperature coefficient, an oscillation frequency fi higher than fj implies that Ti is higher than Tj. Since the oscillation frequency depends on the sensing resistor RT and the capacitor C, a target frequency range can be designed through the sizes of the poly-silicon strip and the capacitor. The sensing resistor is designed to be relatively large compared with the parasitic resistance RP of the interconnections and TFC switches, so we assume that RT >> RP. In addition, the temperature sensitivity of the sensing resistor is larger than that of the parasitic resistance and the capacitor. The frequency change is therefore mainly determined by the sensing resistor, and the variation of the parasitic resistances and the capacitor can be ignored in the TFC circuit. However, process variation and circuit mismatch in the metal routing paths, switching devices, and sensing resistors affect the results even at a fixed temperature. Consequently, a mechanism to calibrate process variation and mismatch is developed.
3. Calibrations and Thermal Balancing Methodology
Since the poly resistor determines the frequency of the TFC as a function of temperature, the channel temperature of each self-heating power MOS bank can be balanced by a pulse-width-modulation (PWM) methodology. Fi is the frequency count (number of oscillation cycles) of bank i recorded by the frequency count module (FREQCount), and the calibration module (CALIB) overcomes accuracy problems related to process variation and circuit mismatch. MAXSearch finds the MOSFET bank with the highest temperature, whereas PWMMod reduces the number of PWM pulses of the hottest bank to lower its temperature. Figure 6 shows the flowchart of the proposed thermal balancing method. The loop keeps iterating as long as power and the PWM signal are provided.
Circuit mismatch is suppressed because the TFC uses a single oscillator shared by the eight sensing resistors, and the subtractor in the calibration module is expected to cancel the process mismatch. In the calibration module, ΔFi is the change in the frequency count, defined as ΔFi = Fi − CFi, where CFi is the frequency count of the ith MOSFET bank at calibration time, when the power devices are not operating.
Figure 7 displays the block diagram of the calibration module in Figure 2. It consists of two register files, CF0–7 and F0–7. CF0–7 store the calibration frequency counts. In calibration mode (CALIB = 1), the power MOS is turned off, the sensing resistors are selected sequentially, and the frequency count of the TFC is recorded in CFi. In the operating and measuring (MEAS) mode (CALIB = 0), the frequency is counted and recorded as Fi. Finally, CFi is subtracted from Fi, yielding ΔFi for comparison of the channel temperatures. For example, with a counting time of 1 ms, the counts of the oscillator through sensing resistor RT2 are stored at T = 25 °C and at T = 100 °C (after heating). The results are subtracted, F2,100 °C − CF2,25 °C, yielding the temperature level in transistor bank 2 (TX2).
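The subtraction performed by the calibration module can be illustrated in a few lines. This is a minimal sketch rather than the FPGA logic itself: the two lists stand in for the register files CF0–7 and F0–7, and all count values are hypothetical, chosen only to show how per-bank offsets cancel.

```python
def delta_counts(measured, calibration):
    """Return dF_i = F_i - CF_i for each bank."""
    if len(measured) != len(calibration):
        raise ValueError("one calibration count is needed per bank")
    return [f - cf for f, cf in zip(measured, calibration)]

# Hypothetical counts for eight banks (not measured data).
CF = [42350, 42410, 42290, 42380, 42330, 42400, 42360, 42310]  # CALIB = 1, power MOS off
F = [44890, 44860, 44800, 44790, 44770, 44880, 44900, 44905]   # CALIB = 0, power MOS on

dF = delta_counts(F, CF)
print(dF)
```

Comparing the raw counts in F would mix the thermal signal with the mismatch already present in CF, whereas dF depends only on the change since calibration.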
The hottest bank is defined as the one with the largest ΔFi. After the hottest device has been identified, a PWM modification scheme is applied to that bank. A PWM signal is a repetitive pulse stream that periodically turns on the devices. To reduce the temperature of the hottest bank, one pulse is suppressed in every K pulses, as shown in Figure 8, where TX2 is assumed to be the hottest bank. The gate control for TX2 has an effective duty cycle of 3/4 × D if one out of four pulses is suppressed (K = 4). Therefore, the power dissipation of TX2 falls to 3/4 of its original value, reducing the temperature.
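The search for the hottest bank and the pulse suppression can be sketched as follows. The function names, the example counts, and the choice of which pulse in each group is dropped (here the last one) are assumptions made for illustration; the text only specifies that one pulse in every group is suppressed.

```python
def hottest_bank(delta_counts):
    """Index of the bank with the largest calibrated count (first on ties)."""
    return max(range(len(delta_counts)), key=delta_counts.__getitem__)

def gate_pulses(n_pulses, suppress_every=None):
    """Pulse mask: 1 = pulse passed to the gate, 0 = pulse suppressed."""
    if suppress_every is None:
        return [1] * n_pulses
    return [0 if (i + 1) % suppress_every == 0 else 1 for i in range(n_pulses)]

def effective_duty(pulses, duty):
    """Average duty cycle seen by a bank over the pulse mask."""
    return duty * sum(pulses) / len(pulses)

# Hypothetical calibrated counts in which TX2 is the hottest bank.
example_dF = [2450, 2480, 2600, 2470, 2410, 2440, 2500, 2520]
hot = hottest_bank(example_dF)

mask = gate_pulses(8, suppress_every=4)   # one pulse out of four removed
print(hot, mask, effective_duty(mask, 0.25))
```

With one pulse in four removed, the mask passes six of eight pulses, so the effective duty cycle is 3/4 of the applied value.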
However, decreasing the effective duty cycle of TX2 reduces the overall power output. In a closed-loop system, the feedback mechanism increases the duty cycle to maintain the overall power output level. Suppose that N MOSFET banks are in parallel, that one pulse in every K is suppressed for the hottest bank, and that the original and final duty cycles are D0 and D1, respectively. The overall duty of all N devices, or the total output current, must be the same before and after PWM modification. Therefore,

$$N D_0 = (N-1)\,D_1 + \frac{K-1}{K}\,D_1, \tag{7}$$

$$D_1 = \frac{K N}{K N - 1}\,D_0. \tag{8}$$
The left- and right-hand sides of Equation (7) are the accumulated duty of all devices before and after PWM modification. According to Equation (8), the adjusted duty cycle is greater than the original. Consider, for example, K = 4, N = 8, and D0 = 0.25; the adjusted duty cycle is D1 = 0.258, an increase of 0.8 percentage points after PWM modification. Accordingly, TX2 has an effective duty cycle of 0.194 (3/4 × 0.258), which is about 77% of the original 0.25. The remaining seven banks share the suppressed pulses of TX2, each with a duty cycle increased by 0.8 percentage points.
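The duty-cycle compensation can be verified with a small calculation. The sketch assumes the balance condition stated in the text, namely that the accumulated duty of all N banks is unchanged when the hottest bank loses one pulse in every K; the function name is hypothetical.

```python
def adjusted_duty(d0, n_banks, k):
    """D1 that keeps the accumulated duty constant:
    N*D0 = (N-1)*D1 + ((K-1)/K)*D1  =>  D1 = K*N*D0 / (K*N - 1)."""
    return k * n_banks * d0 / (k * n_banks - 1)

N, K, D0 = 8, 4, 0.25
d1 = adjusted_duty(D0, N, K)            # 8/31, about 0.258
d_hot = (K - 1) / K * d1                # effective duty of the hottest bank
total_after = (N - 1) * d1 + d_hot      # accumulated duty after modification

print(f"D1 = {d1:.4f}, hottest bank = {d_hot:.4f}, total = {total_after:.4f}")
```

The accumulated duty after modification equals N × D0 = 2.0, so the total output is preserved while the hottest bank conducts for a smaller share of the time.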
4. Experimental Results and Discussion
To demonstrate the feasibility of the proposed methodology, a test power chip and a test board are used. The on-chip temperature sensor is implemented in a TSMC 0.25 μm BCD process. Figure 9 shows the die photograph of the test chip, which includes eight LDMOS banks with individual gate controls and a common source and drain, an 8-1 Mux, and an OSC.
The experiment is composed of the following five steps. First, the CAD tools R3D (Silicon Frontline, San Jose, CA, USA) and ANSYS Multiphysics (ANSYS, Canonsburg, PA, USA) are used to simulate the thermal effects of the power MOS transistors and the temperature distribution in the MOSFET banks. Second, a post-layout simulation is performed to verify the TFC conversion mechanism. Third, the chip is put into a lab oven to validate and calibrate the temperature coefficient of the RC OSC. Fourth, the CALIB and MEAS modes are run to count the oscillation frequencies at 0–1.5 W power dissipation. Finally, PWMMod reduces the effective duty cycle of the hottest device. The results confirm that the proposed algorithm reduces the peak temperature and the temperature variation among the eight MOSFET banks.
4.1. Power and Thermal Simulation
Based on the power MOS layout in Figure 10a, Figure 10b–d show the simulated IR drop, power dissipation, and temperature distribution of the power MOS, obtained using R3D for a power dissipation of 1.5 W (ID = 2.14 A, VDS = 0.7 V at 25 °C). They show the power density in W/μm (Figure 10b), the IR drop of the drain voltage distribution (Figure 10c), and the source voltage distribution in the top metal layer (Figure 10d); the overall temperature distribution in silicon (without metal consideration) is also shown. These figures indicate that the maximal power dissipation mismatch is 6.7%. Without metal wiring, the 1.5 W power dissipation is distributed uniformly among the eight parallel MOSFET banks. The distribution of power dissipation obtained by R3D is fed into ANSYS to conduct a thermal analysis.
Figure 11 shows the results of the ANSYS simulation based on this distribution of power dissipation. The maximal temperature difference between the banks is 3.4 °C, where TX1 is the hottest bank and TX4 is the coolest bank. Figure 10 shows a hot spot with a common-centroid distribution, which can theoretically be achieved by a symmetrical layout and placement of the power devices. However, the temperature distribution is shaped by many factors such as power dissipation, heat source locations, and the areas of thermal sink devices [24]. In practical cases, the power MOSFET operates under non-ideal conditions that involve turn-on uniformity, bonding wire consistency, and chip and package symmetry. Therefore, the hot spot shifts to the left side of the chip in Figure 11 due to the location of the power MOS.
Based on the 3D simulation results shown in Figure 11, we can predict the expected temperature imbalances to determine the required sensitivity and resolution of the implemented sensors. The simulation gives the temperature distribution under boundary conditions of specific power dissipations, volumes, and material properties. The power dissipation of the banks, the locations of the heat sources, and the sizes of the thermal sink components can be changed to make the analysis more accurate; the routing metal, bonding wires, packaging material, and test printed circuit board (PCB) are all part of the thermal sink. However, these boundary-condition factors would complicate the analysis without adding new information about the expected temperature imbalances.
4.2. Temperature Coefficient Extraction
To verify the RC OSC and the temperature coefficient of the sensing poly resistors (RT), the test chip is placed in a temperature-controlled lab oven. The oscillation frequency is 508.2 kHz when Vhigh = 2.35 V and Vlow = 0.15 V. Table 1 shows the frequency counts Fi and the count increases ΔFi, measured using the FPGA board. The first row presents the set temperatures. The second row presents the frequency counts obtained by FREQCount and CALIB on the FPGA. The third row is the calibration result (ΔFT = FT − CFT). The temperature coefficient is derived as follows.
From Equations (2) and (3), it can be inferred that the frequency depends on the resistance as follows: (Fi − CFi)/CFi ≈ −(RTi − ROi)/ROi. The measured temperature coefficient TC1M is −0.0716%/°C and the simulated temperature coefficient TC1S is −0.072%/°C. Since the measurement time is 83.34 ms, the corresponding TC in frequency terms is 396 Hz/°C (the count-based TC divided by 83.34 ms). The derived TC is a linear approximation of the exponential RC OSC characteristic.
Figure 12 plots the frequency of the eight temperature sensors (vertical axis) as a function of temperature (horizontal axis) before and after the calibration. Figure 12b shows the dispersion between the sensors and their linearity, which fit the baseline of Table 1.
Based on these results, the frequency offsets caused by resistance mismatch can be obtained from Figure 12a. The temperature coefficient TC can be estimated from the calibration in the pre-measurement process, so absolute temperatures can also be obtained after calibration. The probable measurement errors are about 1% (resistance mismatch) for relative errors and ±30% (resistor tolerance) for absolute errors. Because this work only needs to locate the hottest MOSFET bank, absolute temperature information is not necessary. The calibration ΔF compensates for the frequency error by subtraction.
4.3. Experiments and Discussion
Figure 13 shows the experimental setup for channel temperature measurement. In this experiment, measurements were made by forcing continuously dissipated power into the power MOS using an Agilent N6705B source meter. The FPGA board recorded the frequency counts. Table 2 lists the measured temperatures based on TC = 33 counts/°C; the counts after calibration are shown in parentheses. Across the banks, the maximal and minimal temperatures are 102.7 °C and 93.1 °C, whereas ANSYS simulations yield corresponding values of 98.55 °C and 95.34 °C. The maximal measured temperature difference is 9.6 °C (TX7 vs. TX4). The proposed method is accurate because it was calibrated using a lab oven.
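The conversion from calibrated counts to temperature can be sketched as a linear mapping. The sketch assumes a calibration temperature of 25 °C (the room-temperature value used in the earlier example) and uses the count-based coefficient of 33 counts/°C with the 83.34 ms measurement time; the function names and the example count are hypothetical.

```python
def counts_to_temperature(delta_count, tc_counts_per_c=33.0, t_cal_c=25.0):
    """Linear estimate T = T_cal + dF / TC (t_cal_c is an assumed value)."""
    return t_cal_c + delta_count / tc_counts_per_c

def tc_in_hz(tc_counts_per_c=33.0, gate_time_s=83.34e-3):
    """Express the count-based coefficient as a frequency coefficient."""
    return tc_counts_per_c / gate_time_s

print(f"TC = {tc_in_hz():.1f} Hz/degC")
# Hypothetical calibrated count of 2564 for one bank.
print(f"T = {counts_to_temperature(2564):.1f} degC")
```

Dividing 33 counts/°C by the 83.34 ms gate time gives approximately 396 Hz/°C, which matches the frequency coefficient quoted in Section 4.2.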
4.4. Thermal Balancing by PWM Modification
To verify the effectiveness of the thermal balancing method, the following experiment was conducted. The LDMOS was connected to an N6705B DC Power Analyzer (Keysight, Santa Rosa, CA, USA) to force a dissipation of PD = 0–1.5 W, and each gate was controlled by a 50% duty signal. To leave room of one pulse for PWM modification, one pulse out of eight was suppressed, as shown in Figure 14a, so that the other seven pulses turn the bank on. Table 3 lists the temperature measurement results before thermal balancing. At a power dissipation of 0–0.5 W, the measurement results for the banks in Table 3 were close to each other, whereas a significant temperature difference of 9.5 °C appeared at PD = 1.5 W.
Since TX7 and TX0 were the hottest banks at PD = 1.5 W, their effective duty cycle was reduced from 7/8 × 50% to 1/2 × 50% by PWMMod (one of every two pulses was suppressed). To maintain the overall duty cycle, the duty cycles of the cooler banks TX1–6 were increased to 8/8 × 50% (no pulse suppressed). Figure 14b plots each gate signal after PWM modification. The original active duty cycles are all 7/8 × 50%, as shown in Figure 14a. Moreover, to maintain equal electrical characteristics of the power MOS, seven out of eight banks are turned on during each cycle.
Figure 15 shows the temperature profiles with and without PWM modification. The temperatures of TX0 and TX7 are reduced, whereas the temperatures of TX1–6 are increased, because PWMMod raises the duty cycle of TX1–6 from 7/8 × 50% to 8/8 × 50% by no longer suppressing any pulse. Therefore, the original temperature difference of 9.5 °C is reduced to 2.8 °C (99.2–96.4 °C).
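The duty-cycle bookkeeping of this experiment can be checked with exact fractions. The sketch simply sums the per-bank effective duty cycles stated in the text before and after PWM modification; the variable names are illustrative.

```python
from fractions import Fraction

BASE_DUTY = Fraction(1, 2)  # 50 % gate signal applied to every bank

def total_duty(pulse_fractions):
    """Accumulated effective duty over all banks."""
    return sum(f * BASE_DUTY for f in pulse_fractions)

# Before balancing: every bank passes 7 of 8 pulses.
before = [Fraction(7, 8)] * 8
# After balancing: TX0 and TX7 pass 1 of 2 pulses, TX1-6 pass all pulses.
after = [Fraction(1, 2)] + [Fraction(1)] * 6 + [Fraction(1, 2)]

print(total_duty(before), total_duty(after))
```

Both sums equal 7/2, so the accumulated duty of the eight banks is unchanged by the modification, consistent with the requirement to maintain the overall duty cycle.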
4.5. Lifetime Improvement
This paper shows that the temperatures of the eight banks of the power MOS can be equalized. Based on the studies in [25,26], the lifetime can be predicted according to the temperature acceleration model as follows.

$$AF_T = \exp\left[\frac{E_a}{k}\left(\frac{1}{T_{OP}} - \frac{1}{T_S}\right)\right]$$

where AFT is the temperature acceleration factor, Ea is the activation energy, which is assumed to be 0.7 eV for the silicon junction defect, and k is the Boltzmann constant, 8.617 × 10⁻⁵ eV/K; temperatures are expressed in kelvin. A lifetime test for reliability requires 1000 h (0.34 y) at an operating channel temperature of TOP = 150 °C. Herein, TS is the stress temperature, at which the lifetime is obtained.
Table 4 presents the estimated lifetime based on Figure 15. The worst bank determines the lifetime of the device. Therefore, the lifetime of the power MOS can be improved by 20%, as the balancing method changes the worst bank from TX7 to TX1.
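The sensitivity of lifetime to the worst-bank temperature can be illustrated with the Arrhenius relation. The sketch below assumes the standard Arrhenius form with Ea = 0.7 eV and uses two hypothetical worst-bank temperatures 3.5 °C apart; it does not reproduce the entries of Table 4.

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K

def acceleration_factor(t_low_c, t_high_c, ea_ev=0.7):
    """Arrhenius ratio of the lifetime at t_low_c to the lifetime at t_high_c."""
    t_low_k = t_low_c + 273.15
    t_high_k = t_high_c + 273.15
    return math.exp(ea_ev / K_BOLTZMANN_EV * (1.0 / t_low_k - 1.0 / t_high_k))

# Hypothetical worst-bank temperatures after and before balancing.
gain = acceleration_factor(100.0, 103.5)
print(f"lifetime ratio = {gain:.2f}")
```

Under these assumptions, lowering the worst-bank temperature by a few degrees near 100 °C lengthens the estimated lifetime by roughly 20%, the same order as the improvement reported above.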