1. Introduction
Metrology and defect inspection are essential steps in the fabrication of semiconductor integrated circuits and nanoscale devices. Several methods are currently used for nanoscale measurement, such as scanning electron microscopy (SEM) [1,2], based on measuring the scattered electrons from electronic excitation, and atomic force microscopy (AFM) [3], based on probe measurement. Although these methods show high measuring accuracy, they are difficult to apply to online real-time detection due to their low measuring efficiency, high cost and destructiveness. Additionally, methods based on optical scattering, though fast and efficient, are unable to achieve accurate nanoscale measurement [4] due to the diffraction limit of the optical imaging system.
Through-focus scanning optical microscopy (TSOM) is a novel, fast and non-destructive micro/nanoscale measurement method based on computational imaging [5,6,7,8,9]. Different from traditional optical measurement, TSOM takes the view that "what you get is not what you see". TSOM captures a series of images, consisting of one focused image and many defocused images of the imaging target, by scanning along the optical axis. The captured images are stacked according to their spatial positions to form an image cube (the TSOM cube), and the middle sectional view of the TSOM cube is taken to generate the TSOM image. The features of the TSOM image are sensitive to nanoscale size changes and nanoscale defects of the measuring target, so the parameters of the measuring target can be retrieved from the TSOM image. By using TSOM, the diffraction limit of the optical imaging system can be avoided, allowing an ordinary wide-field optical microscope to achieve dimensional measurement at the nanoscale. TSOM has been applied in mask-defect detection [10], nanoparticle shape evaluation [11], nanoscale dimensional measurement of three-dimensional structures [12] and overlay analysis [13,14].
Three kinds of TSOM methods are used to measure the nanoscale parameters of the target: the library-matching TSOM method [15,16], the machine-learning TSOM method [17] and the deep-learning TSOM method [18,19].
The library-matching TSOM method is the classical TSOM measurement method [15,16]. In this method, the experimental TSOM image of the target is matched against a simulated TSOM image library established by a simulation algorithm, and the parameters of the target are then retrieved from the best match. The library-matching TSOM method needs considerable time to establish the simulated TSOM image library, and its measuring accuracy is limited by the fineness of the established library.
Qu et al. proposed the machine-learning TSOM method based on TSOM image features [17]. In their method, three image feature extraction algorithms are used to extract TSOM image features: the gray-level co-occurrence matrix (GLCM), the local binary pattern (LBP) and the histogram of oriented gradients (HOG). The extracted TSOM image features are then input into different machine-learning models in different combinations for training and measuring. They obtain more accurate measurement results than the library-matching method. However, due to the limitations of the feature extraction algorithms, the machine-learning TSOM method is still not accurate enough.
Deep learning has developed rapidly in recent years and has been widely used in various fields, such as image recognition [20,21], biomedical sciences [22,23] and computational imaging [24]. The earliest deep-learning TSOM method, proposed by Cho et al., is based on the idea of image classification [18]. Setting different parameter sizes as the classification categories, this method uses a single-column convolutional neural network (CNN) to calculate the probability of the TSOM image being assigned to each category; the measurement result is then obtained as the weighted average over all categories. The method is simple and convenient, but the measurement range is small and the accuracy is limited. Nie et al. used ResNet50 and DenseNet121 models for TSOM measurement [19]. They achieve a larger range of nanoscale measurement than Cho's, but, being a classification method, its accuracy is still limited.
The common deficiency of the above three kinds of TSOM measurement methods is that only a single TSOM image is used to retrieve the measurement parameters. That means only one sectional view of the entire TSOM cube is used, which leads to low utilization of the image information and, in turn, to the limited accuracy of these methods. Joo et al. studied the relationship of the TSOM height and TSOM volume with the defects of nanodevices, indicating that there is a regression relationship between the entire TSOM cube and the parameters of the measuring target [25]. This conclusion indicates that the parameters of the measuring target can be retrieved more accurately by inputting more images of the TSOM cube rather than only a sectional view. Since only the defocus distance changes between the images that constitute the TSOM cube, and the focused image has the highest definition and contains relatively the most information within the cube, the focused image is the most suitable supplementary input alongside the TSOM image.
On this basis, a two-input deep-learning TSOM method based on a CNN is proposed in this paper. The two-input CNN uses the focused image and the TSOM image as its two inputs (instead of a single TSOM image, as in previous TSOM methods) and the measured parameters of the target as its output, making fuller use of the effective information in the TSOM cube. Additionally, a regression model is used so that the measurement accuracy is not limited by classification categories.
The structure of the rest of this paper is as follows: the second part introduces the materials and methods, including the overall process, the experimental devices, the acquisition of the dataset, the structure of the proposed two-input CNN and how it is trained. The third part introduces our experiment, including the model-evaluation indices, the measurement results and analysis, and the uncertainties of the model. The fourth part presents our discussion, mainly concerning the effect of focusing-position error and the influence of lateral shifts of the measuring target. The fifth part presents our conclusions.
2. Materials and Methods
The process of the two-input deep-learning TSOM method is shown in Figure 1. Firstly, the TSOM imaging system is built to capture experimental images. Secondly, by scanning along the optical axis, a series of images of the measuring target and of the background of the imaging region are obtained. After the background images are subtracted from the target images, a series of images of the target with the background noise removed is obtained, consisting of one focused image and many defocused images. These images are stacked according to their spatial positions to form an image data cube (the TSOM cube). Then, the image definition of all images in the TSOM cube is evaluated to find the best focused image. At the same time, a sectional view of the TSOM cube is extracted and processed by smoothing, interpolation and pseudo-color methods to generate the TSOM image. Next, the obtained focused images and TSOM images are input into the two-input CNN for training and testing. Finally, the network outputs the measured parameters of the corresponding target.
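As a minimal illustration of this pipeline, the following Python/NumPy sketch shows the cube-forming and section-extraction steps (the array names, shapes and helper functions are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def build_tsom_cube(target_stack, background_stack):
    """Subtract the background frames from the target frames and stack the
    result along the scan axis to form the TSOM cube."""
    cube = np.asarray(target_stack, float) - np.asarray(background_stack, float)
    return cube                       # shape: (n_scan_positions, H, W)

def middle_section(cube):
    """Middle transverse sectional view of the cube: one pixel row through
    the target center from every frame, stacked over the scan positions."""
    mid_row = cube.shape[1] // 2
    return cube[:, mid_row, :]        # raw material for the TSOM image
```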
2.1. TSOM Imaging System
The experimental images are acquired by the TSOM imaging system shown in Figure 2. The system is divided into two parts: the illumination system (the red light path in Figure 2a) and the imaging system (the blue light path in Figure 2a). In the illumination system, an LED with a wavelength of 520 nm is used as the light source. The light emitted by the LED passes through an objective lens (OL1), an aperture, a lens (L1), a polarizer, the field diaphragm (FD), a lens (L2) and the beam-splitting prism, and then converges on the conjugate rear focal plane of the objective lens (OL2). The light passing through OL2 irradiates the nanostructured target, achieving Köhler illumination. The imaging system consists of the objective lens (OL2), the beam-splitting prism and a lens (L3), which image the target on a CCD camera. During through-focus scanning, OL2 is driven along the optical axis by the piezoelectric objective locator (PZ) and the piezoelectric positioning mechanism. Background images and target images are captured every 200 nm during the scanning process. After the background images are subtracted from the target images, a series of images of the target without background noise is obtained, consisting of one focused image and many defocused images.
2.2. The Dataset
The image dataset is captured from a measuring target composed of a series of isolated gold (Au) lines with a length of 100 μm and a height of 100 nm. The measured parameter is the linewidth (LW) of the gold lines. The model of the gold line is shown in Figure 3. The linewidths to be measured range from 247 nm to 1010 nm, with 37 sizes in total. Scanning electron microscopy (SEM) results are used as the truth values of the linewidths of the gold lines.
The gold lines are placed on the displacement table, and the TSOM imaging system built in Section 2.1 is used to obtain a series of images of the lines, including one focused image and many defocused images. The images are then stacked according to their spatial positions to form a three-dimensional cube (the TSOM cube), as shown by the yellow cube in Figure 4.
Each image forming the TSOM cube, as shown in Figure 5, has a size of 89 × 89 pixels, and the gold line to be measured is located in the middle of each image. Images closer to the focused position have higher definition, and the definition gradually decreases as the defocus distance increases.
From each TSOM cube, one focused image and one TSOM image are extracted to construct the dataset. There are 10 identical gold lines for each linewidth (370 gold lines in total). For each linewidth, 1000 different positions on the 10 identical gold lines are selected randomly to collect 1000 TSOM cubes; therefore, 1000 focused images and 1000 TSOM images are captured for each linewidth. The entire dataset consists of 37,000 focused images and 37,000 TSOM images. Each TSOM image corresponds to one focused image, and both are labeled with the corresponding linewidth. The dataset is divided into a training set, a validation set and a test set in a 3:1:1 ratio.
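A minimal sketch of the 3:1:1 split (assuming a simple random split over the 37,000 paired images; whether the split is stratified by linewidth is not specified in the text):

```python
import numpy as np

rng = np.random.default_rng(seed=0)                   # fixed seed for reproducibility
n_samples = 37_000                                    # paired (focused, TSOM) images
idx = rng.permutation(n_samples)
n_train, n_val = 3 * n_samples // 5, n_samples // 5   # 3:1:1 ratio
train_idx = idx[:n_train]
val_idx = idx[n_train:n_train + n_val]
test_idx = idx[n_train + n_val:]
```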
2.2.1. Focused Image
In order to obtain the best focused image from the TSOM cube, it is necessary to evaluate the image definition of the images within the TSOM cube, as Figure 6 shows. The image with the highest definition is selected as the focused image of the cube for constructing the dataset.
The general methods used for image focusing and definition evaluation [26] include the Fourier transform, gradient energy maximization, high-pass filtering, histogram entropy, local change histogram and normalized variance, etc. This paper uses the normalized variance as the focusing evaluation function to evaluate the sharpness of the image sequence, as it reduces the influence of noise. The expression of the normalized variance is shown in Equation (1); generally, the higher the definition of the image, the larger the value of the focusing evaluation function.

$$F_{\mathrm{NV}} = \frac{1}{H \cdot W \cdot \mu} \sum_{x=1}^{W} \sum_{y=1}^{H} \bigl( f(x,y) - \mu \bigr)^{2} \qquad (1)$$

where H and W are the height and width of the image in pixels, f(x,y) is the brightness value of the pixel at coordinate (x,y) and μ is the average brightness value of all pixels of the image.
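A direct transcription of Equation (1) in Python/NumPy might look as follows (a sketch; `cube` is assumed to be an array of shape (n_scan, H, W)). Note that taking the global argmax also resolves the two-peak case discussed below, since it returns the position of the larger peak:

```python
import numpy as np

def normalized_variance(img):
    """Focusing evaluation function of Equation (1)."""
    img = np.asarray(img, float)
    H, W = img.shape
    mu = img.mean()
    return ((img - mu) ** 2).sum() / (H * W * mu)

def best_focused_index(cube):
    """Index of the sharpest frame in the cube; the global argmax also
    picks the larger peak when the curve has two peaks (Figure 7c,d)."""
    scores = [normalized_variance(frame) for frame in cube]
    return int(np.argmax(scores))
```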
The focusing evaluation function values are calculated one by one for each series of images corresponding to the different linewidths, and the image corresponding to the peak value of the function is selected as the focused image of the series. The focusing evaluation function curves of the image series corresponding to gold lines with linewidths of 247 nm, 298 nm, 625 nm and 768 nm are shown in Figure 7. Taking the evaluation result of the gold line with a 247 nm linewidth as an example, the peak position of the focusing evaluation function shows that image no. 96 has the highest definition, so it is selected as the best focused image of the series.
It should be noted that the focusing evaluation function curve may show two peaks, due to the uneven height of the measuring target and the non-ideal illumination system [5,27], as in the function curves of the image series corresponding to the gold lines with linewidths of 625 nm and 768 nm shown in Figure 7c,d. In this case, the larger of the two peaks is selected as the best focusing position.
Finally, a total of 37,000 focused images are obtained. Each focused image, as shown in Figure 5a (Section 2.2), has a size of 89 × 89 pixels, with the gold line to be measured located in the middle of the image.
2.2.2. TSOM Image
The other component of the dataset is the TSOM image. The TSOM image is obtained from the transverse sectional view of the TSOM cube; its generation process is shown in Figure 8. Taking the best focused image obtained in Section 2.2.1 as the center, 44 images are taken on each side to form a small TSOM cube (89 images in total: one focused image and 88 defocused images), and the transverse sectional view through the middle of the cube is extracted and spliced as shown in Figure 8a. Finally, the TSOM image is obtained by interpolation, smoothing and pseudo-color processing of the spliced sectional view. Each line of the TSOM image represents the light intensity information of one defocused image or of the focused image.
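The generation step can be sketched as follows (the filter type, sigma and interpolation method are assumptions; pseudo-color is a display-side mapping, e.g. a matplotlib colormap, and is not shown):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def make_tsom_image(cube, focus_idx, half_range=44, out_width=89):
    """Crop 44 frames on each side of the focused frame, take the middle
    transverse section and post-process it into the 89 x 89 TSOM image."""
    small = cube[focus_idx - half_range:focus_idx + half_range + 1]   # 89 frames
    mid_row = small.shape[1] // 2
    section = small[:, mid_row, :]                      # one intensity row per frame
    section = zoom(section, (1.0, out_width / section.shape[1]))  # interpolation
    return gaussian_filter(section, sigma=1.0)          # smoothing (sigma assumed)
```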
Finally, 37,000 TSOM images are obtained, each with a size of 89 × 89 pixels as shown in Figure 9, and each TSOM image corresponds to one focused image.
2.3. The Structure of the Two-Input CNN
The structure of the two-input CNN proposed in this paper for TSOM measurement (hereafter named Focus & TSOM-CNN) is shown in Figure 10. From left to right, Focus & TSOM-CNN includes an input part, a processing part for the TSOM images and the focused images, and a feature-merging and output part.
The left area of the network mainly contains the input layer, which is used to input the TSOM images and the focused images into the network. The two kinds of images are 89 × 89 pixels in size.
The middle area of the network, as Figure 10 shows, processes the two types of input images; the upper part is the processing network for TSOM images. Since TSOM images contain relatively more effective information, the TSOM image processing network is constructed with reference to MCNN [28], which integrates features of different scales in order to fully extract the features of the input image. After the TSOM image is input, three columns of convolution channels process it in parallel, using convolution kernels of 3 × 3, 5 × 5 and 7 × 7 pixels to extract features of different scales. Considering the small pixel size of TSOM images, three convolution layers with a stride of 1 are set in the first (3 × 3) convolutional channel, while two convolution layers with a stride of 2 are set in each of the other two channels. All three channels use zero padding in the convolution process. In all three channels, in order to retain the original characteristics, max pooling is used and the activation function is the ReLU function, $f(x) = \max(0, x)$. After feature extraction in the three convolution channels, the first (3 × 3) channel yields features of size 23 × 23 × 96 and the other two yield features of size 6 × 6 × 96; the 6 × 6 × 96 features are enlarged to 23 × 23 × 96 by zero padding. Then, all the features are merged into the TSOM image features of size 23 × 23 × 288.
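Under the stated kernel sizes, strides and output shapes, the TSOM branch can be sketched in PyTorch as follows; the per-layer channel counts and exact pooling positions are not given in the text and are chosen here so that the outputs match 23 × 23 × 96 and 6 × 6 × 96 (a sketch, not the authors' implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TSOMBranch(nn.Module):
    """Three-column, multi-scale branch for the 89 x 89 TSOM image (MCNN-style)."""
    def __init__(self, in_ch=1):             # input channel count assumed
        super().__init__()
        # Column 1: three 3x3 convs (stride 1) with two pooling stages -> 23x23x96.
        self.col3 = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=1, padding=1), nn.ReLU(),
            nn.MaxPool2d(2, ceil_mode=True),               # 89 -> 45
            nn.Conv2d(32, 64, 3, stride=1, padding=1), nn.ReLU(),
            nn.MaxPool2d(2, ceil_mode=True),               # 45 -> 23
            nn.Conv2d(64, 96, 3, stride=1, padding=1), nn.ReLU(),
        )
        # Columns 2 and 3: two convs (stride 2) with pooling -> 6x6x96.
        def column(k):
            return nn.Sequential(
                nn.Conv2d(in_ch, 48, k, stride=2, padding=k // 2), nn.ReLU(),  # 89 -> 45
                nn.MaxPool2d(2, ceil_mode=True),           # 45 -> 23
                nn.Conv2d(48, 96, k, stride=2, padding=k // 2), nn.ReLU(),     # 23 -> 12
                nn.MaxPool2d(2, ceil_mode=True),           # 12 -> 6
            )
        self.col5, self.col7 = column(5), column(7)

    def forward(self, x):
        f3 = self.col3(x)                                  # (N, 96, 23, 23)
        f5 = F.pad(self.col5(x), (8, 9, 8, 9))             # zero-pad 6x6 up to 23x23
        f7 = F.pad(self.col7(x), (8, 9, 8, 9))
        return torch.cat([f3, f5, f7], dim=1)              # (N, 288, 23, 23)
```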
The lower part of the middle area of the network processes the focused image. Considering that the texture of the focused image is relatively simple and its effective information is concentrated in the middle of the image, this part of the network refers to AlexNet [29] and uses a single-column convolution channel. The focused image features are extracted through three convolution layers and two pooling layers. The size of the convolution kernel is set to 3 × 3 pixels, max pooling is adopted and the activation function is the ReLU function. After this processing, the obtained focused image features have a size of 23 × 23 × 96.
The right area of the network is the feature-merging and output part. After feature extraction of the TSOM images and the focused images is completed, the extracted features of the two types of images are merged into features of size 23 × 23 × 384. The merged features are then flattened and mapped to the measured parameter (linewidth) through two fully connected layers for output. The activation function of the fully connected layers is the ReLU function.
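Combining this with the TSOMBranch sketch above, the focused-image branch and the merging head might look as follows (the hidden width of the first fully connected layer and the linear final output are assumptions):

```python
import torch
import torch.nn as nn

class FocusBranch(nn.Module):
    """Single-column branch for the 89 x 89 focused image (AlexNet-style)."""
    def __init__(self, in_ch=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2, ceil_mode=True),               # 89 -> 45
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2, ceil_mode=True),               # 45 -> 23
            nn.Conv2d(64, 96, 3, padding=1), nn.ReLU(),    # -> (N, 96, 23, 23)
        )

    def forward(self, x):
        return self.net(x)

class FocusTSOMCNN(nn.Module):
    """Two-input regression network: merged features -> linewidth in nm."""
    def __init__(self):
        super().__init__()
        self.tsom, self.focus = TSOMBranch(), FocusBranch()
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(384 * 23 * 23, 512), nn.ReLU(),      # hidden width assumed
            nn.Dropout(0.5),                               # drop-out (Section 2.3)
            nn.Linear(512, 1),                             # linear regression output
        )

    def forward(self, tsom_img, focus_img):
        merged = torch.cat([self.tsom(tsom_img), self.focus(focus_img)], dim=1)
        return self.head(merged)                           # (N, 1)
```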
The adaptive moment estimation (Adam) optimizer is used with an initial learning rate of 0.0001. Drop-out is used to avoid overfitting. The images in the datasets are normalized by the zero-mean normalization (Z-score) method before input. The definition of the Z-score is shown in Equation (2):

$$z = \frac{x - \mu}{\sigma} \qquad (2)$$

where $\mu$ is the mean of the data x and $\sigma$ is the standard deviation of the data x.
2.4. Training
Focus & TSOM-CNN is a deep-learning regression model; therefore, the mean square error (MSE) is used as the loss function for training. The definition of MSE is shown in Equation (3):

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \bigl( y_{i} - \hat{y}_{i} \bigr)^{2} \qquad (3)$$

where $y_{i}$ indicates the truth value in each measurement, $\hat{y}_{i}$ indicates the predicted value and n indicates the number of measurements.
The batch size is set to 16 and the model is trained for 200 epochs.
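A condensed training loop reflecting these settings (Adam at a learning rate of 1e-4, Z-score input normalization, MSE loss, batch size 16, 200 epochs) might read as follows; `tsom_train`, `focus_train` and `lw_train` are assumed names for pre-built float tensors of the raw images and linewidth labels:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def z_score(x):
    """Zero-mean normalization of Equation (2); global statistics are used
    here, although per-image statistics would be an equally plausible reading."""
    return (x - x.mean()) / x.std()

train_set = TensorDataset(z_score(tsom_train), z_score(focus_train), lw_train)
train_loader = DataLoader(train_set, batch_size=16, shuffle=True)

model = FocusTSOMCNN()                                    # sketch from Section 2.3
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = torch.nn.MSELoss()                              # Equation (3)

for epoch in range(200):                                  # 200 epochs
    for tsom_img, focus_img, lw in train_loader:
        optimizer.zero_grad()
        pred = model(tsom_img, focus_img).squeeze(1)
        loss = loss_fn(pred, lw)
        loss.backward()
        optimizer.step()
```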
3. Experiment
In order to evaluate the performance of the proposed method after training and testing, our experimental results are compared with those of two other deep-learning TSOM methods used in regression mode: the DenseNet121 model and the ResNet50 model.
3.1. Evaluation Indicators
In this paper, MSE and MAE, which are commonly used for regression models, are used to evaluate the trained models. The definition of MSE is shown in Equation (3) in Section 2.4, and the definition of MAE is shown in Equation (4):

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_{i} - \hat{y}_{i} \right| \qquad (4)$$
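Both indicators are one-liners in NumPy, shown here only to fix the conventions of Equations (3) and (4):

```python
import numpy as np

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)     # Equation (3), in nm^2

def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))    # Equation (4), in nm
```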
3.2. Linewidths Measurement Results
Multiple measurements using Focus & TSOM-CNN are made on the gold-line targets with linewidths in the range of 247–1010 nm. The measurement performance of Focus & TSOM-CNN and of the two other deep-learning TSOM methods (the DenseNet121 and ResNet50 models in regression mode) on the test set is shown in Figure 11, where the abscissa of the scatter plot is the measured linewidth, the ordinate is the absolute error of the prediction ($\left| \hat{y}_{i} - y_{i} \right|$, the absolute value of the difference between the measured value and the true value), and each point represents one measurement. The DenseNet121 and ResNet50 models are trained with the same hyperparameters as reported in Reference [19].
As can be seen from Figure 11, the measuring error of Focus & TSOM-CNN is generally lower than that of the other two regression models, and most of the test errors are less than 15 nm, with only a few exceptions that are potentially related to image noise from data collection. Figure 12 shows several examples for a more intuitive evaluation of the results: the multiple measurements of gold lines with 247 nm, 357 nm, 625 nm and 768 nm linewidths. From these figures, it can be concluded that the measured values cluster around the true values. The experiment shows that the two-input deep-learning TSOM method proposed in this paper is accurate in nanoscale measurement and has good repeatability.
Table 1 presents the MSE and MAE of the measurement results for the 247–1010 nm gold lines obtained by the three regression models on the test set. According to the data in Table 1, the MSE of our two-input deep-learning TSOM method is 5.18 nm² and the MAE is 1.67 nm, both far lower than those of the other two regressive deep-learning TSOM models. This suggests that Focus & TSOM-CNN achieves the highest accuracy in nanoscale measurement.
Because it extracts features automatically and uses a regression model, the two-input deep-learning TSOM method is limited neither by feature extraction algorithms nor by classification categories. Moreover, this method does not require a complex simulation modeling process such as that of the library-matching TSOM method. Therefore, the method performs well in both the precision and the convenience of measurement.
3.3. Uncertainties of the Model
In this section, the model uncertainties are evaluated by repeated training. In the experiment, ten models are obtained through training; the MSE and RMSE of the obtained models on the test set are shown in Figure 13. RMSE is the arithmetic square root of MSE, whose definition is shown in Equation (5):

$$\mathrm{RMSE} = \sqrt{\mathrm{MSE}} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \bigl( y_{i} - \hat{y}_{i} \bigr)^{2}} \qquad (5)$$
The standard deviation of the mean of the RMSE (Equation (6)) is normally used to evaluate the uncertainty of the model:

$$\sigma_{\overline{\mathrm{RMSE}}} = \sqrt{\frac{1}{m(m-1)} \sum_{i=1}^{m} \bigl( \mathrm{RMSE}_{i} - \overline{\mathrm{RMSE}} \bigr)^{2}} \qquad (6)$$

where m is the number of trained models (here, m = 10), $\mathrm{RMSE}_{i}$ is the RMSE of the i-th model and $\overline{\mathrm{RMSE}}$ is their mean. The resulting model uncertainty is shown in Table 2.
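The computation behind Table 2 reduces to a few lines; the RMSE values below are placeholders, not the measured results of Figure 13:

```python
import numpy as np

# Placeholder RMSE values for the ten trained models (nm); substitute the
# measured values plotted in Figure 13.
rmse = np.array([2.1, 2.3, 2.0, 2.4, 2.2, 2.1, 2.5, 2.2, 2.3, 2.0])
mean_rmse = rmse.mean()
u_model = rmse.std(ddof=1) / np.sqrt(rmse.size)   # Equation (6)
```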
Many factors affect the uncertainties of deep-learning TSOM measurement, including the model, the parameters of the imaging system, the uncertainty of the truth values, the imaging noise, etc. In practical experiments, some methods can be used to reduce the measurement uncertainties, such as imaging system optimization and image denoising. Reducing these uncertainties will be one of the directions for improving TSOM in the future.