1. Introduction
Enhanced generation of hot-electrons is of particular interest as these electrons can be harnessed for various applications including photodetection [1], photochemistry [2,3], solar cells [4], imaging [5], sensing [6,7], hydrogen generation [8,9], and CO₂ reduction [10]. Hot-electrons are electrons in a non-equilibrium state, with energies larger than those at thermal equilibrium. While hot-electrons can be generated using different techniques such as optical pumping or heating to high temperatures, one technique that has recently attracted attention is excitation through surface plasmons (SPs) [1,11]. SPs are collective oscillations of electrons that can couple to incident electromagnetic fields. They exhibit both particle and wave-like behaviors, subwavelength mode volumes, and high field concentrations. After excitation, SPs can either decay radiatively or non-radiatively. Radiative decay leads to re-emission of photons whereas non-radiative decay can lead to near-field electromagnetic field enhancement [6,7], plasmonic heating effects [12,13], or generation of hot-electrons via interband or intraband transitions [1,11]. The non-radiative decay processes of SPs have been considered as a factor that limits the performance of plasmonic devices such as optical biosensors, nanoscale lasers, and plasmonic circuits. However, recent studies have demonstrated that such non-radiative processes can be utilized for many exciting applications [11].
In terms of photodetection, hot-electrons generated by SPs offer a unique opportunity as they enable the realization of photodetectors that can detect light at frequencies below the semiconductor bandgap at room temperature. In traditional photodetectors, the operational wavelength is determined by the bandgap of the semiconductor and the generation of photocurrent is due to the formation of electron-hole pairs, which carry the current under an electrical bias. Alternatively, hot-electron-based photodetectors can be realized by placing a metal surface in contact with a semiconductor forming a Schottky barrier. Hot-electrons, being energetic and not in thermal equilibrium, transport to the Schottky interface. Then, photocurrent is generated from the injection of the hot-electrons, with the required momentum distribution, into the conduction band of the semiconductor through an internal photoemission process. Because this operational principle is fundamentally different, the operational wavelength of hot-electron photodetectors is instead determined by the height of the Schottky barrier and not the intrinsic bandgap of the semiconductor.
Plasmonic hot-electron photodetectors using different types of plasmonic nanostructures [14,15,16,17,18] have been extensively studied and demonstrated experimentally. Most of these photodetectors suffer from a common drawback: low efficiency. The first step in hot-electron photodetection, and in general for all hot-electron generation applications, involves optical absorption in the metal nanostructure. Optimization of optical absorption is therefore a vital step in increasing the efficiencies of such devices. Recently [18,19], large improvements in the efficiency of photodetection have been reported by designing hot-electron based photodetectors with optimized absorption. The enhancement of the absorption was realized either by designing metal gratings with deep trench cavities [18] or by designing ultrathin metasurfaces with a Fabry-Perot resonance overlapped with a plasmon resonance [19]. The latter design increased optical absorption to ~90%, which, in conjunction with the subwavelength thickness of the metasurface (comparable to the electron diffusion length) and the flexibility to tailor the resonances by modifying the geometry of the meta-atoms, resulted in record-high photosensitivities [19]. In this article, we concentrate on the vital step of optimizing absorption using plasmonic metasurfaces. Studies of photodetection with the generated hot-electrons using the optimized structures will be presented in a future study.
Traditional approaches for optimizing absorption start with an array of intuitive shapes such as squares, stripes, etc. to design a metasurface. In this article, we used an artificial intelligence-based approach to optimize and predict the absorption spectra of plasmonic metasurfaces fabricated on top of a semiconductor. In the first part, using a genetic algorithm, we designed a plasmonic metasurface with ~90% absorption at 1550 nm. Our design does not require an optically-thick ground plane, and therefore, does not rely on multiple reflections and interference effects. The absence of an optical ground plane simplifies fabrication and ensures that all absorption occurs in the metasurface and can therefore contribute to hot-electron generation. Our metasurface design does not require overlapping of two resonances and yet it achieved an absorption similar to that of the design in [19], which yielded hot-electron photodetectors with record-high photosensitivities. In contrast to [19], our absorption enhancements are narrowband, which, when integrated with contacts for photodetection, will allow for wavelength-specific detection without the need for bulky external optics and filters. Finally, our structure is also designed with symmetry, which supports polarization insensitivity.
While genetic algorithms provide an optimized result for a specific problem, the process is both time-consuming and computationally expensive. This, in general, is not desirable for the repeated optimization of absorption of metasurfaces at multiple wavelengths or for the creation of multiple desired absorption spectra. To circumvent this problem, in the second part, we used deep learning and developed a convolutional neural network that allows us to predict absorption spectra of plasmonic metasurfaces. Although just a proof-of-concept, the absorption spectra predicted by the neural network were well-matched with the electromagnetic simulations, thereby suggesting a new direction for artificial intelligence-assisted optimization of optical absorption in plasmonic metasurfaces.
2. Optimization of Optical Absorption of a Metasurface Using a Genetic Algorithm
Recent applications of artificial intelligence have revolutionized many realms of science and engineering. Historically, artificial intelligence was applied to fields such as computer vision, speech recognition, finance, etc. Recently, there has been a surge in the application of artificial intelligence in fields of fundamental science such as material science and nanophotonics [20,21]. In the domain of nanophotonics, artificial intelligence has been applied to the design of nanophotonic structures either using optimization algorithms or, more recently, using machine learning techniques that employ artificial neural networks [22]. In this article, we explore both these options for designing a metasurface optimized for enhanced hot-electron generation. In the first part of this paper, we concentrate on optimization techniques for enhancing the absorption at a single wavelength. In the second part, we use a deep learning technique to predict absorption spectra of arbitrary metasurface designs.
The main advantage of optimization algorithms over traditional physics-inspired and intuition-based methods is that they allow the optimization of many parameters and can yield numerous non-intuitive designs that achieve the desired optimal performance. While there are many optimization algorithms, we used the genetic algorithm, which is an evolutionary algorithm based on a natural selection process that mimics biological evolution [23]. We chose the genetic algorithm because it has proven to be an effective technique to find minima for highly nonlinear optimization problems, which can be both constrained and unconstrained. Furthermore, unlike many other standard optimization techniques, the genetic algorithm is flexible as it can be used to minimize objective functions that can be discontinuous and nondifferentiable, as in our case.
Figure 1a shows a schematic of the optimized structure. The metasurface is comprised of silver nanostructures fabricated on top of a GaAs substrate, which has a band gap at ~870 nm. Silver was selected due to its inherent low plasmon damping constant as compared to gold [24]. The lower damping constant results in the generation of more intense electromagnetic fields at the interface, which leads to larger optical absorption, and therefore, more efficient hot-electron generation [24].
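To make the sub-bandgap argument concrete, the following minimal sketch converts between wavelength and photon energy using E = hc/λ. The function names are ours, and the 1.424 eV room-temperature band gap of GaAs is a commonly quoted textbook value rather than a number taken from this article. With these assumptions, the sketch reproduces a band edge close to the ~870 nm quoted above and shows that a 1550 nm photon carries less energy than the GaAs band gap.

```python
# Conversion between vacuum wavelength and photon energy, E = h*c / lambda.
HC_EV_NM = 1239.841984  # h*c expressed in eV*nm

def photon_energy_ev(wavelength_nm):
    """Photon energy in eV for a vacuum wavelength given in nm."""
    return HC_EV_NM / wavelength_nm

def cutoff_wavelength_nm(energy_ev):
    """Longest wavelength (nm) whose photons still carry at least energy_ev."""
    return HC_EV_NM / energy_ev

GAAS_GAP_EV = 1.424  # assumed textbook value for GaAs at room temperature

band_edge_nm = cutoff_wavelength_nm(GAAS_GAP_EV)
e_1550_ev = photon_energy_ev(1550.0)

print(f"GaAs band edge: {band_edge_nm:.0f} nm")
print(f"Photon energy at 1550 nm: {e_1550_ev:.2f} eV")
print("Sub-bandgap photon:", e_1550_ev < GAAS_GAP_EV)
```

In this picture, a 1550 nm photon cannot create electron-hole pairs in GaAs directly, so any photocurrent at that wavelength would have to come from hot-electrons crossing a Schottky barrier that is lower than the photon energy.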
We implemented our optimization routine employing MATLAB's built-in genetic algorithm to minimize the value of the objective function, f = 1 − Absorption, where the absorption was calculated from COMSOL Multiphysics simulations via LiveLink for MATLAB. To obtain polarization insensitivity, we constrained the metasurface design to have mirror symmetry. The metasurface was divided into an array of 14 × 14 pixels (each 40 nm in size) and could be entirely defined in the genetic algorithm by only 28 parameters, which essentially control whether each pixel is silver or GaAs. The optimized metasurface design is shown in Figure 1b and the calculated absorption, reflection, and transmission spectra at normal incidence are shown in Figure 1c. The corresponding electric field distribution at the interface is shown in Figure 1d. The results shown in Figure 1 are the product of ~1000 model runs with each run requiring ~60 s. With just ~1000 runs, we were able to obtain a structure with ~90% absorption at 1550 nm without requiring any optical ground plane. We note here that the absorption of our metasurface is comparable to the absorption of highly optimized metasurfaces such as the ones shown in [19]. Theoretically, a more optimal structure could be obtained with additional time; however, for this work, we restricted the genetic algorithm runtime. Our optimized metasurface is intentionally designed to be narrowband as our eventual goal is to integrate the metasurface with contacts for wavelength-specific photodetection at near-infrared wavelengths. Narrowband enhancement of absorption allows us to operate without the need for external optics and filters.
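The optimization loop described above can be illustrated with a small, self-contained Python sketch. It is not the MATLAB and COMSOL workflow used in this work: the objective is a toy stand-in for f = 1 − Absorption, and all function names and genetic algorithm settings are hypothetical. The encoding is also an assumption on our part: one way in which 28 binary parameters can define a 14 × 14 grid is to specify one octant of the pattern (7 × 8 / 2 = 28 pixels) and unfold it by mirror symmetry about the horizontal, vertical, and diagonal axes.

```python
import random

HALF = 7                          # quadrant side; the full grid is 14 x 14
N_GENES = HALF * (HALF + 1) // 2  # 28 independent pixels in one octant

def expand(genes):
    """Unfold 28 bits (1 = silver, 0 = GaAs) into a symmetric 14 x 14 grid."""
    quad = [[0] * HALF for _ in range(HALF)]
    k = 0
    for i in range(HALF):
        for j in range(i, HALF):
            quad[i][j] = quad[j][i] = genes[k]  # mirror about the diagonal
            k += 1
    lower = [row[::-1] + row for row in quad]   # mirror left-right
    return lower[::-1] + lower                  # mirror up-down

def toy_objective(genes, target):
    """Stand-in for f = 1 - Absorption; NOT an electromagnetic simulation."""
    grid, ref = expand(genes), expand(target)
    match = sum(a == b for r, s in zip(grid, ref) for a, b in zip(r, s))
    return 1.0 - match / (4 * HALF * HALF)

def run_ga(objective, pop_size=20, generations=50, p_mut=0.05, seed=0):
    """Minimal genetic algorithm: elitism, tournament selection, uniform crossover."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(N_GENES)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=objective)
        history.append(objective(pop[0]))
        nxt = [pop[0][:], pop[1][:]]  # keep the two best designs unchanged
        while len(nxt) < pop_size:
            p1 = min(rng.sample(pop, 3), key=objective)  # tournament of three
            p2 = min(rng.sample(pop, 3), key=objective)
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
    best = min(pop, key=objective)
    history.append(objective(best))
    return best, history

target = [i % 2 for i in range(N_GENES)]  # arbitrary pattern the toy objective rewards
best, history = run_ga(lambda g: toy_objective(g, target))
print("objective: first generation", history[0], "-> final", history[-1])
```

In the actual workflow, the call to `toy_objective` would be replaced by a full-wave simulation of the expanded pattern, which is where essentially all of the computation time is spent.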
A sample was fabricated and experimentally characterized to confirm the predictions of our optimization algorithm. The metasurface was fabricated via electron beam lithography using a negative tone hydrogen silsesquioxane resist, which was exposed with a base dose of 1700 µC/cm². This was followed by dry etching, metal deposition, and then metal liftoff in N-methyl-2-pyrrolidone. The metals deposited were titanium (3 nm) and silver (100 nm), where titanium acts as an adhesion layer. Figure 2a,b shows scanning electron micrographs of the fabricated metasurface. To optically characterize the sample, we measured the reflectance from the metasurface using a Fourier-transform infrared spectroscopy (FTIR) microscope. Figure 2c shows the experimental results, and we clearly see a dip in reflection at around 1500 nm. We note here that the dip in reflection is less pronounced as compared to the results shown in Figure 1c. This is primarily because we used unpolarized light for the measurement, which was focused on the sample using a microscope objective lens of NA = 0.58. The large range of incident angles, because of the high NA of the objective lens, leads to a broadening of the dip in the reflection spectrum. We recalculated the reflection spectrum to take some of these effects into account, as shown in Figure 2d, and observed qualitative agreement with the experimental data. This confirms the validity of our simulated response.
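The effect of the objective lens can be illustrated with a rough sketch. The code below computes the largest incidence angle admitted by an NA = 0.58 objective (assumed to operate in air) and averages a purely hypothetical reflectance model over the illumination cone. The Lorentzian dip, its angular shift, and the solid-angle weighting are illustrative assumptions and not the procedure used to produce Figure 2d; polarization is not modeled.

```python
import math

NA = 0.58  # numerical aperture quoted in the text (objective assumed to be in air)

def half_angle_deg(na, n_medium=1.0):
    """Largest incidence angle collected by an objective of numerical aperture na."""
    return math.degrees(math.asin(na / n_medium))

def toy_reflectance(wl_nm, theta_rad):
    """Hypothetical Lorentzian dip whose center shifts with angle (not simulated data)."""
    center = 1550.0 - 150.0 * math.sin(theta_rad) ** 2
    return 1.0 - 0.9 / (1.0 + ((wl_nm - center) / 40.0) ** 2)

def cone_average(reflectance, wl_nm, na, steps=200):
    """Average reflectance over the illumination cone, weighting by solid angle."""
    theta_max = math.asin(na)
    num = den = 0.0
    for i in range(steps):
        theta = (i + 0.5) * theta_max / steps  # midpoint rule
        weight = math.sin(theta)               # solid-angle element ~ sin(theta) d(theta)
        num += weight * reflectance(wl_nm, theta)
        den += weight
    return num / den

wavelengths = range(1300, 1701, 10)
normal_min = min(toy_reflectance(wl, 0.0) for wl in wavelengths)
averaged_min = min(cone_average(toy_reflectance, wl, NA) for wl in wavelengths)

print(f"half-angle: {half_angle_deg(NA):.1f} deg")
print(f"dip minimum, normal incidence: {normal_min:.2f}")
print(f"dip minimum, cone average:     {averaged_min:.2f}")
```

Because each angle in the cone places the dip at a slightly different wavelength in this toy model, the averaged dip is shallower and broader than the normal-incidence one, which mirrors the qualitative trend described above.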
3. Prediction of Absorption Spectrum of Metasurfaces Using Convolutional Neural Networks
A genetic algorithm is a powerful technique for optimizing a single quantity, for example, the absorption of a plasmonic metasurface at a single wavelength. However, it is not the most efficient or practical method for optimizing multiple quantities, such as different metasurface designs with peak absorption at different wavelengths. Separate computations must be performed for each quantity we want to optimize. This makes the simulations iterative and recursive efforts, both of which are time-consuming and computationally expensive. For example, the results we presented above took a total of 16 h of computation time. Furthermore, with an increase in the number of parameters, the computation time increases exponentially. Other efficient optimization methods such as the adjoint method exist, but setting up the adjoint method for complex photonic systems can be nontrivial [25].
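The scale of the problem can be checked with a few lines of arithmetic. The sketch below uses the approximate run count and run time quoted earlier, and assumes that each of the 28 design parameters is binary, so that the number of distinct designs is 2^28; the variable names are ours.

```python
RUNS = 1000            # approximate number of model runs quoted in the text
SECONDS_PER_RUN = 60   # approximate time per run quoted in the text
N_PARAMETERS = 28      # design parameters, assumed here to be binary

ga_hours = RUNS * SECONDS_PER_RUN / 3600
n_designs = 2 ** N_PARAMETERS
exhaustive_years = n_designs * SECONDS_PER_RUN / (3600 * 24 * 365)

print(f"genetic algorithm: about {ga_hours:.1f} h")
print(f"distinct designs: {n_designs:,}")
print(f"exhaustive search: about {exhaustive_years:.0f} years")
```

The first figure is consistent with the roughly 16 h quoted above. Under the binary assumption, every additional parameter doubles the number of candidate designs, which is the sense in which the cost grows exponentially.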
Recently, deep learning algorithms that use biologically inspired artificial neural networks have attracted a lot of attention in nanophotonics [20]. Employing these methods allows one to approach optimization problems more efficiently. Deep learning algorithms learn from a relatively large dataset such that a solution can be found almost instantaneously. These algorithms reduce the overall computation time when a common dataset is available for the group of quantities that require optimization. Additionally, these algorithms can be used simultaneously for both approximation (i.e., to predict the results of electromagnetic simulations) and for inverse design.
In this article, as a first step, we only addressed the problem of approximation and developed a convolutional neural network (CNN) to learn the relationship between a 2D metasurface design and its absorption spectrum. Our ability to do so eliminates the need for computationally expensive simulations to predict the absorption spectra of different metasurface designs. Furthermore, the same dataset generated to train these neural networks can be used to train a different set of neural networks with different optimization goals, for example, optimizing the absorption spectrum for an application requiring broadband absorption or resonant peaks at different wavelengths.
The CNN model we developed takes a 14 × 14 pixel black-and-white metasurface design as an input and returns 11 values that represent the shape of the absorption spectrum as the output. The 11 values correspond to absorption values at different wavelengths. As shown in Figure 3a, the CNN is comprised of a series of four convolutional layers followed by average pooling layers. Each convolutional layer has a number of filters (n: 32, 64, 64, and 128), which are 3 × 3 × n kernel matrices of weights that are used to process the input to the layer and produce feature maps that are passed to the subsequent layer in the CNN. The dot product is calculated between each kernel and the pixel values of the input using a sliding window to generate a feature map over the entire input. This result is passed through the rectified linear unit (ReLU) activation function [26] to determine the values that are passed on to the next layer. Each kernel produces a unique feature map and the number of filters in each convolutional layer is set as a hyperparameter related to the complexity of the data. The average pooling layers downscale the input by calculating the average value of each region of the input. As shown in Figure 3a, the input downscales to about 1/2 the size as it gets passed on to the subsequent layer. Finally, a fully-connected layer maps the resultant vector to 11 values that represent the absorption spectrum. As the CNN learns, its weights are updated to identify features that have the highest relevance to the output absorption spectrum values. The mean squared error between the shape of the generated absorption spectrum and the shape of the true absorption spectrum is used as the loss function. This error is backpropagated through the network and guides the training.
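The building blocks described above can be sketched in plain Python for a single channel. This is an illustration of the operations only, not the Keras model used in this work: the kernel is a fixed placeholder rather than a learned weight matrix, and the zero padding, the 2 × 2 pooling window, and the pairing of one pooling layer with each convolutional layer are assumptions on our part.

```python
def conv2d_same(image, kernel):
    """Slide a 3x3 kernel over the image (zero padding) and take dot products."""
    n, m, k = len(image), len(image[0]), len(kernel) // 2
    out = [[0.0] * m for _ in range(n)]
    for r in range(n):
        for c in range(m):
            acc = 0.0
            for dr in range(-k, k + 1):
                for dc in range(-k, k + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n and 0 <= cc < m:  # pixels outside count as zero
                        acc += kernel[dr + k][dc + k] * image[rr][cc]
            out[r][c] = acc
    return out

def relu(fmap):
    """Rectified linear activation applied element-wise."""
    return [[max(0.0, v) for v in row] for row in fmap]

def avg_pool2(fmap):
    """2x2 average pooling; edge windows average only the pixels that exist."""
    n, m = len(fmap), len(fmap[0])
    out = []
    for r in range(0, n, 2):
        row = []
        for c in range(0, m, 2):
            window = [fmap[rr][cc]
                      for rr in range(r, min(r + 2, n))
                      for cc in range(c, min(c + 2, m))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out

def mse(predicted, target):
    """Mean squared error between two equally long spectra."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

design = [[1.0] * 14 for _ in range(14)]      # placeholder all-metal design
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]  # placeholder averaging kernel

first_map = relu(conv2d_same(design, kernel))
sizes = [len(design)]
fmap = design
for _ in range(4):  # four convolution + pooling stages, one channel only
    fmap = avg_pool2(relu(conv2d_same(fmap, kernel)))
    sizes.append(len(fmap))
print("spatial size per stage:", sizes)
```

Under these assumptions, the spatial size is roughly halved at each stage until a single value per filter remains, at which point a fully-connected layer can map the resulting vector to the 11 output values.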
We implemented the CNN model using the Keras deep learning framework with TensorFlow [27] and the weights of the CNN were optimized using Adadelta [28]. We used 70,000 training examples, each consisting of a 2D metasurface design and its corresponding absorption spectrum, and held out 30,000 examples for validation. The data were generated via rigorous coupled-wave analysis (RCWA). To achieve a balance between the accuracy of the electromagnetic simulations and computation time, we used only a few diffraction orders in the RCWA calculations. We trained the CNN for 200 epochs, which takes approximately 20 min on a 32 GB GPU. The model converged in far fewer epochs, as can be seen from the training and validation losses plotted in Figure 3b,c. Finally, once the CNN was trained, we used the model to predict an absorption spectrum given a new 2D metasurface design that the CNN had never seen before. The model predicted an absorption spectrum within seconds. Figure 4 shows two examples that compare a predicted and simulated spectrum. We observed very good agreement, thus confirming that our CNN model is successful in approximating and predicting the absorption spectrum of arbitrary metasurface designs.
4. Conclusions
In summary, we demonstrated the use of an artificial intelligence-based approach to optimize absorption spectra of plasmonic metasurfaces. Our metasurface is comprised of silver nanostructures on a GaAs substrate. In comparison to previous physics and intuition-based approaches, which rely on intuitive shapes such as cubes and stripes to optimize absorption for enhancing hot-electron generation, the artificial intelligence-based approach predicts non-intuitive designs. Our optimized metasurface design has an absorption of ~90% at 1550 nm and does not require a ground plane to achieve such high absorption. This simplifies fabrication and ensures more efficient generation of hot-electrons. The optimized metasurface geometry has been constrained to have mirror symmetry for polarization insensitivity and is also designed to be narrowband for wavelength selectivity without the need for additional optics or filters.
In addition to optimization, we also demonstrated the use of a deep learning approach to predict and approximate absorption spectra of plasmonic metasurfaces. The absorption spectra predicted by the neural network were in good agreement with electromagnetic simulations. Although we only demonstrated prediction using our machine learning model, in principle, the same dataset generated to train these neural networks can be used to train an inverse neural network model, which will allow us to inverse-design metasurfaces with the different absorption spectra required for different applications. While this article only takes the metasurface component into consideration, the deep learning model can be expanded to include additional parameters such as the electronic properties of the materials and hot-electron diffusion lengths. This will allow for a more comprehensive optimization of the hot-electron generation process and suggests a new direction for enhancing hot-electron generation for various applications ranging from sensing and photodetection to photochemistry. Finally, although our approach was demonstrated for plasmonic metasurfaces, it can also be extended to dielectric metasurfaces.