1. Introduction
Retinal scan (“RS”) glasses are near-eye displays that use laser beam scanning for retinal image projection. A collimated light beam is projected via an optical combiner and through the human pupil onto the retina [1]. Recent advances in holographic manufacturing [2] have enabled the use of holographic optical elements (“HOEs”) as transparent, diffractive optical combiners [3]. This allows lightweight augmented reality displays to be integrated into all-day wearable smart glasses with a small form factor.
Research and development in the field is progressing quickly [4]. First efforts have been made to define new parameters like the ambient contrast ratio (“ACR”) [5] that are necessary to characterize the performance of holographic augmented reality displays [6] in real visual environments. We elaborate on these efforts and show that for the special case of RS displays, interactions with the human eye must be considered.
Most conventional display technologies such as LCD, LED, or waveguides emit light from each pixel into a large solid angle. Their emission profiles roughly obey Lambert’s cosine law to keep luminance constant across different viewing angles. We use the term “Lambertian” to describe such displays. From Lambertian displays, only a small portion of the emitted light enters the human eye at any given time. In contrast, all light emitted by RS displays is confined to a narrow scanning beam which passes through the pupil. This not only makes RS displays energy efficient but also has important consequences for fundamental display parameters like brightness and resolution.
One challenge is that some conventional measurement devices such as luminance cameras cannot be used directly with RS glasses due to the interaction of the projection system with the human pupil. Instead, an equivalent luminance of the RS display must be calculated from quantities that can be measured reliably. We introduce a theoretically derived formula to calculate the retinal scan luminance and ambient contrast ratio. This formula incorporates various parameters including optical powers, wavelengths, field of view, and human pupil diameter. Building on this formula, we designed and conducted a psychophysical study to measure the perceived resolution of RS glasses in augmented reality settings.
While contrast and resolution perception have been investigated in many previous studies as fundamental parameters of human visual perception [7], augmented reality glasses provide new opportunities to take the measurements of these parameters outside of controlled laboratory environments. This work presents novel theory and methods for such out-of-laboratory measurements.
2. Materials and Methods
2.1. Retinal Scan Glasses
The BML500P RS glasses prototype available for this study (Figure 1) was developed by Bosch Sensortec and presented at the Consumer Electronics Show 2020 [8].
A light engine in the right temple of the glasses projects images via a transparent holographic optical element through the pupil onto the retina. Two oscillating micromirrors steer light from three low-power RGB laser diodes in a raster scanning pattern to form an image at a refresh rate of 60 Hz.
2.1.1. Brightness in Retinal Raster Scanning
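The human eye detects retinal illuminance as brightness. Retinal illuminance $E_{\mathrm{ret}}$ is defined as luminous flux $\Phi$ per retinal area $A_{\mathrm{ret}}$:
$$E_{\mathrm{ret}} = \frac{\Phi}{A_{\mathrm{ret}}}.$$
Display brightness is usually denoted in terms of luminance $L$, which is defined as the luminous flux $\Phi$ emitted into a solid angle $\Omega$ by a light source with area $A$:
$$L = \frac{\Phi}{\Omega \, A}.$$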
Luminance is a measure for the light source, while the perceived brightness depends on the illuminance on the retina. When comparing Lambertian displays with each other, luminance can be used as a proxy for perceived brightness because luminance changes lead to proportional retinal illuminance changes.
This is, however, not valid for a comparison between Lambertian and RS displays. The size of the human pupil influences the luminous flux $\Phi$ entering the eye from a Lambertian display but not from an RS display, as shown in Figure 2.
For this reason, measuring the luminance of an RS image with a luminance camera leads to inconsistent results: the larger the entrance pupil of the luminance camera (to which it is calibrated using a Lambertian light source), the smaller the measured RS luminance.
In the first part of the Results Section, we show that it is nevertheless possible to calculate an equivalent luminance for RS displays. This luminance depends on the RS luminous flux, RS field of view, and human pupil size and is directly comparable with the luminance of Lambertian displays.
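As a rough numerical illustration of the dependencies named above, the following Python sketch assumes one plausible form of the equivalent luminance: the luminous flux divided by the product of the field-of-view solid angle and the pupil area. This form, the function names, and the example values are assumptions made here for illustration only; the derivation in the Results Section remains the authoritative one.

```python
import math


def fov_solid_angle_sr(width_deg, height_deg):
    """Solid angle of a rectangular field of view (right rectangular pyramid)."""
    half_w = math.radians(width_deg) / 2.0
    half_h = math.radians(height_deg) / 2.0
    return 4.0 * math.asin(math.sin(half_w) * math.sin(half_h))


def rs_equivalent_luminance(luminous_flux_lm, width_deg, height_deg, pupil_diameter_m):
    """Assumed form: L = flux / (FOV solid angle * pupil area), in cd/m^2."""
    pupil_area_m2 = math.pi * (pupil_diameter_m / 2.0) ** 2
    return luminous_flux_lm / (fov_solid_angle_sr(width_deg, height_deg) * pupil_area_m2)


# Placeholder values (not taken from the study): same flux and field of view,
# two different pupil diameters.
small_pupil = rs_equivalent_luminance(1e-5, 10.0, 15.0, 3e-3)
large_pupil = rs_equivalent_luminance(1e-5, 10.0, 15.0, 6e-3)
print(small_pupil / large_pupil)  # doubling the pupil diameter quarters the luminance
```

Under this assumed form, the equivalent luminance falls with the square of the pupil diameter, which matches the qualitative behavior described for a luminance camera with a larger entrance pupil.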
2.1.2. Focus and Resolution in Retinal Raster Scanning
Figure 3 shows the impact that ocular imperfections have on Lambertian and RS images. The circle of confusion for individual pixels is smaller for RS images due to the strong collimation and long depth of field of the scanning light beam.
While this effect can be illustrated clearly, it is difficult to assess the differential impact on display performance with methods from conventional display metrology since the human eye needs to be brought “into the loop”.
2.2. Perceived Resolution Study Design
To measure the perceived resolution of the BML500P RS glasses prototype, we conducted a psychophysical discrimination test with 20 participants (17 males, 3 females; aged 18–60 years, mean ± sd: 36.0 ± 7.7 years). We implemented a prescreening routine for visual acuity and contrast sensitivity with Landolt rings and a test for color vision with color plates to ensure a minimum standard level of vision ability among all participants. Participants who could not pass these screening tests were not included in the study. Inclusion criteria were right-eye dominance, a right monocular visual acuity of at least 0.8 decimal (screening results: logMAR mean ± sd: −0.073 ± 0.060), a right monocular contrast vision threshold below 15% Weber contrast (mean ± sd: 4.2 ± 0.8%), no color vision deficiency, and no known eye diseases. The experiment was approved by the ethics committee of the Medical Faculty of the University of Tübingen (Code 410/2022B02) and conducted according to the provisions of the Declaration of Helsinki. Informed consent was obtained from all participants.
The study setup is shown in Figure 4. Participants were equipped with the RS glasses on a chinrest 2.3 m in front of a 60-inch LED monitor (PN-R603; Sharp Corporation; Sakai, Osaka, Japan). The light-isolated examination room was illuminated at an average of 2500 lux, comparable to outdoor illumination on a cloudy day.
Participants were shown a right-eye monocular stimulus in the glasses with a size of 11.15 ± 0.22 deg × 14.99 ± 0.17 deg [mean ± sd from 10 measurements]. The transparent RS glasses superimposed this virtual image onto the real image from the LED monitor. PsychoPy [9] was used to control the monitor and RS glasses stimuli simultaneously. To minimize distractions, participants wore a black eye patch over their left eye.
The glasses stimulus was picked randomly from a selection of monochromatic line patterns with different spatial frequencies from 1.8 cyc/deg to 16.0 cyc/deg and a full monochromatic rectangle. The line pattern consisted of lines with equal digital pixel width for both projection and non-illuminated dark background lines. The spatial frequency range of the lines was determined by the 480 × 320 pixels that could be set within the software configuration of the prototype. The pattern with the highest spatial frequency had a width of one digital pixel.
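The relation between digital line width and spatial frequency can be sketched in a few lines of Python. The sketch assumes that the 480 pixels span the 14.99 deg dimension of the stimulus and the 320 pixels span the 11.15 deg dimension; this assignment is inferred from the reported maximum of 16.0 cyc/deg and is not stated explicitly in the text. The function name is hypothetical.

```python
def line_pattern_cyc_per_deg(n_pixels, fov_deg, line_width_px):
    """Spatial frequency of alternating bright/dark lines of equal pixel width."""
    pixels_per_degree = n_pixels / fov_deg
    pixels_per_cycle = 2 * line_width_px  # one bright line plus one dark line
    return pixels_per_degree / pixels_per_cycle


# One-pixel lines along the assumed 480-pixel axis: approximately 16.0 cyc/deg.
print(line_pattern_cyc_per_deg(480, 14.99, 1))

# One-pixel lines along the assumed 320-pixel axis: approximately 14.3 cyc/deg.
print(line_pattern_cyc_per_deg(320, 11.15, 1))
```

Because the line width can only change in whole pixels, the available spatial frequencies are the maximum divided by an integer, which is why the staircase levels are spaced unevenly.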
For high digital spatial frequencies, the resulting optical lines started to blur into a full rectangle. The task of the participants was to report whether they saw lines or a rectangle using a wireless keyboard. The spatial frequency of the lines was adjusted based on the participants’ responses using a 3-down 1-up staircase algorithm to determine the highest resolvable spatial frequency. The algorithm terminated after three reversals. There was no time limit for responses. To ensure that a staircase level could not be passed by only identifying rectangles, a line pattern was always presented after two correct responses. A 0.5 s pause between stimuli was added to avoid afterimages and to reduce cues from stimulus transitions.
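The staircase logic described above can be sketched as follows. This is a simplified illustration, not the study code: the level list, the deterministic observer, and all names are hypothetical, and the rule that forces a line pattern after two correct responses is not modeled. Three consecutive correct responses move to the next higher spatial frequency, a single incorrect response moves one level back, and the run stops after three reversals.

```python
def run_staircase(levels, respond, start_index=0, n_down=3, max_reversals=3, max_trials=500):
    """Return the levels at which the staircase direction reversed."""
    index = start_index
    streak = 0            # consecutive correct responses at the current level
    last_direction = 0    # +1 = harder (higher frequency), -1 = easier, 0 = none yet
    reversals = []
    for _ in range(max_trials):
        if respond(levels[index]):
            streak += 1
            if streak < n_down:
                continue      # stay on this level until n_down correct in a row
            direction = +1
        else:
            direction = -1
        streak = 0
        if last_direction and direction != last_direction:
            reversals.append(levels[index])
            if len(reversals) >= max_reversals:
                break
        last_direction = direction
        # Clamp to the available levels.
        index = min(max(index + direction, 0), len(levels) - 1)
    return reversals


# Hypothetical levels in cyc/deg and an idealized observer who resolves
# everything up to 7.2 cyc/deg.
levels = [1.8, 2.0, 2.3, 2.7, 3.2, 4.0, 5.3, 8.0, 16.0]
print(run_staircase(levels, lambda freq: freq <= 7.2))  # [8.0, 5.3, 8.0]
```

With an idealized observer the reversals simply bracket the threshold; a real observer responds probabilistically near threshold, so the reversal levels scatter from run to run.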
We performed 15 trials each for horizontal and vertical lines for different stimulus colors (laser wavelengths: red [640 nm], green [521 nm], blue [452 nm]), stimulus brightness (maximum radiant flux: red [272 nW], green [736 nW], blue [1263 nW]; equiluminant radiant flux: green [66 nW], blue [1145 nW]), and LED monitor backgrounds (black, red, green, blue, orange) to measure the influence of different stimulus and background conditions on the perceived resolution.
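The equiluminant radiant flux values can be checked for plausibility with the standard photometric conversion, luminous flux = 683 lm/W × V(λ) × radiant flux. The sketch below assumes that the green and blue equiluminant settings were matched to the red maximum of 272 nW, which the listing suggests but does not state. The V(λ) values are approximate interpolations of the CIE 1924 photopic table entered by hand and should be treated as assumptions.

```python
K_M = 683.0  # maximum luminous efficacy in lm/W

# Approximate photopic luminous efficiency at the laser wavelengths (assumed values).
V_LAMBDA = {640: 0.175, 521: 0.727, 452: 0.042}


def luminous_flux_lm(radiant_flux_w, wavelength_nm):
    """Convert monochromatic radiant flux in W to luminous flux in lm."""
    return K_M * V_LAMBDA[wavelength_nm] * radiant_flux_w


# Radiant flux in nW: red maximum plus the equiluminant green and blue settings.
equiluminant_nw = {640: 272, 521: 66, 452: 1145}
fluxes = {wl: luminous_flux_lm(p * 1e-9, wl) for wl, p in equiluminant_nw.items()}
for wavelength, flux in fluxes.items():
    print(wavelength, flux)
```

With these approximate V(λ) values the three luminous fluxes agree to within a few percent, which is consistent with the three settings being equiluminant.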
The LED monitor was set to a luminance of 27 cd/m² for each color and 1 cd/m² for the black background. The right spectacle lens with the holographic optical combiner had an optical transmittance of $T = 0.90$.
The optical power of the retinal scan glasses was measured by setting an optical power meter (PM160 power meter; Thorlabs, Inc.; Newton, NJ, USA) in the beam path such that all light contributing to the image was captured by the optical sensor. The wavelengths of the retinal scan glasses as well as the luminance of the LED monitor were measured with a spectroradiometer (JETI spectraval 1511; JETI Technische Instrumente GmbH; Jena, Germany). The wavelength measurement only depends on the relative magnitudes of optical power. Therefore, measuring the wavelengths of retinal scan glasses with a spectroradiometer will provide accurate results, while measuring the luminance directly will not.
Generally, the perceived resolution threshold not only depends on the spatial frequency but also on the contrast [10]. If the contrast between stimuli and background is too small, the perceived resolution threshold decreases across spatial frequencies [11]. Previous research showed that this influence is persistent up to contrast levels of about 0.5 [12]. Since we wanted to avoid the influence of contrast on the perceived resolution in this study, we needed to ensure that the contrast between the retinal scan stimuli and the Lambertian background stayed above 0.5. As shown in Section 2.1.1, however, this contrast can only be calculated correctly if the brightness for both display types is denoted in the same physical unit. We therefore first derive a formula for this calculation in the first part of the Results Section before presenting the results of the study in the second part of the section.
4. Discussion
RS glasses use an optical architecture that is different from conventional display technologies. We explained how the human pupil influences the perceived brightness of an RS display and presented a physical derivation for the RS luminance $L_{\mathrm{RS}}$. Since the pupil dilates in dark surroundings and the RS luminance decreases with increasing pupil size, RS glasses appear darker in dark environments and brighter in bright environments. For an observer with standard pupil dynamics, the RS display luminance multiplies fivefold from mesopic to bright photopic environments.
For conventional displays, the troland is commonly used to describe retinal illuminance as a function of pupil size [18]. Although the troland accounts for the pupil area $A_{\mathrm{p}}$ in Equation (10), one still needs to make an assumption about the eye’s focal distance $f$ (i.e., use a value for a standard eye) to calculate an actual troland value [13]. In our derivation, however, $f$ cancels out in Equation (11) after setting the retinal illuminance of the RS display equal to that of a Lambertian display. This means that the RS luminance $L_{\mathrm{RS}}$ is independent of any assumptions about the eye apart from the pupil size.
We picked the model from Stanley and Davies [17] to express pupil size as a function of ambient luminance. Using another model with additional parameters such as age [16] could further increase the accuracy but would also increase the complexity. The fundamental relationship between pupil size and RS luminance holds independently of this model choice.
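To give a feeling for the size of the pupil effect, the sketch below uses the Stanley and Davies formula in the form in which it is commonly quoted in the pupil-modeling literature. The coefficients are reproduced from memory and should be verified against reference [17] before use; the adapting field area and the luminance values are arbitrary placeholders, and the inverse-square scaling of the RS luminance with pupil diameter is the assumption discussed above.

```python
def pupil_diameter_mm(luminance_cd_m2, field_area_deg2):
    """Pupil diameter in mm as a function of luminance and adapting field area.

    Commonly quoted form of the Stanley and Davies model (coefficients to be
    checked against the original reference).
    """
    x = (luminance_cd_m2 * field_area_deg2 / 846.0) ** 0.41
    return 7.75 - 5.75 * x / (x + 2.0)


def relative_rs_luminance(luminance_cd_m2, reference_cd_m2, field_area_deg2):
    """RS luminance relative to a reference environment, assuming L_RS ~ 1 / d^2."""
    d = pupil_diameter_mm(luminance_cd_m2, field_area_deg2)
    d_ref = pupil_diameter_mm(reference_cd_m2, field_area_deg2)
    return (d_ref / d) ** 2


# Placeholder environments: a dim and a bright adapting field of 100 deg^2.
print(pupil_diameter_mm(0.1, 100.0), pupil_diameter_mm(1000.0, 100.0))
print(relative_rs_luminance(1000.0, 0.1, 100.0))
```

In this form the model is bounded between 2 mm and 7.75 mm, so the pupil-driven change in RS luminance is bounded as well, whichever luminance range is chosen.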
Using the ambient luminance $L_{\mathrm{amb}}$ and combiner transmittance $T$, we defined the ambient contrast ratio for RS glasses as $\mathrm{ACR} = L_{\mathrm{RS}} / (T \, L_{\mathrm{amb}})$. In some previous research [4,5], the ambient contrast ratio has been defined as $\mathrm{ACR} = (L_{\mathrm{on}} + T \, L_{\mathrm{amb}}) / (L_{\mathrm{off}} + T \, L_{\mathrm{amb}})$, where $L_{\mathrm{on}}$ and $L_{\mathrm{off}}$ are the luminance of on and off pixels, $T$ is the see-through transmittance, and $L_{\mathrm{amb}}$ is the ambient luminance. RS displays do not project light into dark pixel areas. $L_{\mathrm{off}}$ is always zero, so the denominator of the two formulas does not differ. More thought must be put into the ambient light term $T \, L_{\mathrm{amb}}$ in the numerator. If the term is included, the ambient contrast ratio would equal 1 for an off display. This seems in conflict with the conventional notion of a contrast ratio. We therefore propose to make a distinction between a standard “ambient contrast ratio” that does not include the background term $T \, L_{\mathrm{amb}}$ in the numerator, and a second quantity, potentially called the “additive ambient contrast ratio”, that does include it. In this paper, we use the standard ambient contrast ratio.
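The difference between the two definitions can be made concrete with a short Python sketch. The function names are hypothetical, and the formulas are written as reconstructed from the description in this section: the standard ratio omits the transmitted ambient term in the numerator, while the additive ratio includes it.

```python
def acr_standard(l_on, l_amb, transmittance, l_off=0.0):
    """Ambient contrast ratio without the ambient term in the numerator."""
    return l_on / (l_off + transmittance * l_amb)


def acr_additive(l_on, l_amb, transmittance, l_off=0.0):
    """Ambient contrast ratio with the transmitted ambient light in the numerator."""
    background = transmittance * l_amb
    return (l_on + background) / (l_off + background)


# Display switched off in front of a 27 cd/m^2 background, transmittance 0.90.
print(acr_standard(0.0, 27.0, 0.90))  # 0.0
print(acr_additive(0.0, 27.0, 0.90))  # 1.0
```

For an RS display with no light in dark pixels, the two quantities differ by exactly 1, so results reported under one definition can be converted to the other as long as the definition used is stated.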
With the derivation for the brightness and contrast of visual stimuli, we were finally able to design and evaluate our psychophysical study for the perceived resolution. Overall, we found consistent results across 15 trials with different stimulus parameters. This study had a gamified design, but we provided no reward for correct answers. Instead, we instructed the participants to give accurate feedback according to their perception. Several participants reported after the study that they did not find any visual cues to manipulate the results even if they had wanted to, hinting toward a robust experimental design.
From the field-of-view size and the number of digital pixels, we can infer that the pixels were not perfectly square. This led to different spatial frequencies for horizontal and vertical staircase levels (Table 1). The fact that the correct response ratios in Figure 6 still follow the trend of a common overall sigmoidal function further speaks to the validity of our measurement approach.
The 0.8 decimal (0.097 logMAR) minimum visual acuity that participants needed to achieve to be included in the study corresponds to a visual resolution of 24.0 cyc/deg [19]. Since we found an average perceived resolution threshold of 7.2 cyc/deg across all trials, we conclude that the perceived resolution was limited by the optical resolution of the RS glasses and not the visual perception of any of the participants.
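The conversion behind these numbers is short enough to spell out. The sketch assumes the usual conventions that the minimum angle of resolution in arcminutes is the reciprocal of the decimal acuity and that one grating cycle spans two such angles; the function names are hypothetical.

```python
import math


def decimal_acuity_to_logmar(decimal_acuity):
    """logMAR is the base-10 logarithm of the minimum angle of resolution in arcmin."""
    return -math.log10(decimal_acuity)


def decimal_acuity_to_cyc_per_deg(decimal_acuity):
    """Grating resolution limit, with one cycle equal to two minimum angles of resolution."""
    mar_arcmin = 1.0 / decimal_acuity
    return 60.0 / (2.0 * mar_arcmin)


print(decimal_acuity_to_logmar(0.8))       # about 0.097
print(decimal_acuity_to_cyc_per_deg(0.8))  # about 24 cyc/deg
```

By the same conversion, the measured average threshold of 7.2 cyc/deg lies well below the acuity limit of every included participant.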
Our results indicate that the background conditions did not have a notable influence on the perceived resolution within the range of conditions we tested. Such an influence could, however, still occur in other background parameter ranges. As expected, the RS luminance and the ambient contrast ratio, which was above 0.5 in all trials, did not influence the perceived resolution thresholds, despite varying over several orders of magnitude. These findings agree with the previous literature in which it is argued that the neural response in the visual pathway saturates, and further contrast increases do not lead to improved perceived resolution thresholds [7]. We see our study as a first step toward more human-centered characterizations of augmented reality glasses and recognize the potential for follow-up studies in additional parameter ranges. In particular, the RS luminance and ambient contrast ratio could be decreased in a future study to values around and below 0.5 to also investigate the effect of contrast levels on the perceived resolution. Since calculating and setting the ambient contrast ratio of retinal scan glasses stimuli are only possible with the correct equation for the RS luminance, a follow-up experiment that varies stimulus contrast in addition to stimulus resolution would further illustrate the benefits of our theoretical derivation.
We neither measured nor controlled for the polarization of light from the retinal scan glasses. The influence of the linear polarization from the laser source on the visual perception of retinal scan displays could be an interesting aspect for further investigation, especially when compared with conventional display architectures.
We also measured the RS glasses’ optical display resolution according to the IEC 61947-2 standard [20] and found a maximum value of 3.4 cyc/deg for monochromatic green images. In our psychophysical study, we found a perceived resolution threshold of 7.4 cyc/deg for monochromatic green images.
The perceived resolution threshold adds interpretability to the IEC 61947-2 optical resolution threshold, which only uses a modulation depth of 30% to determine the spatial frequency cutoff, independent of visual perceptibility. Knowing the number of pixels per degree that humans can distinguish in the glasses helps in making better-informed design decisions for future prototypes, for example, when choosing the field of view and holographic combiner area needed to show image content with a certain desired size and resolution.
Furthermore, in accordance with the ideas presented in Section 2.1.2, we expect the ratio between the perceived and optical resolution to be higher for retinal scan displays than for conventional Lambertian displays. A promising next step to test this claim would be to conduct a second, comparative study with a Lambertian augmented reality display that only differs in its optical architecture and is otherwise as identical to our retinal scan glasses prototype as possible.
The BML500P prototype used in this study still had some known limitations. In Figure 4, one can see brightness inhomogeneities, particularly in the top left corner of the green and blue stimuli. Such inhomogeneities typically arise from local variations in the diffraction efficiency of the HOE. This is a manufacturing issue that will be resolved in the future.
The second limitation was the small exit pupil of the single eyebox prototype. The RS display is only visible if the narrow light beam fully passes through the pupil. If this alignment is lost, the image is no longer visible. Therefore, we added a 5 min fitting procedure for each participant at the beginning of our study. The nose bridge of the glasses could be quickly adjusted to fit the pupil position of each participant (see Figure 1, left). In the future, several approaches such as eyebox replication or eyebox steering can be implemented to remedy this issue.