
Wide field-of-view light-field head-mounted display for virtual reality applications

Open Access

Abstract

Light-field head-mounted displays (HMDs) can resolve vergence-accommodation conflicts but suffer from limited display pixels, causing a narrow field-of-view (FOV). This study proposes a wide-FOV light-field HMD with a 5.5-inch-diagonal 4 K display for virtual reality applications. By adjusting the pitch of elemental images to control the eye relief and creating a virtual intermediate image, horizontal and vertical FOVs of 68.8° and 43.1°, respectively, can be achieved using a monocular optical bench prototype.

© 2024 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Recently, virtual reality (VR) and augmented reality (AR) have become popular and widely used in various fields, such as education, medicine, and entertainment. Moreover, head-mounted displays (HMDs), such as VR and AR headsets, are rapidly becoming more compact and lightweight with higher resolutions and wider fields of view (FOVs) owing to the development of displays and optical systems [1–5]. Most HMDs on the market use a binocular system that shows parallax images on the left and right displays to create a three-dimensional (3D) view through binocular parallax. However, these binocular HMDs suffer from the vergence–accommodation conflict (VAC), which can cause discomfort and visual fatigue during prolonged use [6,7]. The VAC originates from the discrepancy between the vergence distance, which is determined by the parallax images, and the accommodation distance, which is fixed to the magnified image plane of the display. Various approaches have been proposed to address the VAC, including varifocal near-eye [8], multifocal-plane near-eye [9,10], holographic near-eye [11–14], and light-field near-eye [15–19] displays. Among these, light-field near-eye displays, which reproduce the light rays emanating from an object and reaching the eye, have been actively studied because of their ease of design and feasibility. Integral-imaging-based (InI-based) light-field near-eye displays use pinhole arrays [20–24] or lens arrays [25–29] as optical elements, and the display shows elemental images with slightly different views of a 3D image. The light-field HMD reproduces multiple rays that emanate from the 3D image at different angles and reach different locations in the pupil. This approach enables the viewer to obtain correct or nearly correct focus cues [30,31]. However, because a single point in a 3D image is reproduced by light rays from multiple pixels, numerous pixels are required to provide a wide FOV. Previous studies have proposed lightweight and compact light-field HMDs that use microdisplays to produce high-quality 3D images; however, their FOVs are limited to less than 30° horizontally and 18° vertically [25,28].

Commercial HMDs adopt displays of various sizes depending on the device requirements. AR microdisplays often measure approximately 0.2 inches diagonally because of the requirements for small size, light weight, and low power consumption. In contrast, VR microdisplays often measure 1.3–2.2 inches diagonally because they must be large and have high pixel counts to display images with wide FOVs [32]. Some commercial HMDs for VR use medium-sized displays with diagonals of up to 5.5 inches to display images that cover the entire human FOV. Implementing a medium-sized display in a light-field HMD enables the use of numerous rays that can display a 3D image over a wide area. Although several studies have been conducted on light-field HMDs with medium-sized displays [20,29], no studies have reported on the relationship between the arrangement of components and device performance, such as FOV, spatial frequency, and eye box, for a light-field HMD comprising a simple lens array and an eyepiece.

In this study, we propose an InI-based light-field HMD that uses a medium-sized display to present 3D images with a wide FOV and high spatial frequency in a practical form factor. Two methods are proposed for realizing such a light-field HMD. The first is an elemental-image arrangement that allows the eye relief to be controlled arbitrarily. This decouples the spatial frequency and FOV from the eye relief, thereby enabling the display of 3D images with a wide FOV and high spatial frequency. The second is the formation of the intermediate image as a virtual image, which reduces the depth of the optical system. The device performance of the light-field HMD, including the FOV, spatial frequency, and eye box, is evaluated using theory, optical simulations, and experiments with a monocular optical bench prototype.

2. Light-field HMD with medium-sized display for VR applications

Two methods for fabricating light-field HMDs in VR applications are described here. The first involves the arrangement of the elemental images to be shown on the display, and the second involves the formation of an intermediate image reconstructed by the elemental images and lens array.

2.1 Arrangement of elemental images

The elemental-image arrangement of the proposed method is explained in comparison with that of a previous study [28]. Figure 1 shows the arrangement of the elemental images and the paths of the chief rays emitted from the center of each elemental image and passing through the center of the corresponding microlens for an InI-based light-field HMD. Figure 1(a) shows a conventional light-field HMD with a display, lens array, and eyepiece. The relationship between pEI and pa, the pitches of the elemental images and the lens array, respectively, can be expressed as follows:

$${p_{\textrm{EI}}} = {p_\textrm{a}}.$$

Fig. 1. Schematic of the arrangement of the elemental images and the path of the chief rays emitted from the center of each elemental image. (a) Conventional arrangement of light-field HMD. (b) Proposed arrangement of light-field HMD.

Because of this equality, the chief rays travel as parallel rays regardless of their position on the display and are refracted by the eyepiece to reach the eye. This implies that each elemental image lies directly behind its microlens and that the chief rays always enter the microlenses along the on-axis direction. However, this configuration requires the eye relief, which is the distance between the eyepiece and the pupil plane, to be equal to the focal length of the eyepiece. That is, using an eyepiece with a long focal length results in greater eye relief. The focal length of the eyepiece is related to the magnification of the 3D image, which determines the spatial frequency and FOV. In the conventional method, adjusting the spatial frequency and FOV is difficult because the focal length of the eyepiece simultaneously affects the spatial frequency, FOV, and eye relief.

Figure 1(b) shows a schematic diagram of the proposed light-field HMD. pEI slightly exceeds pa and is given by the following equation:

$${p_{\textrm{EI}}} = {p_\textrm{a}}\left[ {1 + \frac{{{d_\textrm{a}}({{f_{\textrm{ep}}} - {d_{\textrm{er}}}} )}}{{{d_{\textrm{aep}}}{f_{\textrm{ep}}} - {d_{\textrm{er}}}({{d_{\textrm{aep}}} - {f_{\textrm{ep}}}} )}}} \right],$$
where da is the distance between the display and the lens array, daep is the distance between the lens array and the eyepiece, der is the eye relief, and fep is the focal length of the eyepiece. The chief rays from the centers of the elemental images enter the microlenses at various off-axis angles, are further refracted by the eyepiece, and converge to a point on the pupil plane. In the proposed method, the eye relief can be freely adjusted by changing pEI, as shown in Eq. (2). Therefore, the focal length of the eyepiece can be freely selected without affecting the eye relief. In particular, for medium-sized displays with a relatively large pixel pitch, eyepieces with long focal lengths must be used to suppress the reduction in spatial frequency. The proposed method, which can freely control the spatial frequency and FOV while keeping the eye relief fixed, is effective for achieving a wide FOV and high spatial frequency in light-field HMDs using medium-sized displays.
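As a numerical check of Eq. (2), the short sketch below (ours, not from the paper) evaluates the elemental-image pitch using the prototype parameters reported later in Section 3; the variable names are illustrative.

```python
# Minimal sketch, assuming the Section 3 prototype values: elemental-image
# pitch required for a chosen eye relief, Eq. (2).
p_a = 1.0      # lens-array pitch [mm]
d_a = 9.34     # display-to-lens-array distance [mm]
d_aep = 66.2   # lens-array-to-eyepiece distance [mm]
f_ep = 100.0   # eyepiece focal length [mm]
d_er = 37.0    # desired eye relief [mm]

p_EI = p_a * (1 + d_a * (f_ep - d_er)
              / (d_aep * f_ep - d_er * (d_aep - f_ep)))
print(f"p_EI = {p_EI:.3f} mm")   # ~1.075 mm, i.e., the 1.07 mm quoted in Section 3
```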

2.2 Intermediate image formation

Depending on the arrangement of the display and lens array, integral imaging technologies can be classified into depth-priority integral imaging (DPII) and resolution-priority integral imaging (RPII) [33,34]. In DPII, the display and lens array are separated by the focal length of the lens array, and the light emitted from the display travels as collimated light after passing through the microlenses. Although the resolution is limited by the number of microlenses, 3D images with large depths can be displayed. In RPII, the display and lens array are separated by a distance smaller or larger than the focal length of the lens array, and the light emitted from the display travels as divergent or convergent light. RPII can display 3D images at shallow depths and high resolutions. Previous studies on InI-based light-field HMDs have generally adopted RPII [27–29]. In this section, we outline the benefits of our proposed method by contrasting it with previous light-field HMD studies that used RPII.

Figure 2(a) shows a schematic of the path of the rays from the point source of the display, as proposed in a previous study [28]. The distance separating the display and lens array is set larger than the focal length of the lens array to ensure that the light rays passing through the microlenses are focused on a single point on the intermediate image plane. The relationship between the display, lens array, and intermediate image plane is expressed as follows:

$$\frac{1}{{{d_\textrm{a}}}} + \frac{1}{{{d_\textrm{i}}}} = \frac{1}{{{f_\textrm{a}}}},$$
where di is the distance between the lens array and the intermediate image plane and fa is the focal length of the lens array. The light rays that pass through the intermediate image plane and are refracted by the eyepiece reach the pupil plane as divergent light. Here, the divergent light is considered to have originated from a virtual image plane, which is the intermediate image plane magnified by the eyepiece. The relationship between the intermediate image plane, eyepiece, and virtual image plane is expressed by the following equation:
$$\frac{1}{{{d_{\textrm{ep}}}}} - \frac{1}{{{d_\textrm{v}}}} = \frac{1}{{{f_{\textrm{ep}}}}},$$
where dep is the distance between the intermediate image plane and the eyepiece and dv is the distance between the eyepiece and the virtual image plane. The display, intermediate image plane, and virtual image plane are optically conjugate. The intermediate image plane is formed in front of the lens array as a real image, and the eyepiece must be placed a distance dep from it. Therefore, the depth of the entire optical system, which is the distance from the display to the eyepiece, is expressed as da + daep = da + di + dep. In a previous study on light-field near-eye displays for AR, microdisplays were placed on the sides of the face, and the rays were directed to a free-form prism in front of the face. However, when designing an HMD for VR applications, the depth of the optical system must be as short as possible because the display and optical elements must be placed in front of the face.

Fig. 2. Schematic of the path of rays emitted from a point source on the display. An intermediate image is formed as a (a) real or (b) virtual image.

Figure 2(b) shows a schematic of the path of the rays emitted from the point source of the display in the proposed method. The distance separating the display and the lens array is smaller than the focal length of the lens array, and the light rays passing through the microlenses proceed as divergent light. Tracing the divergent light in the opposite direction reveals that the light is focused on a single point on the intermediate image plane (orange dotted line in Fig. 2(b)). The intermediate image plane is formed as a virtual image, and the relationship between the display, lens array, and intermediate image plane is expressed as follows:

$$\frac{1}{{{d_\textrm{a}}}} - \frac{1}{{{d_\textrm{i}}}} = \frac{1}{{{f_\textrm{a}}}}.$$

In the proposed method, the intermediate image plane is located behind the display, and the eyepiece is positioned relative to this plane. The relationship between the intermediate image plane, eyepiece, and virtual image plane is the same as that expressed in Eq. (4). Consequently, the eyepiece can be placed closer to the lens array, thereby significantly reducing the depth of the optical system. The distance from the display to the eyepiece is expressed as da + daep = da − di + dep. Compared with the conventional method, the proposed method enables the depth of the optical system to be reduced by 2di.
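The depth saving can be checked with a short sketch (ours), combining Eqs. (4) and (5) with the Section 3 prototype values; the variable names are illustrative.

```python
# Minimal sketch, assuming the Section 3 prototype values: virtual
# intermediate-image distance (Eq. (5)) and the resulting optical-system depth.
d_a, f_a = 9.34, 15.0        # display-to-lens-array distance, lens-array focal length [mm]
f_ep, d_v = 100.0, 1000.0    # eyepiece focal length, eyepiece-to-virtual-image distance [mm]

d_i = 1.0 / (1.0 / d_a - 1.0 / f_a)      # Eq. (5): ~24.8 mm, virtual image behind the display
d_ep = 1.0 / (1.0 / f_ep + 1.0 / d_v)    # Eq. (4): ~90.9 mm

depth_proposed = d_a - d_i + d_ep        # virtual intermediate image: ~75.5 mm
depth_conventional = d_a + d_i + d_ep    # real intermediate image: ~125 mm
print(depth_proposed, depth_conventional, 2 * d_i)   # depth saving of 2*d_i ~ 49.5 mm
```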

2.3 Device parameters

The FOV, spatial frequency, and eye box of a light-field HMD fabricated using the methods discussed in Sections 2.1 and 2.2 are described theoretically. The FOV is approximately determined by the angle α at which the chief ray from the center of the elemental image at the edge of the display is refracted by the eyepiece and enters the eye, as shown in Fig. 1(b). The distance h from the central axis of the eyepiece to the point where this chief ray intersects the eyepiece is expressed as follows:

$$h = \frac{{n - 1}}{2}\left[ {{p_\textrm{a}} - \frac{{{d_{\textrm{aep}}}}}{{{d_\textrm{a}}}}({{p_{\textrm{EI}}} - {p_\textrm{a}}} )} \right],$$
where n is the number of microlenses in one direction of the lens array and is expressed as n = wd / pEI, where wd is the display size. When the centers of the display and eyepiece are coaxial, the FOV is expressed as follows:
$$\alpha = 2\arctan \left( {\min \left[ {\frac{{{w_{\textrm{ep}}}}}{{2{d_{\textrm{er}}}}},\frac{h}{{{d_{\textrm{er}}}}}} \right]} \right),$$
where wep is the eyepiece diameter. When the center of the display and the eyepiece are not coaxial, the FOV must be calculated from their respective heights h (refer to Supplement 1).
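For the coaxial (horizontal) case, Eqs. (6) and (7) can be checked with the sketch below (ours), using the Section 3 prototype values; the result lands near the 70.8° reported there, with the small difference attributable to rounding of the inputs. The non-coaxial (vertical) case requires the per-side heights described in Supplement 1 and is not reproduced here.

```python
# Minimal sketch, assuming the Section 3 prototype values: horizontal FOV
# from Eqs. (6) and (7) for the coaxial case.
import math

p_a = 1.0        # lens-array pitch [mm]
p_EI = 1.0748    # elemental-image pitch from Eq. (2), unrounded [mm]
d_a = 9.34       # display-to-lens-array distance [mm]
d_aep = 66.2     # lens-array-to-eyepiece distance [mm]
d_er = 37.0      # eye relief [mm]
w_ep = 63.0      # eyepiece diameter [mm]
w_d = 120.96     # horizontal display width [mm]

n = w_d / p_EI                                          # microlenses across the display
h = (n - 1) / 2 * (p_a - d_aep / d_a * (p_EI - p_a))    # Eq. (6) [mm]
alpha = 2 * math.degrees(math.atan(min(w_ep / (2 * d_er), h / d_er)))   # Eq. (7)
print(f"h = {h:.1f} mm, horizontal FOV = {alpha:.1f} deg")   # ~70.6 deg
```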

The spatial frequency depends on the size of the pixels magnified by the lens array and the eyepiece. The size of the magnified pixel P is expressed by the lateral magnification of the microlens in the intermediate image plane (m = di / da) and that of the eyepiece in the virtual image plane (M = dv / dep), as follows:

$$P = mM{p_\textrm{d}},$$
where pd denotes the pixel pitch of the display. The spatial frequency η, defined as the number of cycles in one degree, is expressed using P as follows:
$$\eta = \frac{{{d_\textrm{v}} + {d_{\textrm{er}}}}}{{2P}}\frac{\pi }{{180}}.$$

The spatial frequency changes slightly when the eye relief is changed by the method described in Section 2.1. Equation (9) shows that the spatial frequency tends to increase slightly with increasing eye relief. The spatial frequency in the virtual image plane is explained based on the magnified pixels. In contrast, a light-field HMD forms a single point on a 3D image at various depths, including the virtual image plane, by intersecting multiple light rays emitted from different pixels. Each ray is the narrowest on the virtual image plane and widens as it moves away from the virtual image plane. Consequently, the size of a single point in a 3D image increases at depths away from the virtual image plane, which reduces the sharpness of the 3D image. A detailed discussion of this concept is provided in Section 4.3.
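The following sketch (ours) evaluates Eqs. (4), (8), and (9) with the Section 3 prototype values; it shows how the magnified pixel size translates into a spatial frequency close to the 9.9 cpd quoted there (the exact figure depends on how di and the pixel pitch are rounded).

```python
# Minimal sketch, assuming the Section 3 prototype values: magnified pixel
# size (Eq. (8)) and spatial frequency on the virtual image plane (Eq. (9)).
import math

d_a, d_i = 9.34, 24.8        # display-to-lens-array and lens-array-to-intermediate-image distances [mm]
f_ep, d_v = 100.0, 1000.0    # eyepiece focal length and eyepiece-to-virtual-image distance [mm]
d_er = 37.0                  # eye relief [mm]
p_d = 25.4 / 806             # display pixel pitch [mm], from 806 ppi

d_ep = 1.0 / (1.0 / f_ep + 1.0 / d_v)    # Eq. (4): intermediate image to eyepiece [mm]
m = d_i / d_a                            # microlens lateral magnification
M = d_v / d_ep                           # eyepiece lateral magnification
P = m * M * p_d                          # Eq. (8): magnified pixel size [mm]
eta = (d_v + d_er) / (2 * P) * math.pi / 180.0   # Eq. (9) [cycles per degree]
print(f"P = {P:.2f} mm, eta = {eta:.1f} cpd")    # ~9.8-9.9 cpd
```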

The eye box is the area on the pupil plane within which the 3D image appears as designed. The chief rays emitted from each elemental image and passing through the center of the corresponding microlens are designed to overlap on the pupil plane according to their positions within the elemental images (Fig. 3). Inside the eye box, the user can view a 3D image from a viewpoint corresponding to the position of the eye, whereas outside the eye box, light rays from neighboring elemental images pass through the microlenses and enter the eye, causing crosstalk. The eye box size wEB can be expressed as follows:

$${w_{\textrm{EB}}} = {p_{\textrm{EI}}}\frac{{{d_{\textrm{aep}}}}}{{{d_\textrm{a}}}}\left( {1 - \frac{{{d_{\textrm{er}}}}}{{{f_{\textrm{ep}}}}} + \frac{{{d_{\textrm{er}}}}}{{{d_{\textrm{aep}}}}}} \right).$$

Fig. 3. Schematic of the path of rays emitted from each elemental image to the pupil plane. The green, yellow, and red rays indicate rays emanating from the top, center, and bottom of each elemental image, respectively.

According to Eq. (2), when the eye relief is changed using the method described in Section 2.1, the elemental image pitch pEI increases. Therefore, according to Eqs. (2) and (10), the eye box tends to increase as the eye relief increases.
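As an example, Eq. (10) can be evaluated with the Section 3 prototype values (sketch ours; pEI taken from Eq. (2)):

```python
# Minimal sketch, assuming the Section 3 prototype values: eye-box width, Eq. (10).
p_EI = 1.07    # elemental-image pitch from Eq. (2) [mm]
d_a = 9.34     # display-to-lens-array distance [mm]
d_aep = 66.2   # lens-array-to-eyepiece distance [mm]
d_er = 37.0    # eye relief [mm]
f_ep = 100.0   # eyepiece focal length [mm]

w_EB = p_EI * (d_aep / d_a) * (1 - d_er / f_ep + d_er / d_aep)
print(f"eye box = {w_EB:.2f} mm")   # ~9.0 mm; with the unrounded p_EI this becomes ~9.06 mm
```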

In Section 2.2, we proposed a method to form the intermediate image as a virtual image. Here, we describe the differences in the device parameters when the intermediate image is formed as a real image. In this case, the distance from the lens array to the eyepiece is expressed as daep = di + dep. By substituting this equation into Eqs. (2), (7), (9), and (10), the FOV, spatial frequency, and eye box can be obtained when the intermediate image is formed as a real image. When the magnification m remains unchanged regardless of how the intermediate image is formed, forming the intermediate image as a real image reduces the FOV and eye box, while the spatial frequency does not change. In other words, the proposed method of forming the intermediate image as a virtual image has the advantage of increasing the FOV and eye box and reducing the device size. However, it has the disadvantage that tunable relay optics cannot be incorporated, as was done previously [28]. A comparison of device parameters with specific numerical values is provided in Supplement 1.

Here, we explain how to determine these device parameters. The eye box should be larger than the diameter of the human pupil (2–6 mm). Furthermore, because of eye movements and individual differences in interpupillary distance, a larger eye box is required when eye tracking is not used. Because the eye box is proportional to the pitch of the lens array, the desired eye box is determined first, and the lens pitch required to achieve it is calculated, as sketched below. Next, the spatial frequency and FOV are determined. The spatial frequency and FOV have a trade-off relationship, as in general binocular HMDs. Although the FOV can be increased by shortening the focal length of the eyepiece, the spatial frequency also decreases. However, using the proposed method, the FOV can be increased while maintaining the spatial frequency, provided that the display size is increased accordingly.
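Because the eye box of Eq. (10) is linear in the lens-array pitch (through Eq. (2)), a target eye box translates directly into a required pitch. Below is a minimal sketch of this step (ours); the 12 mm target is purely illustrative and not from the paper.

```python
# Minimal sketch: scale the lens-array pitch to reach a target eye-box width,
# using the linearity of Eqs. (2) and (10) in the pitch. Illustrative values only.
def required_lens_pitch(w_eb_target, w_eb_current, p_a_current):
    """Lens-array pitch needed to widen the eye box from w_eb_current to w_eb_target."""
    return p_a_current * w_eb_target / w_eb_current

print(required_lens_pitch(12.0, 9.0, 1.0))   # ~1.33 mm pitch for a 12 mm eye box
```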

3. Implementation

A monocular optical bench prototype was developed to confirm that a wide-FOV 3D image can be obtained using the proposed method with a medium-sized display (Fig. 4(a)). A 5.5-inch active-matrix liquid crystal display (Sharp LS055D1SX05(G)) was used to display the elemental images. The display has an active area of 120.96 × 86.04 mm, a resolution of 3840 × 2160 pixels (806 pixels per inch), and a refresh rate of 30 Hz. The lens array consisted of an orthogonal array of square microlenses with a pitch of 1.0 mm and a focal length of 15.0 mm. The eyepiece was a 63-mm-diameter Fresnel lens with a focal length of 100.0 mm. A Sony α7S III camera was placed in front of the eyepiece such that the principal plane of the camera lens coincided with the pupil plane. The distance between the display and the lens array was set to da = 9.34 mm, the distance between the lens array and the eyepiece to daep = 66.2 mm, the distance between the eyepiece and the virtual image plane to dv = 1000 mm, and the eye relief to der = 37 mm (Fig. 4(b)). As the lateral magnification of the lens array was set to 2.65, the distance between the lens array and the intermediate image plane was di = 9.34 × 2.65 = 24.8 mm. According to the equations in Section 2.3, the device size from the display to the eyepiece was 75.5 mm, the pitch of the elemental images was pEI = 1.07 mm, the monocular FOV was αH = 70.8° horizontally and αV = 43.0° vertically, the spatial frequency was η = 9.9 cycles per degree (cpd), and the eye box was wEB = 9.06 mm.

Fig. 4. (a) Perspective and (b) side views of the monocular optical bench prototype.

4. Experimental results

4.1 Optical simulation

To evaluate the performance of the device under ideal conditions without aberrations, an optical simulation environment for the light-field HMD was created using the non-sequential mode of ANSYS Zemax OpticStudio. Elemental images were used as a surface light source, and a lens array and eyepiece were arranged with the same performance as that of the monocular optical bench prototype described in Section 3. The lens array and eyepiece were ideal lenses without aberrations. Additionally, to simulate a virtual eye, we positioned an ideal lens mimicking a crystalline lens and a detector emulating the retina in front of the eyepiece. The crystalline lens was placed at an eye-relief distance from the eyepiece, and the detector was placed 24 mm away from the crystalline lens. The diameter of the crystalline lens was defined as 4.5 mm, and the spatial frequency of the detector was 30 cpd. Ray tracing from the display to the detector was performed to generate a retinal image that represented the intensity distribution of the detector. The focal position of the eye could be changed by adjusting the focal length of the crystalline lens.

4.2 Field-of-view

The FOV was evaluated by displaying a 3D image of a 2D plate with a graduated texture on the virtual image plane (1000 mm from the eyepiece) and measuring the area visible through the eyepiece. Figure 5(a) shows the graduated texture, where the numerical unit is meters. The 2D plate was sized according to the numerical values. Elemental images with a resolution of 3840 × 2160 pixels were generated at various eye reliefs by tracing the paths of light rays from the display pixels to the 3D object and obtaining the color information of the 3D object (see Supplement 1). The retinal images were generated using the optical simulations described in Section 4.1. Figure 5(c) shows the simulated retinal images for different eye-relief conditions. The retinal image of the 2D plate widened as the eye relief decreased, and its shape changed from round to square. This is because, as the eye relief increases, the visible range of the 3D image becomes limited by the diameter of the eyepiece, as shown in Eq. (7). The FOV was measured from the scale values in the simulated retinal images and compared with the theoretical FOV in Eq. (7). The measured FOV αm is expressed as follows:

$${\alpha _\textrm{m}} = 2\arctan \left[ {\frac{w}{{2({{d_\textrm{v}} + {d_{\textrm{er}}}} )}}} \right],$$
where w is the scale of the simulated retinal image. Figure 5(b) shows a plot of the theoretical and measured FOV for each eye relief. The solid lines indicate the theoretical FOV determined using Eq. (7), and the dots indicate the measured FOV estimated from the simulated retinal images. The results obtained from the optical simulation are close to the theoretical values. The conventional method of equating the pitch of an elemental image to that of the lens array requires the eye relief to be fixed at 100 mm, the focal length of the eyepiece. The visible area at der = 100 mm was 680 mm horizontally and vertically, and the measured FOV was calculated as 34.4° both horizontally and vertically, using Eq. (11). However, by changing only the eye relief using the same lens array and eyepiece with the proposed method, the visible area at der = 37 mm was 1440 mm horizontally and 820 mm vertically; the measured FOV was calculated as 69.5° horizontally and 43.1° vertically.
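For reference, the sketch below (ours) applies Eq. (11) to the visible extents quoted above and reproduces the 69.5° and 43.1° values of the proposed configuration (der = 37 mm) and the 34.4° of the conventional one (der = 100 mm).

```python
# Minimal sketch: measured FOV from a visible extent w on the virtual image plane, Eq. (11).
import math

def measured_fov(w, d_er, d_v=1000.0):
    """Full angle [deg] subtended by a visible extent w [mm] at eye relief d_er [mm]."""
    return 2 * math.degrees(math.atan(w / (2 * (d_v + d_er))))

print(measured_fov(1440, 37), measured_fov(820, 37))   # ~69.5 deg, ~43.1 deg (proposed)
print(measured_fov(680, 100))                          # ~34.4 deg (conventional)
```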

Fig. 5. (a) Graduated texture, (b) plots of simulated and measured FOVs at various eye reliefs, and (c) simulated retinal images at various eye reliefs.

Experiments using the monocular optical bench prototype were conducted with an eye relief of der = 37 mm. The elemental images used in the optical simulation were shown on the display (Fig. 6(a)), and the photographs captured using a camera lens with a focal length of 24 mm are shown in Fig. 6(b). The f-number was set to 5.6 so that the effective aperture was close to the pupil diameter. The effective aperture was calculated from the focal length of the camera lens and the f-number as 24 / 5.6 = 4.3 mm. The visible area in the photograph was approximately 1420 mm horizontally and 820 mm vertically. The measured FOV calculated using Eq. (11) was 68.8° horizontally and 43.1° vertically, which is close to the theoretical and simulated values. Regarding the visibility of the retinal image, the clarity of the tick marks and numbers decreased from the center toward the edges, and distortion and chromatic aberration occurred at the periphery. The image quality degraded at the periphery because light rays from the pixels at the edge of the display entered the microlenses at an angle. A lens array with uniform microlenses was used in this study; however, the quality of the 3D image at the periphery of the display could be improved by designing microlenses optimized for off-axis incidence [3]. The distortion and chromatic aberration are caused by the aberrations of the eyepiece and may be suppressed by correcting the elemental images (see Supplement 1) and by using achromatic lenses [35].

Fig. 6. Experimental results of FOV. (a) Elemental images shown on the display and (b) photograph captured by the camera with a monocular optical bench prototype.

4.3 Spatial frequency

Spatial frequencies were evaluated using a 3D image of the 1951 USAF resolution test chart, as shown in Fig. 7(a). The USAF chart was shown on the virtual image plane (1000 mm from the eyepiece) and sized such that the pattern inside the green frame corresponded to 1 cpd. Elemental images with a resolution of 1080 × 1080 pixels were generated by the same process as in the FOV evaluation (Fig. 7(b)), and the retinal images were generated by the optical simulation described in Section 4.1. The elemental images were also shown on the display of the optical bench prototype and captured with a camera equipped with a 50 mm focal length lens placed in front of the eyepiece. The f-number was set to 11 so that the effective aperture was close to the pupil diameter. The effective aperture was calculated from the focal length of the camera lens and the f-number as 50 / 11 = 4.5 mm. The retinal images obtained by the simulation and the experiment are shown in Figs. 7(c) and (d), respectively. The smallest pattern that could be shown without aliasing, indicated by the blue box, corresponded to 10.0 cpd for both the simulated and photographed results. This value is close to the theoretical value of 9.9 cpd calculated using Eq. (9). Block patterns and periodic vertical lines were observed in the photograph in Fig. 7(d). The block patterns are the pixel structure of the display, and the periodic vertical lines are moiré resulting from the periodicity of the enlarged pixels with a low horizontal aperture ratio and from the periodicity of the enlarged elemental images. These patterns can be reduced by inserting a diffuser plate between the display and the lens array.

Fig. 7. Retinal images of the USAF chart at the virtual image plane. (a) texture, (b) elemental images, (c) simulated retinal image, and (d) photograph.

Figure 8 shows the retinal images obtained by optical simulation using the 3D images of the USAF chart at various depths other than the virtual image plane. Because the size of the USAF chart was adjusted such that the visual angle was constant regardless of depth, the green frame was 1 cpd at all depths. While generating the retinal image, the focal length of the crystalline lens was set such that the virtual eye was focused at the depth of the 3D image. The clarity of the retinal image was significantly reduced when the image was shown in front of the virtual image plane compared to when it was shown at the back of the virtual image plane. Additionally, as the USAF chart moved further from the virtual image plane, its clarity decreased. To investigate this trend, we evaluated the modulation transfer function (MTF) at various depths. The display, microlens, and eyepiece were arranged with the same performance as the monocular optical bench prototype described in Section 3, using the sequential mode of ANSYS Zemax OpticStudio. The point spread function (PSF) was calculated by tracing the light rays emitted from a point source through the microlens, eyepiece, and crystalline lens to reach the retina. The MTF at various depths was calculated from the PSF at the different focal lengths of the crystalline lenses (Fig. 9). The horizontal axis represents the spatial frequency, and the vertical axis represents the distance between the eyepiece and the focus position. At depths away from the virtual image plane, the MTF exhibits a trend similar to that shown in Fig. 8, with a significant decrease at higher spatial frequencies and a gradual decrease at lower spatial frequencies. These results suggest that the decrease in spatial frequency occurs because the circle of confusion, responsible for forming the 3D image, enlarges as one moves away from the virtual image plane. Furthermore, the spatial frequency is significantly reduced on the front side of the virtual image plane because the distance between the focus position and eye is smaller. At the Nyquist frequency (9.9 cpd), the range in which the MTF exceeds 0.2 is approximately 720–2450 mm.

Fig. 8. Simulated retinal images of the USAF chart at various depths. (a) 600 mm, (b) 800 mm, (c) 1200 mm, and (d) 1500 mm from the eyepiece.

Fig. 9. Simulated MTF at various depth positions. The white dotted line is the depth of the virtual image plane.

4.4 Accommodation

Computer-generated 3D scenes at different depths were displayed on the light-field HMD to demonstrate retinal blur. Elemental images with a resolution of 1080 × 1080 pixels were generated by the same process as in the FOV evaluation, and retinal images were generated at different focal lengths of the crystalline lens by performing the optical simulation described in Section 4.1. The elemental images were also shown on the display of the optical bench prototype, and a camera equipped with a 50 mm focal length lens placed in front of the eyepiece was used to capture images at different focus positions. The f-number was set to 11, as in the spatial-frequency experiments. Figure 10 shows the elemental images of the 3D scene, the simulated retinal images, and the photographs of the optical bench prototype. The focal positions were 600 mm from the eyepiece (front focus) and 1800 mm from the eyepiece (rear focus). Characters and objects at the focus position appear sharp, whereas those at out-of-focus depths are blurred. The photographs appear almost identical to the simulated retinal images, indicating that the 3D scene is displayed at the designed depth.

Fig. 10. Computer-generated 3D scenes.

4.5 Eye box

To evaluate the size of the eye box, elemental images of the computer-generated 3D scene used in Section 4.4 were displayed on the optical bench prototype, and images were photographed at different positions. A camera lens with a focal length of 50 mm was used, and the f-number was set to 11. The effective aperture was 50 / 11 = 4.5 mm. The camera was placed in front of the eyepiece of the optical bench prototype, and its position was changed in the plane perpendicular to the central axis of the eyepiece while the eye relief was fixed at 37 mm. The camera was focused 1800 mm from the eyepiece. These photographs are shown in Fig. 11. When the camera was moved by more than ±2.25 mm from the central axis of the eyepiece in the x-direction, vertical stripes appeared in the 3D image (Fig. 11(a)). When the camera was moved in the y-direction, horizontal stripes appeared in the 3D images (Fig. 11(b)). This crosstalk occurred because rays emitted from the elemental images adjacent to the target elemental image entered the microlenses and reached the eye. As the effective aperture of the camera was 4.5 mm, the size of the eye box can be calculated as 2.25 × 2 + 4.5 mm = 9.0 mm. The theoretical value of the eye box, calculated using Eq. (10), is 9.06 mm, which is close to the experimental result. Within the eye box, no mismatch or discontinuity of the 3D image occurred when the camera position was moved, indicating that pupil-swim distortion has little effect.

Fig. 11. Photographs captured at different camera positions. (a) Horizontal movement, (b) vertical movement.

5. Discussion

Experimental results from the optical bench prototype confirmed that the proposed light-field HMD with a medium-sized display has a wide FOV and high spatial frequency of 9.9 cpd and that the device performance, including the FOV, spatial frequency, and eye box, is consistent with the theoretical values. Our proposed method enables adjusting the spatial frequency and FOV while keeping the eye relief fixed by controlling the elemental image pitch. For example, by changing the eyepiece to one with a focal length of 65 mm, the FOV can be increased horizontally to 95.9° and vertically to 63.9°, although the spatial frequency is decreased to 6.6 cpd, as per Eqs. (7) and (9).

By applying the two proposed methods, we developed a binocular head-mounted prototype that operates in real time in response to head movement, although its FOV is limited compared with that of the monocular optical bench prototype (Fig. 12). Detailed specifications and examples of operation are described in Supplement 1.

Fig. 12. (a) Binocular head-mounted prototype and (b) CG image of the head-mounted prototype.

Several challenges remain regarding the use of medium-sized displays. For example, the angle of incidence of light rays on the microlenses increases at the periphery of the display, resulting in 3D image degradation. In addition, distortion and chromatic aberration occur because of the eyepiece. Therefore, the effects of optical aberrations must be suppressed by configuring the peripheral part of the lens array with microlenses designed for off-axis incidence [3]. The eyepiece-induced distortion and chromatic aberration must also be suppressed by correcting the elemental images and using an achromatic lens [35]. The depth of the optical system also remains somewhat large for VR applications, even when the intermediate image is formed as a virtual image, because two types of lenses are placed coaxially in front of the face in addition to the display. This depth can be reduced by employing an optical path folding technique, which controls the polarization state using a quarter-wave plate and half-mirror lenses [4]. In the future, further improvements in pixel density and drive frequency for medium-sized displays, or further increases in microdisplay size, may lead to higher spatial frequencies and wider FOVs for light-field HMDs. The proposed methods will contribute significantly to the realization of light-field HMDs that enable comfortable VR viewing.

6. Conclusion

In this study, we proposed a light-field HMD with a medium-sized display for VR applications. We also presented methods of controlling eye relief arbitrarily by changing the elemental image pitch and reducing the depth of the device by forming an intermediate image as a virtual one. We theoretically determined the FOV, spatial frequency, and eye box resulting from the application of these methods. Experimental results using a monocular optical bench prototype with a 5.5-inch-diagonal 4 K display showed that horizontal and vertical FOVs of 68.8° and 43.1°, respectively, could be achieved, and that the FOV, spatial frequency, and eye box were close to the theoretical values. Although simple lens arrays and Fresnel lenses were used as optical elements, the display performance is expected to improve further through the use of lens arrays designed for off-axis ray incidence or pancake lenses employing optical path folding techniques.

Disclosures

The authors declare no conflicts of interest.

Data availability

The data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Supplemental document

See Supplement 1 for supporting content.

References

1. C. M. Kang and H. Lee, “Recent progress of organic light-emitting diode microdisplays for augmented reality/virtual reality applications,” J. Inf. Disp. 23(1), 19–32 (2022).

2. J. Xiong, E. L. Hsiang, Z. He, et al., “Augmented reality and virtual reality displays: emerging technologies and future perspectives,” Light: Sci. Appl. 10(1), 216 (2021).

3. J. Ratcliff, A. Supikov, S. Alfaro, et al., “ThinVR: Heterogeneous microlens arrays for compact, 180-degree FOV VR near-eye displays,” IEEE Trans. Visual. Comput. Graphics 26(5), 1981–1990 (2020).

4. K. Bang, Y. Jo, M. Chae, et al., “Lenslet VR: Thin, flat and wide-FOV virtual reality display using fresnel lens and lenslet array,” IEEE Trans. Visual. Comput. Graphics 27(5), 2545–2554 (2021).

5. A. Maimone and J. Wang, “Holographic optics for thin and lightweight virtual reality,” ACM Trans. Graph. 39(4), 67 (2020).

6. G.-A. Koulieris, B. Bui, M. S. Banks, et al., “Accommodation and comfort in head-mounted displays,” ACM Trans. Graph. 36(4), 1–11 (2017).

7. C. Chen, H. Deng, Q.-H. Wang, et al., “Measurement and analysis on the accommodation responses to real-mode, virtual-mode, and focused-mode integral imaging display,” J. Soc. Inf. Disp. 27(7), 427–433 (2019).

8. K. Akşit, W. Lopes, J. Kim, et al., “Near-eye varifocal augmented reality display using see-through screens,” ACM Trans. Graph. 36(6), 1–13 (2017).

9. K. Akeley, S. J. Watt, A. R. Girshick, et al., “A stereo display prototype with multiple focal distances,” ACM Trans. Graph. 23(3), 804–813 (2004).

10. J. P. Rolland, M. W. Krueger, and A. Goon, “Multifocal planes head-mounted displays,” Appl. Opt. 39(19), 3209–3215 (2000).

11. X. Duan, J. Liu, X. Shi, et al., “Full-color see-through near-eye holographic display with 80° field of view,” Opt. Express 28(21), 31316–31329 (2020).

12. J. Kim, M. Gopakumar, S. Choi, et al., “Holographic glasses for virtual reality,” in ACM SIGGRAPH 2022 Conference Proceedings (2022), 33.

13. A. Maimone, A. Georgiou, and J. S. Kollin, “Holographic near-eye displays for virtual and augmented reality,” ACM Trans. Graph. 36(4), 1–16 (2017).

14. C. Jang, K. Bang, G. Li, et al., “Holographic near-eye display with expanded eye-box,” ACM Trans. Graph. 37(6), 1–14 (2018).

15. W. Han, J. Han, Y.-G. Ju, et al., “Super multi-view near-eye display with a lightguide combiner,” Opt. Express 30(26), 46383–46403 (2022).

16. T. Ueno and Y. Takaki, “Approximated super multi-view head-mounted display to reduce visual fatigue,” Opt. Express 28(9), 14134–14150 (2020).

17. T. Ueno and Y. Takaki, “Super multi-view near-eye display to solve vergence–accommodation conflict,” Opt. Express 26(23), 30703–30715 (2018).

18. F.-C. Huang, K. Chen, G. Wetzstein, et al., “The light field stereoscope: immersive computer graphics via factored near-eye light field displays with focus cues,” ACM Trans. Graph. 34(4), 1–12 (2015).

19. C. Gao, Y. Peng, R. Wang, et al., “Foveated light-field display and real-time rendering for virtual reality,” Appl. Opt. 60(28), 8634–8643 (2021).

20. K. Akşit, J. Kautz, and D. Luebke, “Slim near-eye display using pinhole aperture arrays,” Appl. Opt. 54(11), 3422–3427 (2015).

21. H. Lee, U. Yang, and H.-J. Choi, “Analysis of the design parameters for a lightfield near-eye display based on a pinhole array,” Curr. Opt. Photon. 4(2), 121–126 (2020).

22. W. Song, Q. Cheng, P. Surman, et al., “Design of a light-field near-eye display using random pinholes,” Opt. Express 27(17), 23763–23774 (2019).

23. W. Song, Y. Wang, D. Cheng, et al., “Light field head-mounted display with correct focus cue using micro structure array,” Chin. Opt. Lett. 12(6), 060010 (2014).

24. C. Yao, D. Cheng, and Y. Wang, “Uniform luminance light field near eye display using pinhole arrays and gradual virtual aperture,” in 2016 International Conference on Virtual Reality and Visualization (ICVRV) (2016), pp. 401–406.

25. D. Lanman and D. Luebke, “Near-eye light field displays,” ACM Trans. Graph. 32(6), 1–10 (2013).

26. C. Yao, D. Cheng, T. Yang, et al., “Design of an optical see-through light-field near-eye display using a discrete lenslet array,” Opt. Express 26(14), 18292–18301 (2018).

27. D. Shin, C. Kim, G. Koo, et al., “Depth plane adaptive integral imaging system using a vari-focal liquid lens array for realizing augmented reality,” Opt. Express 28(4), 5602–5616 (2020).

28. H. Huang and H. Hua, “High-performance integral-imaging-based light field augmented reality display using freeform optics,” Opt. Express 26(13), 17578–17590 (2018).

29. P. Y. Chou, J. Y. Wu, S. H. Huang, et al., “Hybrid light field head-mounted display using time-multiplexed liquid crystal lens array for resolution enhancement,” Opt. Express 27(2), 1164–1177 (2019).

30. Z. Qin, Y. Zhang, and B. R. Yang, “Interaction between sampled rays’ defocusing and number on accommodative response in integral imaging near-eye light field displays,” Opt. Express 29(5), 7342–7360 (2021).

31. H. Huang and H. Hua, “Effects of ray position sampling on the visual responses of 3D light field displays,” Opt. Express 27(7), 9343–9360 (2019).

32. E.-L. Hsiang, Z. Yang, Q. Yang, et al., “AR/VR light engines: perspectives and challenges,” Adv. Opt. Photonics 14(4), 783–861 (2022).

33. F. Jin, J.-S. Jang, and B. Javidi, “Effects of device resolution on three-dimensional integral imaging,” Opt. Lett. 29(12), 1345–1347 (2004).

34. J.-S. Jang, F. Jin, and B. Javidi, “Three-dimensional integral imaging with large depth of focus by use of real and virtual image fields,” Opt. Lett. 28(16), 1421–1423 (2003).

35. Z. Luo, Y. Li, J. Semmen, et al., “Achromatic diffractive liquid-crystal optics for virtual reality displays,” Light: Sci. Appl. 12(1), 230 (2023).

Supplementary Material (1)

Supplement 1: Supporting content.
