What do we need for convincing VR experiences using an HMD? To answer that, we have to consider the properties of human vision and translate them into technical requirements for the display. For most of the following aspects you can keep in mind the simple rule: more is better.
An HMD first needs a wide FoV of at least 100° horizontally and vertically per eye; only then does the immersion begin to feel convincing. Current HMDs achieve this by placing a pair of aspheric biconvex lenses in front of the user's eyes. So this point can be carefully marked as 'checked'.
Next, temporal resolution, in other words the refresh rate. Current HMDs support 75 Hz to 96 Hz and combine this with special strobing techniques so that the eye is illuminated for less time than the retinal integration time, at least in the foveal region.
This technique largely reduces juddery motion, motion blur and motion sickness. However, if the eye is moving across the display, retinal integration effects should be considered in the future, as presented by our group, an effect we call perceptual motion blur. The ideal display would run at about 1 kHz. Still, the refresh rate is already quite good, so we carefully check this point as well.
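To make the strobing idea concrete, here is a minimal back-of-the-envelope sketch in Python. The 90 Hz refresh rate and the 18% duty cycle are illustrative assumptions, not specs of any particular headset:

```python
# Persistence arithmetic for low-persistence strobing (illustrative numbers).

def frame_period_ms(refresh_hz):
    """Time between frames in milliseconds."""
    return 1000.0 / refresh_hz

def persistence_ms(refresh_hz, duty_cycle):
    """How long each frame is actually lit when strobed at the given duty cycle."""
    return frame_period_ms(refresh_hz) * duty_cycle

period = frame_period_ms(90)    # ~11.1 ms between frames at 90 Hz
lit = persistence_ms(90, 0.18)  # ~2.0 ms of illumination per frame
```

With only ~2 ms of illumination per frame, each frame is lit for well under typical retinal integration times, which is what suppresses judder and motion blur.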
Next, spatial resolution, meaning the number of pixels on the screen.
Since the human eye has a very high resolution in the foveal region, HMDs need a display with a high pixel density. Current HMDs range between 1k and 2.5k per eye, but 16k would be required to hide the pixel grid completely, owing to the strong magnification of the lenses needed for the wide field of view. So in fact, better displays are required on this point. As an alternative, there is related work on exploiting perceptual properties of the eye to raise the perceived resolution above the physical display resolution. This might be useful for HMDs, but those approaches require real-time gaze tracking.
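The resolution gap can be expressed in pixels per degree (PPD). The sketch below assumes ~60 PPD as the foveal acuity target, which roughly corresponds to 20/20 vision; the 1080-pixel panel width and 100° FoV are illustrative values:

```python
# Angular-resolution sketch: pixels per degree versus an assumed ~60 PPD target.

def pixels_per_degree(pixels, fov_deg):
    """Average angular pixel density across the field of view."""
    return pixels / fov_deg

def pixels_needed(fov_deg, target_ppd=60):
    """Horizontal pixels needed to reach the target angular density."""
    return fov_deg * target_ppd

current = pixels_per_degree(1080, 100)  # ~11 PPD for a 1080-px panel over 100 deg
needed = pixels_needed(100)             # 6000 px for ~60 PPD over the same FoV
```

The gap of roughly 5x in each dimension is why such extreme panel resolutions come up once the image is magnified across a wide field of view.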
Instant visual feedback. This is a crucial requirement and at the same time a very tough one for the renderer. Rendering the output images has to happen within a few milliseconds, the fewer the better. The result still has to be photorealistic if the goal is to recreate reality. This requires cutting-edge GPU power and a highly optimized rendering engine, as well as appropriate footage to be rendered.
Photorealistic scenes have to be rendered in real time and have to reach the user's eyes nearly instantly. Depending on the application, the tolerable latency ranges from 20 to 40 ms, the less the better. Otherwise you cannot trick the human visual system and the sense of orientation, and you will suffer from motion sickness. Therefore, rendering and head tracking have to be as fast as possible. For complex scenes this requires powerful machines and a lot of footage optimization to stay within the latency limits. I think at this point, even with new GPUs, rendering for the HMD is not yet solved. One promising way to reach the latency limits is foveated rendering, which reduces the resolution in less important regions of the viewing area, but again this requires active eye tracking.
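The core of foveated rendering can be sketched as a shading-resolution falloff with angular distance from the gaze point. The 5° foveal region, the linear falloff rate and the 10% floor below are hypothetical parameters for illustration, not values from any shipping system:

```python
# Foveated-rendering sketch: full shading resolution inside an assumed foveal
# region, then a linear falloff, clamped to a floor in the far periphery.

def shading_scale(eccentricity_deg, fovea_deg=5.0, falloff_per_deg=0.05, floor=0.1):
    """Relative shading resolution as a function of angular distance from the gaze point."""
    if eccentricity_deg <= fovea_deg:
        return 1.0
    return max(floor, 1.0 - falloff_per_deg * (eccentricity_deg - fovea_deg))

# Full resolution at the gaze point, half resolution 15 degrees away,
# clamped to the floor far in the periphery.
scales = [shading_scale(e) for e in (0.0, 15.0, 40.0)]
```

Because peripheral pixels are shaded at a fraction of full resolution, most of the frame's shading work can be saved, but the gaze point has to be known in real time, which is exactly why eye tracking is the prerequisite.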
Rendering from the correct viewpoint requires precise tracking of the head within milliseconds, with six degrees of freedom and with millimeter accuracy. Therefore, current systems include external cameras and additional active tracking technology. However, the viewpoint depends on the physiology of the human head, which is user-specific and therefore requires user calibration. Eye tracking significantly simplifies this calibration procedure.
Depth perception in the scene is another important requirement. It is currently achieved by showing the virtual environment from two different viewpoints, one for each eye. However, the virtual viewpoint depends on the location of the eye relative to the screen, which requires user calibration. This calibration task can be tedious, but with eye-tracking it could be largely simplified.
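The stereo-viewpoint setup can be sketched as offsetting the two virtual cameras by half the interpupillary distance (IPD) along the head's right axis. The 63 mm default below is a commonly cited average; it is exactly the user-specific value the calibration step has to refine:

```python
# Stereo-viewpoint sketch: two virtual cameras offset by half the IPD.

def eye_positions(head_pos, right_axis, ipd_m=0.063):
    """Return (left, right) camera positions for stereo rendering."""
    half = ipd_m / 2.0
    left = tuple(p - half * r for p, r in zip(head_pos, right_axis))
    right = tuple(p + half * r for p, r in zip(head_pos, right_axis))
    return left, right

# Head at 1.7 m height, looking down -z with +x as the right axis.
left, right = eye_positions((0.0, 1.7, 0.0), (1.0, 0.0, 0.0))
```

A few millimeters of error in this offset already distorts perceived depth, which is why per-user calibration matters and why measuring the eye positions directly via eye tracking is attractive.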
Natural accommodation is another property of the human visual system that is hard to reproduce. The term describes the natural adaptation of the eye's lens so that an object comes into focus, just as a camera does. Most HMDs do not support this and set the focus to infinity, which does not feel or look natural for many scenes. However, there is promising work on this by Narain and Huang at this year's SIGGRAPH using light field techniques. A simulation of this effect, however, would also be possible with gaze tracking.
Light adaptation for high-dynamic-range imaging is another effect that is more or less unsupported. In reality, the eye adapts to daylight as well as to dark environments. Many render engines use global tone mappers to recreate some kind of light adaptation effect, but those provide only a coarse approximation. Gaze tracking would enable spatial light adaptation, which is much more natural.
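As a concrete example of such a global tone mapper, here is a minimal sketch of the Reinhard operator. Using the mean frame luminance as the adaptation level is a simplifying assumption; with gaze tracking, the adaptation level could instead be estimated from the foveal region:

```python
# Global tone-mapping sketch (Reinhard operator) driven by an adaptation level.

def reinhard(lum, l_adapt):
    """Map scene luminance to [0, 1) relative to the adaptation luminance."""
    scaled = lum / l_adapt
    return scaled / (1.0 + scaled)

def tone_map(luminances):
    """Tone-map a frame using its mean luminance as the global adaptation estimate."""
    l_adapt = sum(luminances) / len(luminances)
    return [reinhard(l, l_adapt) for l in luminances]
```

A gaze-aware variant would only change where `l_adapt` comes from: instead of averaging over the whole frame, it would average over the pixels the fovea currently covers.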
As a last point on my list of requirements for immersion, I mention natural-looking actors. There is amazing work on precisely capturing faces and realistically rendering avatars (e.g. Debevec et al. at USC). But those captured actors also have to behave naturally as virtual avatars, and for that eye movements are absolutely necessary. Again, real-time gaze tracking would be helpful in this case.
In summary, the perceived display quality as well as the naturalness of the VR experience could be largely improved with knowledge of the user's gaze direction. A suitable real-time gaze-tracking solution is therefore the critical missing component for HMDs.
The thing is, tracking the eyes within an HMD is difficult since space is very limited. Currently available solutions use highly miniaturized hardware and are therefore expensive, as you can see here. Others cut holes into the lenses and thereby reduce the available field of view. The FOVE is an interesting new HMD with eye tracking, announced for next year, but not much is known about the technology it uses.
Here you can see our working prototype, which cost less than 400 dollars. We planned and assembled the HMD completely from scratch and implemented optimized algorithms for pupil detection and calibration. Importantly, we will publish the 3D design sketches so that everybody can 3D-print their own HMD.
Introduction << >> Gaze-Tracking Concept