Marker-based Positional Tracking

Project Motivation

In this project I wanted to explore the usability of an inside-out tracking system for natural movement in virtual space. I decided on wearable cameras mounted to the HMD that track the environment, because this results in low hardware complexity and low cost. The setup is not new; it is known as Inside-Out Tracking in the Virtual Reality literature. However, today's high (wearable) computing power as well as fast and cheap cameras enable better tracking quality than ever before.

There are alternative approaches for tracking in large-scale environments. One option is a set of calibrated cameras (Outside-In Tracking) that track markers fixed to the user's clothes. This is the traditional setup for professional motion capture in movies and games. However, a large number of expensive cameras (high spatial and temporal resolution) as well as high-performance computers are required to achieve the tracking quality Virtual Reality demands. A look at the price lists of Vicon, OptiTrack and A.R.T. makes you think about alternatives.

A set of Microsoft Kinect cameras costs less than 1k USD, which is feasible for lots of applications. However, the tracking quality depends largely on the distance between the user and the Kinect camera. If multiple Kinects are used, they interfere with each other's projected patterns, which is definitely a problem. There are workarounds (vibration-based approaches), but the main problems remain: a low frame rate (30 Hz) and the high number of cameras required to create a sufficiently fine tracking grid at every point in space.

A very promising technology is the Lighthouse tracking system from Valve. This laser-based 3D spatial tracking system achieves millimeter precision at very low latency (a few milliseconds). However, not much was known about the technology, it was not available when I started this project, and it again requires significant hardware effort that is not easy to rebuild.

So I started tracking tests with multiple binary markers, which you can see on the walls in the topmost picture. These binary markers provide an affordable yet effective tracking approach.

Preliminary Work

This photo shows an early version of the marker setup with a sparse set of tracking markers. The setup suffered from tracking ‘leaks’ and non-uniform tracking precision, since the precision depends on the size of the marker in camera coordinates. Therefore, in the later version I decided to use more markers at different scales to increase tracking robustness and to achieve more uniform tracking accuracy in space.

The basic ‘mobile’ setup.
The system allows navigation and interaction in VR.

The next videos show different tests with marker tracking algorithms and markerless tracking algorithms provided by the MetaioSDK (which is unfortunately no longer supported).

You can see one of my colleagues testing the Outside-In Tracking approach with markers fixed to a head-mounted display. The tracking data is used in a game engine (Unreal Engine in this case) to control the user's camera. The user therefore becomes immersed, since the virtual camera moves and rotates in the same way as the user's head.

The next video shows tests with a markerless tracking algorithm provided by the MetaioSDK. The algorithm tracks 2D features (yellow dots), which can be projected into 3D space as the camera moves over time. From these 3D features the camera pose is estimated. As you can see, the tracking works as long as enough known features are visible. However, the tested markerless tracking algorithm is not very robust against occlusions and lighting changes.

The resulting tracking quality is sufficient for many Augmented Reality applications. For VR, however, jerky camera motions are very disturbing. Nevertheless, the results were promising in terms of tracking speed and latency, but the accuracy is limited. Filtering the data improves robustness, but also introduces latency, which in turn increases the danger of motion sickness. I therefore added a self-made Arduino-based orientation tracker to replace the rotational tracking component. Thanks to the high temporal resolution and high accuracy of the orientation tracker, motion sickness is largely reduced. I averaged the positional component over several frames, which still keeps the positional tracking latency at a tolerable level.
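The idea of smoothing only the positional component while leaving the orientation to the IMU can be sketched roughly as follows. This is a minimal illustration, not the actual implementation; the `PositionSmoother` class and the window size of four frames are my own assumptions here:

```python
from collections import deque

class PositionSmoother:
    """Average the positional tracking component over a small window of
    frames to suppress jitter; orientation is taken directly from the
    high-rate orientation tracker, unfiltered."""
    def __init__(self, window=4):
        self.samples = deque(maxlen=window)  # keeps only the last `window` samples

    def update(self, pos):
        """Add a new (x, y, z) sample and return the windowed mean."""
        self.samples.append(pos)
        n = len(self.samples)
        return tuple(sum(s[i] for s in self.samples) / n for i in range(3))

smoother = PositionSmoother(window=4)
for p in [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0), (0.3, 0.0, 0.0)]:
    filtered = smoother.update(p)
# mean of the last four x values: (0.0 + 0.1 + 0.2 + 0.3) / 4 = 0.15
```

The larger the window, the smoother the position but the higher the added latency, which is exactly the trade-off described above.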

After these promising tests I decided to use head-mounted cameras in combination with marker-based tracking in a student project. The goal was to create a wearable system enabling indoor navigation in a constrained but walkable environment.

Marker-based Indoor Navigation

The setup is designed for free navigation in VR. The user carries a vest with the required attached hardware. A laptop in the backpack performs tracking and rendering. Interaction controllers allow triggering actions in the VR application. The rendered scene is presented using a head-mounted display which also delivers orientation tracking. The positional tracking is performed by two head-mounted cameras tracking the markers in the environment.

The marker-based setup with body tracking support.
The ‘mobile’ setup.

The development of the wearable setup as well as the required algorithms had to be finished within one semester. The results are far from perfect, but not bad either, and easy to extend to a larger room (just add more markers!).

We arranged a dense marker setup in a room of 4×5×3 meters, including markers of different sizes so that tracking stays robust regardless of the user's distance to the wall. You can see the marker setup in the title image (use a 360° photo viewer if you want to explore the room!).

Binary Tracking Marker

Tracking a binary marker, i.e. computing its position, scale and rotation in 3D, can be implemented very efficiently and robustly against lighting changes. Since the configuration of black squares is unique for every marker and the marker size is known, a single marker is sufficient to compute the location of the marker as well as the camera in space. This is the traditional way of tracking in many Augmented Reality applications.

To track a marker, you basically have to extract the corner points of the markers, usually with a fast corner detection step. The corners are then grouped into sets belonging to individual markers and reprojected to obtain a frontal view of the marker. From the configuration of black squares in this frontal view, the ID of the marker can be determined. Each ID is linked to a calibrated marker position, scale and orientation in space, which in turn allows computing the camera pose, and thus the head pose of the user, in our setup.
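The ID-decoding step on the rectified frontal view can be illustrated with a small sketch. The 3×3 payload grid and the row-major encoding are hypothetical choices for illustration; real marker dictionaries (e.g. ArUco) additionally use parity bits and test all four rotations of the grid:

```python
def decode_marker_id(bits):
    """Read a binary marker payload row-major from the rectified
    (frontal) marker view: a black cell contributes a 1 bit, a white
    cell a 0 bit. Returns the integer marker ID."""
    marker_id = 0
    for row in bits:
        for cell in row:
            marker_id = (marker_id << 1) | int(cell)
    return marker_id

# 3x3 inner grid of a hypothetical marker (1 = black square)
grid = [
    [1, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
]
# bit string 101010001 in binary = 337
```

The returned ID is then looked up in the calibration data to retrieve the marker's position, scale and orientation in the room.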

For tracking we used two PlayStation Eye cameras. These cameras are very cheap and still provide 640×480 at 75 Hz with about 11 milliseconds of latency. This is far better than most webcams, which cost more.

PlayStation Eye Camera

We created a 3D-printed mount for the Oculus DK1 and aligned one camera to the front and one to the sky at a 90-degree angle, so that we can track multiple markers at the same time, which is not too costly even with a large number of binary markers. This increases tracking robustness and also allows tracking when the user looks at the floor, where there are no markers.

Rift DK1 with dual camera mount

We also wanted to include some form of interaction in the virtual space, so we added the Razer Hydra to the setup. This interaction device provides two hand controllers tracked in six degrees of freedom (rotation + position) relative to a controller base attached to the user's vest.

Razer Hydra

The achievable tracking resolution depends strongly on the accuracy of the marker position estimation. If multiple markers are tracked but the calibrated location of each marker is slightly wrong, the averaged camera position derived from those markers will also be wrong. We therefore took high-resolution images with a Canon 5D and a 50 mm lens, which has little lens distortion, and developed an algorithm to estimate the marker positions automatically from those input images. The resulting marker positions are then used by our real-time tracking algorithm. The calibrated marker set in 3D is shown in the next video.
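How a single miscalibrated marker biases the averaged camera position can be seen in a small sketch. The simple arithmetic mean and the numbers are illustrative assumptions, not our actual fusion code:

```python
def fuse_camera_position(estimates):
    """Average the camera position derived independently from each
    visible marker. Any error in a marker's calibrated pose propagates
    directly into the fused result, which is why accurate offline
    marker calibration matters."""
    n = len(estimates)
    return tuple(sum(e[i] for e in estimates) / n for i in range(3))

# three markers agree; one was calibrated 4 cm off along x
estimates = [(1.00, 2.0, 0.5), (1.00, 2.0, 0.5),
             (1.00, 2.0, 0.5), (1.04, 2.0, 0.5)]
fused = fuse_camera_position(estimates)
# fused x is 1.01 m: the single bad marker shifts the result by 1 cm
```

With more simultaneously visible markers the influence of a single calibration error shrinks, which is another argument for the dense marker setup described above.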

The final result of this project can be seen in the following video from the ‘Tag der jungen Softwareentwickler’ (day of young software developers):

