CAU - Mixed Reality System for 3D TV Applications
Mixed Reality combines real content with added virtual content. For 3D-TV applications, a consistent depth map is essential for 3D perception. This work concentrates on generating mixed intensity and depth images of real and virtual content. It handles mutual occlusion, mixing, content placement and shadowing of virtual content.


The ToF- and CCD-camera mounted on the Pan-Tilt-Unit / Input image of target camera
The steps of the processing chain:
- Environment model generation with Time-of-Flight camera
- Camera pose determination
- Shadow computation
- Depth-keying and mixing
- Tracking of dynamic foreground objects and content alignment
Environment model generation
To generate the background model, a ToF- and a CCD-camera are used. A sweep with the Pan-Tilt-Unit captures depth and intensity information. From this, a full 3D-model covering 270x180 degrees is generated.

Background model: outer and inner view
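As a rough illustration of how depth images from a sweep could be merged into one model, the sketch below back-projects a depth image with a pinhole model and rotates the points by the pan and tilt angles of the unit. The intrinsics (fx, fy, cx, cy), the axis conventions, the function names and the assumption that the camera sits at the rotation center and delivers z-depth are all hypothetical and not taken from the system described here.

```python
import numpy as np

def rotation_pan_tilt(pan_deg, tilt_deg):
    """Rotation for a pan about the y-axis followed by a tilt about the x-axis (assumed convention)."""
    pan, tilt = np.radians([pan_deg, tilt_deg])
    r_pan = np.array([[np.cos(pan), 0.0, np.sin(pan)],
                      [0.0, 1.0, 0.0],
                      [-np.sin(pan), 0.0, np.cos(pan)]])
    r_tilt = np.array([[1.0, 0.0, 0.0],
                       [0.0, np.cos(tilt), -np.sin(tilt)],
                       [0.0, np.sin(tilt), np.cos(tilt)]])
    return r_pan @ r_tilt

def depth_to_model_points(depth, fx, fy, cx, cy, pan_deg, tilt_deg):
    """Back-project a z-depth image and rotate the points into the common model frame."""
    v, u = np.indices(depth.shape)          # pixel rows (v) and columns (u)
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    # Rotate every point by the pan-tilt orientation at capture time.
    return points @ rotation_pan_tilt(pan_deg, tilt_deg).T
```

Calling depth_to_model_points once per pan-tilt position and concatenating the results would give a point set covering the swept range; a real system would additionally need the calibrated offset between camera and rotation axes.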
Camera pose determination
The camera is allowed to move, so its movement has to be estimated. This is done using analysis-by-synthesis tracking between the background model and the live image. Point correspondences are detected and tracked over the sequence.
Shadow Computation
Shadowing is essential for a realistic appearance, so shadow mapping is applied. Using the background model, shadows can be computed between virtual objects and the real environment.


The shadow map (note the shadows on the background model) / Shadow map applied to real image and virtual content
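The core of shadow mapping is a depth comparison from the light's point of view. The sketch below shows that test on the CPU for an orthographic light, assuming points are already given in light coordinates with x and y normalized to [0, 1) and z as the distance from the light. The function names, the bias value and the orthographic light are illustrative assumptions; the system itself may use a different light model and runs on the GPU.

```python
import numpy as np

def build_shadow_map(points_light, size):
    """Store the nearest depth per texel as seen from the light."""
    shadow_map = np.full((size, size), np.inf)
    cols = (points_light[:, 0] * size).astype(int)
    rows = (points_light[:, 1] * size).astype(int)
    # Keep the minimum depth of all points falling into the same texel.
    np.minimum.at(shadow_map, (rows, cols), points_light[:, 2])
    return shadow_map

def in_shadow(points_light, shadow_map, bias=1e-3):
    """A point is shadowed if something closer to the light covers its texel."""
    size = shadow_map.shape[0]
    cols = (points_light[:, 0] * size).astype(int)
    rows = (points_light[:, 1] * size).astype(int)
    # The small bias avoids points shadowing themselves.
    return points_light[:, 2] > shadow_map[rows, cols] + bias
```

If the point set contains both virtual object points and background model points, a background point lying behind a virtual point in the same texel is reported as shadowed, which corresponds to a virtual object casting a shadow onto the real environment.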
Depth-keying and mixing
Depth-keying is done on the GPU using shaders. The depth maps are generated and mixed/segmented on the GPU. Three depth images are used: the current ToF-depth image, a rendered image of the background model and a rendered image of the virtual content. Keying of foreground and background is based solely on depth values, so no chroma keying facilities are needed.


Mixed depth images with virtual content (Statue of Liberty and car in front)
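The per-pixel logic of depth-keying can be sketched on the CPU with NumPy; the actual system does this in GPU shaders. The function name, the label values and the tolerance parameter below are assumptions made for illustration, and pixels without virtual content are assumed to carry infinite depth.

```python
import numpy as np

def depth_key(tof_depth, bg_depth, virt_depth, tolerance=0.05):
    """Mix three depth maps and label each pixel.

    Labels: 0 = real background, 1 = dynamic real foreground, 2 = virtual content.
    """
    # A real pixel is foreground when the live ToF depth is clearly
    # closer than the rendered background model.
    is_foreground = tof_depth < bg_depth - tolerance
    labels = np.where(is_foreground, 1, 0)
    # Virtual content is visible wherever it is closer than the real scene,
    # which also resolves mutual occlusion with foreground objects.
    virt_visible = virt_depth < tof_depth
    labels = np.where(virt_visible, 2, labels)
    mixed_depth = np.minimum(tof_depth, virt_depth)
    return mixed_depth, labels
```

The label image can then select, per pixel, between the live camera intensity and the rendered virtual intensity, so the mixed intensity and depth images stay consistent with each other.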
Tracking of dynamic foreground objects and content alignment
While the 3 depth images are mixed on the GPU, dynamic foreground objects are simultaneously keyed and segmented.

The center-of-mass of the segmented foreground is determined, and from it the 3D-position of the person can be computed and projected onto the ground floor.
This helps in interactive content placement.

Person's trajectory on floor and placed 3D-models
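One possible way to obtain the floor position from a segmented foreground mask is sketched below: the masked depth pixels are back-projected with a pinhole model, averaged to a 3D center-of-mass, and projected orthogonally onto a floor plane. The intrinsics (fx, fy, cx, cy), the plane parameters and the function name are hypothetical; the plane is assumed to be given as n·p + d = 0 with unit normal n in the same coordinate frame as the points.

```python
import numpy as np

def foreground_floor_position(depth, mask, fx, fy, cx, cy, floor_normal, floor_offset):
    """Return the 3D center-of-mass of the masked pixels and its projection onto the floor."""
    v, u = np.nonzero(mask)                 # rows (v) and columns (u) of foreground pixels
    if u.size == 0:
        return None                         # nothing segmented in this frame
    z = depth[v, u]
    points = np.stack([(u - cx) * z / fx, (v - cy) * z / fy, z], axis=1)
    center = points.mean(axis=0)
    n = np.asarray(floor_normal, dtype=float)
    n = n / np.linalg.norm(n)
    # Signed distance of the center to the plane n.p + d = 0.
    signed_distance = n @ center + floor_offset
    floor_point = center - signed_distance * n
    return center, floor_point
```

Collecting floor_point over successive frames would give a trajectory on the floor, which could then serve as an anchor for interactive content placement.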
