MoMuSys (MObile MUltimedia SYStems) 

 

 

The Video Object Generation Tool with User Environment

VOGUE is an interactive tool for the creation of video objects suitable for MPEG-4 encoding.
 
Example of video object produced using VOGUE.
Top: some frames of the original image sequence;
Middle: masks created with VOGUE;
Bottom: final video object
 
 

Contents

  General description
  Technical description
  Demo-tutorial
  Publications
  Partners involved

General description

The Video Object Generation Tool with User Environment was developed in the framework of the ACTS098 MoMuSys project.

The recent MPEG-4 standard supports content-based functionalities. This means that the different video objects in a video sequence (e.g. the presenter of a news program without the background) can, for instance, be individually coded and manipulated. However, the standard intentionally leaves open the issue of object definition, i.e. the segmentation of a video sequence into different video objects, and this has created the need for efficient segmentation algorithms.

Automatic segmentation is an ill-posed problem: for the same scene, the object of interest may differ depending on the user or the application. Automatic segmentation is therefore a problem without a general solution, at least at the current state of the art. User-assisted segmentation offers an attractive alternative by letting the user introduce semantic knowledge while keeping an important part of the process automatic.

VOGUE is an integrated framework for user-assisted segmentation of video sequences. It combines several different algorithms under a common graphical user interface. A fully interactive static segmentation algorithm lets the user quickly create an initial mask defining the object of interest. A tracking algorithm then follows the object throughout the video sequence. An algorithm for the detection of moving objects (temporal segmentation) is also included, which is especially suited to cases where the objects of interest are the moving ones. User interaction is possible during the whole process: the tracking or the temporal segmentation can be stopped at any time to make corrections using the tools provided by the static segmentation module.

Figure 1 shows a picture of the graphical user interface. The left-hand window displays the original video sequence. The right-hand working window displays the state of the current segmentation in the form of colour labels. In the left-hand part of the working window, the colour labels are superimposed on the luminance of the original image for better evaluation of contour quality.

Figure 1. VOGUE Graphical User Interface.


Technical description

Spatial segmentation

The spatial segmentation algorithm lets the user define the segmentation mask of the object(s) of interest. It is based on a multi-scale segmentation scheme [5,6]. A family of nested partitions is constructed (Fig. 2). The coarsest level considers the image as a whole (a single region), and finer partitions are always included in coarser ones; that is, a finer level is obtained by re-segmenting regions of the previous level. The user is then given tools to navigate between the different resolution levels of the family and create the desired partition (Fig. 3).
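The navigation over nested partitions can be sketched as follows. This is a toy illustration, not VOGUE's actual implementation: each level is represented as a label map, and "going one level down" replaces a coarse region by its finer sub-regions.

```python
import numpy as np

# Toy family of nested partitions, coarse to fine: each finer level
# re-segments regions of the previous one (labels chosen arbitrarily).
coarse = np.array([[1, 1, 1, 1],
                   [1, 1, 1, 1]])          # coarsest level: one region
fine = np.array([[2, 2, 3, 3],
                 [2, 2, 3, 3]])            # region 1 split into 2 and 3

def refine_region(current, finer, label):
    """Replace one region of the current partition by its sub-regions
    from the next finer level, as when navigating down one level."""
    out = current.copy()
    mask = current == label
    out[mask] = finer[mask]
    return out

result = refine_region(coarse, fine, label=1)
```

Because the partitions are nested, the result of every such step is itself a valid partition, so the user can mix resolution levels freely while building the final mask.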
 
As an alternative, the user can select the regions of the partition by roughly drawing a marker for each object of interest and a marker for the background. The spatial segmentation algorithm then automatically finds the actual contours of the selected objects (Fig. 4).
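The marker-based idea can be sketched with a minimal marker-driven flooding in the spirit of the morphological watershed (the actual VOGUE algorithm and its parameters are not reproduced here): each user-drawn marker floods outward over an edge map, lowest values first, so the labels meet on the high-contrast ridge between markers.

```python
import heapq
import numpy as np

def marker_flood(elevation, markers):
    """Minimal marker-driven flooding: pixels are claimed in order of
    increasing 'elevation' (e.g. gradient magnitude), starting from the
    user-drawn markers, so labels settle on the ridge between them."""
    labels = markers.copy()
    h, w = elevation.shape
    heap = [(int(elevation[y, x]), y, x, int(markers[y, x]))
            for y in range(h) for x in range(w) if markers[y, x] != 0]
    heapq.heapify(heap)
    while heap:
        _, y, x, lab = heapq.heappop(heap)
        if labels[y, x] == 0:
            labels[y, x] = lab
        lab = int(labels[y, x])
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and labels[ny, nx] == 0:
                heapq.heappush(heap, (int(elevation[ny, nx]), ny, nx, lab))
    return labels

# Toy edge map: a high-contrast ridge (column 2) separates two flat areas.
elev = np.array([[0, 0, 5, 0, 0],
                 [0, 0, 5, 0, 0],
                 [0, 0, 5, 0, 0]])
mk = np.zeros_like(elev)
mk[1, 0] = 1   # rough marker for the object
mk[1, 4] = 2   # rough marker for the background
seg = marker_flood(elev, mk)
```

The two rough markers are enough to label every pixel, with the region boundary placed on the ridge; this is why the user only has to scribble inside each object rather than trace its contour.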
 
 

Tracking

Once the segmentation mask is available for an initial image, it can be automatically extended to the following images of the sequence. For this purpose a tracking algorithm has been implemented. It is based on a partition projection that allows the introduction of new regions, followed by a decision on which regions belong to the new mask. The result of the automatic tracking is displayed to the user, who can stop the execution and ask for refinements of the object mask. The user's corrections are then used by the automatic algorithm to improve its subsequent performance. Figure 5 shows video objects obtained with the tracking algorithm.
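The projection/decision structure can be sketched as follows. This is a simplified illustration with a single global motion vector and a given current-frame partition; the actual algorithm works region by region, as described in the tracking publications listed below.

```python
import numpy as np

def project_mask(prev_mask, motion):
    """Predict the object's position in the current frame by shifting
    the previous mask by an (assumed, pre-estimated) displacement."""
    dy, dx = motion
    h, w = prev_mask.shape
    out = np.zeros_like(prev_mask)
    ys, xs = np.nonzero(prev_mask)
    ys, xs = ys + dy, xs + dx
    keep = (ys >= 0) & (ys < h) & (xs >= 0) & (xs < w)
    out[ys[keep], xs[keep]] = 1
    return out

def decide_mask(partition, projected, threshold=0.5):
    """Decision step: a region of the current-frame partition joins the
    new mask when most of its area is covered by the projected mask."""
    mask = np.zeros_like(projected)
    for label in np.unique(partition):
        region = partition == label
        if projected[region].mean() > threshold:
            mask[region] = 1
    return mask

# Toy frames: the object moves one pixel to the right between frames.
prev_mask = np.zeros((4, 6), dtype=int)
prev_mask[1:3, 1:3] = 1
partition = np.ones((4, 6), dtype=int)   # current-frame partition
partition[1:3, 2:4] = 2                  # region matching the moved object
new_mask = decide_mask(partition, project_mask(prev_mask, (0, 1)))
```

Because the decision operates on whole regions of the current-frame partition, the tracked mask snaps to actual image contours instead of merely shifting the old one.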
 

Temporal segmentation

Temporal segmentation is useful when the user is interested in moving objects. It is based on change detection followed by motion analysis, and consists of the following steps: (1) camera motion estimation and compensation, assuming that the background of the scene is a rigid plane; (2) scene-cut detection, based on the mean absolute difference between the camera-motion-compensated previous frame and the current frame; (3) estimation of a change detection mask by thresholding the frame difference between two successive frames; (4) elimination of uncovered background by analysing a displacement vector field; (5) adaptation of the contour to the luminance edges of the current frame in order to obtain a more accurate object boundary. Figure 6 shows examples of results obtained using temporal segmentation.
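Steps (2) and (3) reduce to simple frame arithmetic. A minimal sketch, with an arbitrary threshold and assuming camera motion has already been compensated:

```python
import numpy as np

def mad(a, b):
    """Mean absolute difference, used as a scene-cut indicator in
    step (2): an abrupt jump of this value suggests a scene change."""
    return np.abs(a.astype(int) - b.astype(int)).mean()

def change_mask(prev_frame, cur_frame, threshold=20):
    """Step (3): change detection mask obtained by thresholding the
    absolute difference between two successive frames."""
    diff = np.abs(cur_frame.astype(int) - prev_frame.astype(int))
    return (diff > threshold).astype(np.uint8)

# Toy sequence: a bright square moves two pixels to the right.
prev = np.zeros((4, 6), dtype=np.uint8)
prev[1:3, 1:3] = 100
cur = np.zeros((4, 6), dtype=np.uint8)
cur[1:3, 3:5] = 100
mask = change_mask(prev, cur)
# The mask covers both the object's new position and the background it
# uncovered, which is precisely why step (4) is needed afterwards.
```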
  

Demo-tutorial

This demo-tutorial will guide you through VOGUE's main functionalities. Click here to see the tutorial.


Publications

Multimedia
  1. B. Marcotegui, P. Correia, F. Marques, R. Mech, R. Rosa, M. Wollborn, F. Zanoguera, "A Video Object Generator Tool Allowing Friendly User Interaction", ICIP-99, Kobe (Japan), October 1999. pdf format (256 Kbytes)
  2. P. Correia and F. Pereira. "User Interaction in Content-Based Video Coding and Indexing" in EUSIPCO-98, Rhodes, Greece. Sept 1998.
  3. P. Correia and F. Pereira. "The Role of Analysis in Content-Based Video Coding and Indexing", Signal Processing - Special Issue on Video Sequence Segmentation for Content-Based Processing and Manipulation. Vol 66, No. 2. pp. 125-142. April 1998.
Spatial segmentation
  1. C. Vachier and F. Meyer, "Extinction Values: A New Measurement of Persistence", IEEE Workshop on Non Linear Signal/Image Processing, pp. 254-257, June 1995.
  2. F. Meyer, "Morphological multiscale and interactive segmentation", IEEE-EURASIP Workshop on Nonlinear Signal and Image Processing, Antalya. Turkey, June 1999.
  3. F. Zanoguera, B. Marcotegui, F. Meyer. "A Toolbox for Interactive Segmentation Based on Nested Partitions" ICIP-99, Kobe (Japan), pdf format (77 Kbytes)
Tracking algorithm
  1. F. Marques, M. Pardas and P. Salembier, Video Coding: The Second Generation Approach, chapter "Coding-oriented segmentation of video sequences", pp. 79-124. L. Torres and M. Kunt (Eds). Kluwer Academic Publishers, 1996.
  2. F. Marques and J. Llach, "Tracking of generic objects for Video Object generation", IEEE International Conference on Image Processing, Chicago, USA, October 1998.
Temporal segmentation
  1. R. Mech and M. Wollborn. "A Noise Robust Method for 2D Shape Estimation of Moving Objects in Video Sequences Considering a Moving Camera", Signal Processing, Vol 66, No 2, pp 203-217, 1998.
  2. ISO/IEC JTC1/SC29/WG11 Doc. 2502. Information Technology - Generic Coding of Audio-visual Objects: Visual, ISO/IEC 14496-2, Final Draft of International Standard, Annex F. October 1998.
 
 


Partners involved