The 15th International Conference on Vision Interface
May 27-29, 2002, Calgary, Canada
www.visioninterface.org/vi2002
 

Proceedings on-line 


Résumés des Articles / Paper Abstracts


s1.1

Title
An Adaptive-Sampling Algorithm for Object Representation
Authors
Robert Alterson, Dept. of Computer Science, York University
Minas Spetsakis, Dept. of Computer Science, York University
Abstract
We present a novel adaptive-sampling algorithm for spectral signature generation. This algorithm is designed to increase inter-object discrimination and reduce feature-vector dimensionality. Our algorithm is applied to a Gabor-feature based multi-resolutional object detection and recognition scheme. In this context we study and analyze the detection and identification of unknown objects in a complex background. Iterative, off-line optimization methods are employed to reduce computational demands during the learning phase. Our representation scheme takes into account all items in a given object library. It selects sample-point sets that maximize the inter-object distance. Thus, the presented method increases identification robustness and can reduce the size of signature vectors.

s1.2

Title
Region-Based Image Retrieval using Wavelet Transform
Authors
Nobuo Suematsu, Faculty of Information Sciences, Hiroshima City University
Yoshihiro Ishida, Fuji-film Software Corp.
Akira Hayashi, Faculty of Information Sciences, Hiroshima City University
Toshihiko Kanbara, Faculty of Information Sciences, Hiroshima City University
Abstract
Content-based image retrieval, which provides convenient ways to
retrieve images from large image databases, has been studied actively.
While many previous image retrieval techniques do not look at
regions in an image, region-based image retrieval techniques
have been gaining attention recently.
We propose a region-based image retrieval method which performs
image segmentation and indexing using texture features computed from
wavelet coefficients.
The proposed method has advantages
in texture feature extraction and hierarchical image segmentation
over the previous region-based techniques using wavelet transform.

s1.3

Title
Semantics Retrieval by Content and Context of Image Regions
Authors
Wei Wang, Yuqing Song and Aidong Zhang
Department of Computer Science and Engineering
State University of New York at Buffalo
Buffalo, NY 14260 USA
Abstract
We propose a novel approach for semantics retrieval from images in
multimedia databases. In our approach, we use color-texture
classification to generate the codebook which is used to
segment images into regions. The content of a region
describes the lower-level features of the region, including color and
texture. The context of regions in an image describes their
relationships in the image.
The content and context of image regions provide a way for semantics
retrieval. On top of semantics retrieval, high-level
(semantics-based) querying and query-by-example are supported.
The experimental results
demonstrate that our approach outperforms the traditional CBIR approaches.

s1.4

Title
Visual Abstraction of Wildlife Footage using Gaussian Mixture Models
Authors
David Gibson, University of Bristol
Neill Campbell, University of Bristol
Barry Thomas, University of Bristol
Abstract
In this paper, we present a novel approach for clip-based key
frame extraction. Our framework allows both clips with subtle changes
as well as clips containing rapid shot changes, fades and dissolves to
be well approximated. We show that creating key frame video
abstractions can be achieved by transforming each frame of a video
sequence into an eigenspace and then clustering this space using
Gaussian Mixture Models (GMMs). An iterative process computes a GMM
configuration that best clusters the data based on a maximum likelihood
threshold. The images nearest to the centre of each GMM
component are selected as key frames. Unlike previous work, this
technique relies on global video clip properties and results show that
the key frames extracted give a very good representation of the overall
clip content. We show that, by using a single threshold, an operator
can easily control the number of representative key frames generated.
We also demonstrate that clustering in eigen-time space improves the
video abstractions in a quantifiable manner and we demonstrate the
application of this technique on a database of 307 clips of wildlife
footage containing dissolves, shot changes, fades, pans, zooms and a
wide range of animal behaviours.

s1.5

Title
Motion Estimation by Object-Matching for Real-time Object-Based Video Representation
Authors
Aishy Amer: INRS-Télécommunications
Amar Mitiche: INRS-Télécommunications
Eric Dubois: Université d'Ottawa
Abstract
Motion estimation plays a key role in many video applications, such as frame-rate video conversion, video retrieval, video surveillance, and video compression.
The key issue in these applications is to define appropriate representations that can efficiently support motion estimation with the required accuracy.
In this paper, a low-complexity object motion estimation technique is proposed that is designed to fit the needs of content-based video representation for video surveillance and retrieval applications. In these applications, representing object motion in a way that is meaningful for high-level interpretation, such as event detection and classification, takes precedence over precision of estimation. The proposed method relies on estimating the displacements of the sides of an object's minimum bounding box (MBB).
Two motion estimation steps are proposed: an initial coarse estimation that finds a single displacement for an object using the four sides of the MBB between two successive images, followed by the detection and estimation of non-translational motion. The result is the detection of the type of object motion and the subsequent estimation of one or more motion values per object depending on the detected motion type. Special consideration is given to object motion in interlaced video and at image margins. Various simulations show that the proposed method responds in real time and gives good estimates for object tracking, event detection, and high-level video representation.
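
As an illustration only, the idea of deriving object motion from the displacements of the MBB sides might be sketched as follows. The box layout (left, top, right, bottom), the function names, and the agreement tolerance `tol` are assumptions made for this sketch; the abstract does not specify them, and this is not the authors' implementation.

```python
def side_displacements(prev_box, curr_box):
    """Displacement of each MBB side between two successive images.

    Boxes are (left, top, right, bottom) in pixels (an assumed layout).
    """
    return tuple(c - p for p, c in zip(prev_box, curr_box))


def classify_motion(prev_box, curr_box, tol=1):
    """Return (motion_type, (dx, dy)) from the four side displacements.

    If opposite sides move by the same amount (within tol, a hypothetical
    threshold), the motion is treated as a pure translation. Otherwise it
    is flagged as non-translational and only a coarse estimate is returned.
    """
    d_left, d_top, d_right, d_bottom = side_displacements(prev_box, curr_box)
    dx = (d_left + d_right) / 2
    dy = (d_top + d_bottom) / 2
    translational = (abs(d_left - d_right) <= tol
                     and abs(d_top - d_bottom) <= tol)
    label = "translation" if translational else "non-translational"
    return label, (dx, dy)
```

In this sketch a box that grows symmetrically (for example, an object approaching the camera) yields a zero coarse displacement but is labelled non-translational, which is the case a second, finer estimation step would have to handle.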

s1.6

Title
Hierarchical Indexing Images Using Weighted Low Dimensional Texture Features
Authors
Mira Park, University of NSW, Australia
Jesse S. Jin, University of Sydney, Australia
Laurence S. Wilson, CSIRO, Australia
Abstract
This paper introduces a new method to analyse image texture and to index an image database. We present a new strategy that reduces the computational time needed to extract image features while retaining high retrieval accuracy. We also propose a method to reduce the image feature dimension, so that any robust indexing method can be used. By weighting the extracted image features, a system may perceive an image in a way consistent with human perception. We use two spaces to keep the key images and the candidate images for efficient indexing of the image database.

s1.7

Title
Generation of Images of Historical Documents by Composition of their Components
Authors
Carlos A.B. Mello, UFPE
Rafael D. Lins, UFPE
Abstract
This paper describes a system for processing historical documents. First, documents are decomposed into their components (such as paper texture, colours, typewritten parts, handwritten parts, pictures, etc.). This decomposition allows for efficient storage, indexing and network transmission by factoring out common components. Document retrieval requires reassembling the document, synthesizing an image visually close to the original document. The information needed to build the final image occupies, on average, 2 Kbytes, which makes for a very efficient compression scheme.

s2.1

Title
Motion Segmentation and Tracking
Authors
King Yuen Wong, Dept. of Computer Science, York University
Minas E. Spetsakis, Dept. of Computer Science, York University
Abstract
This paper presents a novel motion segmentation and tracking
algorithm. The tracking is done by fitting successively more elaborate
models of optical flow on the tracked region and the new tracked
region is computed by a fast sum of squared differences statistic on
the previous images properly aligned. The method can track objects in
image sequences with moving background, taken by a hand-held camera,
tolerates up to 30 pixels of interframe motion and takes 0.3 seconds per
frame pair of size 320 x 240 pixels on a 500 MHz Sun Blade 100
workstation.

s2.2

Title
Automated Visual Surveillance Using Hidden Markov Models
Authors
Vinod Nair, Center for Intelligent Machines, McGill University, Montreal PQ H3A 2A7

James J. Clark, Center for Intelligent Machines, McGill University, Montreal PQ H3A 2A7
Abstract
This paper describes an automated visual surveillance system that detects suspicious human activity in a scene. The system is designed to: 1) detect and track people in the scene, 2) recognize the "normal" activities in the scene, and 3) detect anomalous activity by finding sufficiently large deviations from the normal activity patterns. The stochastic time-sequence recognition framework of the Hidden Markov Model (HMM) forms the basis of activity recognition and anomaly detection. We have implemented the system to monitor an office corridor in real-time using a Pentium III machine running Windows 2000. The results show that the system correctly classifies examples of normal activities in the corridor and identifies a mock break-in attempt as suspicious activity.
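
One common way to turn an HMM into an anomaly detector is to score an observation sequence by its likelihood under a model of normal activity and flag sequences that score too low. The sketch below shows that idea with the scaled forward algorithm for a discrete HMM. The discrete observation symbols, the per-symbol normalisation, and the `threshold` parameter are assumptions of this sketch; the abstract does not state how deviations are measured.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Log-probability of a discrete observation sequence under an HMM.

    start[s] is the initial state probability, trans[r][s] the transition
    probability from r to s, and emit[s][o] the probability of symbol o in
    state s. Uses the scaled forward algorithm to avoid underflow.
    """
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    total = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [emit[s][obs[t]]
                     * sum(alpha[r] * trans[r][s] for r in range(n_states))
                     for s in range(n_states)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under the model
        total += math.log(scale)
        alpha = [a / scale for a in alpha]
    return total


def is_anomalous(obs, model, threshold):
    """Flag a sequence whose per-symbol log-likelihood is below threshold."""
    start, trans, emit = model
    return log_likelihood(obs, start, trans, emit) / len(obs) < threshold
```

Dividing by the sequence length keeps the score comparable across sequences of different durations; the threshold itself would have to be chosen from examples of normal activity.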

s2.3

Title
Real-Time Tracking for Visual Interface Applications in Cluttered and Occluding Situations
Authors
David J. Bullock, University of Guelph
John S. Zelek, University of Guelph
Abstract
Visual interface systems require object tracking techniques with real-time performance for ubiquitous interaction. We present a probabilistic framework for a visual tracking system that robustly tracks targets in real time using color and motion cues. The algorithm is based on the particle filtering techniques of the ICondensation filter. An innovation of the paper is the use of motion cues to guide the propagation of particle samples, which are evaluated using color cues. This results in a probabilistic blob tracking method that is shown to greatly outperform conventional blob trackers in the presence of occlusion and clutter. The technique is applied to the task of video annotation using a hand-held marking device.

s2.4

Title
Multi-object Motion Pattern Classification for Visual Surveillance and Sports Video Retrieval
Authors
Akira Hayashi, Faculty of Information Sciences, Hiroshima City University
Ryuji Nakasima, Faculty of Information Sciences, Hiroshima City University
Toshihiko Kanbara, Faculty of Information Sciences, Hiroshima City University
Nobuo Suematsu, Faculty of Information Sciences, Hiroshima City University
Abstract
We present a scene activity pattern learning and recognition method
for visual surveillance and video retrieval. While scene activity
patterns are represented in terms of object trajectories in previous
work, they are represented in terms of the instantaneous motions of
multiple objects in each image in our method. In order to deal with
a variable number of objects in a scene, we propose to use moment
statistics as features. Our approach is based on clustering, a form of
unsupervised learning, and needs little human
intervention. Furthermore, the probabilistic model based clustering
makes it easy to detect abnormality.
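
To illustrate why moment statistics help with a variable number of objects, the sketch below maps any number of per-object vectors to a feature vector of fixed length. The choice of per-object components (position and velocity) and the use of the mean and population variance are assumptions of this sketch; the abstract does not say which moments are used.

```python
def moment_features(objects):
    """Map a variable-length list of per-object vectors to a fixed-length feature.

    Each object is a tuple such as (x, y, vx, vy) (an assumed layout).
    The feature is the per-component mean followed by the per-component
    population variance, so its length does not depend on len(objects).
    """
    if not objects:
        raise ValueError("need at least one object")
    n = len(objects)
    dims = len(objects[0])
    means = [sum(o[d] for o in objects) / n for d in range(dims)]
    variances = [sum((o[d] - means[d]) ** 2 for o in objects) / n
                 for d in range(dims)]
    return means + variances
```

A frame with two objects and a frame with ten objects both yield eight numbers here, so the frames can be fed to the same clustering procedure.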

s2.5

Title
A Novel Probability Model for Background Maintenance and Subtraction
Authors
Dongsheng Wang, National Laboratory of Pattern Recognition, Chinese Academy of Science
Tao Feng, Microsoft Research Asia
Harry Shum, Microsoft Research Asia
Songde Ma, National Laboratory of Pattern Recognition, Chinese Academy of Science
Abstract
"Background maintenance and subtraction" is a common element of many computer vision applications. This paper introduces a novel background model. The model has two components and processes the video sequence at the pixel level and the frame level alternately. Its advantage is that it can capture both the temporal and the spatial context of the video sequence. At the pixel level, any probability model of the pixel process can be used. At the frame level, we use a Markov Random Field (MRF). Then, for a particular condition, video surveillance on a freeway, we propose a new pixel-level model, the adaptive HMM. For the HMM, both offline and online learning algorithms are discussed. For the MRF, we discuss the Belief Propagation algorithm. In our freeway video surveillance experiments, the model handles the problems encountered, namely bootstrapping and gradual change of illumination, and it can detect both moving vehicles and shadows.
Finally, we give our views about background maintenance and subtraction derived from our practice.

s2.6

Title
Robot Navigation Using Panoramic Landmark Tracking
Authors
Mark Fiala, Department of Computing Science, University of Alberta
Dr. Anup Basu, Department of Computing Science, University of Alberta
Abstract
A vision based navigation system is presented for determining a mobile robot's position and orientation using panoramic imagery. Omni-directional sensors are useful in obtaining a 360 degree field of view, permitting objects in the vicinity of a robot to be imaged simultaneously. Recognizing landmarks in a panoramic image from an a priori model of distinct features in an environment allows a robot's location information to be updated. A system is shown for tracking vertex and line features for omni-directional cameras constructed with catadioptric (containing both mirrors and lenses) optics. With the aid of the Panoramic Hough Transform, line features can be tracked without restricting the mirror geometry to that which satisfies the single viewpoint criterion. Two paradigms for landmark tracking are explored, with experiments shown on synthetic and real images. A working implementation on a mobile robot is shown.

s2.7

Title
Wavelet Based Resolution Enhancement of Omnidirectional Image
Authors
Peng Qimin
Jia Yunde
Computer Science Department, Beijing Institute of Technology
Abstract
This paper proposes an approach for resolution
enhancement of omnidirectional images based on
wavelet transform. The resolution enhancement of
an image is achieved by using local extrema
extrapolation of wavelet coefficients under a
degradation model. The fusion operation is applied
to the coefficients of registered pixels in the
enhanced images of an image sequence. A fine
resolution enhancement image reconstructed via
inverse wavelet transform is discussed. The
experimental results show the proposed approach is
feasible and efficient for resolution enhancement of
omnidirectional images.

s2.8

Title
The Detection of Obstacles Using Features by the Horizon View Camera
Authors
Ayami Iwata, Gifu University
Kunihito Kato, Gifu University
Kazuhiko Yamamoto, Gifu University
Abstract
In this paper, we propose a new camera system called the Horizon View Camera (HVC). The HVC is a system in which the optical axis of a camera is directed at the horizon with a mirror, so that the obtained image contains objects on the ground without including the ground itself. Therefore, by using the HVC system, separating objects from the ground becomes very easy. Moreover, there are many other useful features in the HVC system. In order to improve the processing speed and accuracy, we propose a new idea whereby the detection of objects becomes easier and the results are more accurate.

s3.1

Title
An Empirical Study of Some Feature Matching Strategies
Authors
Etienne Vincent, S.I.T.E., University of Ottawa
Robert Laganiere, S.I.T.E., University of Ottawa
Abstract
Several algorithms are proposed in the literature to solve the difficult problem of feature point correspondence between images. These methods make use of different properties of image point pairs,
in order to improve the quality of the matching.
This paper proposes an empirical evaluation of their performance,
and presents some new matching constraints. The validation
process used here determines the number of good matches and the proportion of good matches in a given match set.

s3.2

Title
A Fast Area-Based Stereo Matching Algorithm
Authors
Luigi Di Stefano
DEIS, University of Bologna

Massimiliano Marchionni
DEIS, University of Bologna

Stefano Mattoccia
DEIS, University of Bologna

Giovanni Neri
DEIS, University of Bologna
Abstract
This paper presents an area-based stereo algorithm suitable for real-time applications. The core of the algorithm relies on the uniqueness constraint and on a matching process that allows for rejecting previous matches as soon as more reliable ones are found. In this manner unreliable disparity measurements are detected and discarded. The proposed approach is compared with the matching process based on the left-right consistency constraint, the latter being the basic method for detecting unreliable matches in many area-based stereo algorithms. The algorithm has been carefully optimised to obtain a very fast implementation on a Personal Computer. This paper describes the computational optimisation strategy, which is based on a very effective incremental calculation scheme yielding massive elimination of redundant calculations. Finally, we provide experimental results obtained on stereo pairs with ground-truth as well as computation-time measurements; we compare these data with those obtained using a well-known, fast, area-based algorithm relying on the left-right consistency constraint.
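
The idea of rejecting a previous match as soon as a more reliable one claims the same pixel can be illustrated on a single scanline. The sketch below is a simplified reading of the uniqueness constraint and is not the authors' algorithm: the SAD window cost, the parameter names `max_disp` and `half_win`, and the convention that a left pixel at x matches a right pixel at x - d are all assumptions, and the incremental calculation scheme mentioned in the abstract is not reproduced.

```python
def match_scanline(left, right, max_disp, half_win=1):
    """Return {left_x: disparity} for one scanline under a uniqueness constraint."""
    n = len(left)

    def sad(xl, xr):
        # Sum of absolute differences over a small window (assumed cost).
        return sum(abs(left[xl + k] - right[xr + k])
                   for k in range(-half_win, half_win + 1))

    # right_x -> (cost, left_x): the match currently holding that right pixel
    owner = {}
    for xl in range(half_win, n - half_win):
        candidates = [(sad(xl, xl - d), xl - d)
                      for d in range(max_disp + 1) if xl - d >= half_win]
        cost, xr = min(candidates)
        # Uniqueness: a right pixel keeps only its most reliable left match,
        # so an earlier match is dropped when a lower-cost one arrives.
        if xr not in owner or cost < owner[xr][0]:
            owner[xr] = (cost, xl)
    return {xl: xl - xr for xr, (cost, xl) in owner.items()}
```

Left pixels that lose the competition for a right pixel simply receive no disparity, which is how unreliable measurements are discarded in this sketch.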

s3.3

Title
Three-dimensional structure calculation: achieving accuracy without calibration
Authors
B. Boufama
School of Computer Science
University of Windsor
Windsor, On., Canada N9B 3P4
Abstract
This paper addresses the problem of computing the camera motion and the
three-dimensional structure of a scene using two uncalibrated images as
inputs. The camera motion is calculated by estimating the essential matrix
and using approximate values, easily available, for the intrinsic parameters.
The classical eight-point algorithm to calculate the essential matrix is known to be very sensitive to pixel-noise even when the intrinsic parameters are perfectly known. This paper shows that by using the normalized eight-point algorithm, aimed at calculating the fundamental matrix, the pixel-noise sensitivity is reduced significantly. More importantly, we show that the intrinsic parameters do not have to be accurately known in order to get very good quality reconstruction. In particular, we have investigated and compared the effect of errors on the intrinsic parameters together with pixel-noise on the calculated motion/structure, when using the straightforward eight-point algorithm and its normalized version respectively.
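
For readers unfamiliar with the normalized eight-point algorithm, its distinguishing step is a change of image coordinates applied before the linear estimation: points are translated so that their centroid is at the origin and scaled so that their mean distance from the origin is sqrt(2). The sketch below shows only that normalization step; the function name and return layout are choices made for this illustration.

```python
import math


def normalize_points(points):
    """Return (normalized_points, T), with T the 3x3 similarity transform.

    After the transform the centroid is at the origin and the mean
    distance of the points from the origin is sqrt(2).
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mean_dist = sum(math.hypot(x - cx, y - cy) for x, y in points) / n
    if mean_dist == 0:
        raise ValueError("points must not all coincide")
    s = math.sqrt(2) / mean_dist
    T = [[s, 0.0, -s * cx],
         [0.0, s, -s * cy],
         [0.0, 0.0, 1.0]]
    normalized = [(s * (x - cx), s * (y - cy)) for x, y in points]
    return normalized, T
```

Each image gets its own transform. If F_n is the matrix estimated from the normalized points of both images, with transforms T for the first image and T' for the second, the matrix in the original pixel coordinates is recovered as F = T'^T F_n T.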

s3.4

Title
A Stereo Confidence Metric Using Single View Imagery
Authors
Geoffrey Egnal, GRASP Laboratory, University of Pennsylvania
Max Mintz, GRASP Laboratory, University of Pennsylvania
Richard P. Wildes, Centre for Vision Research, York University
Abstract
Although stereo vision research has progressed remarkably, stereo
systems still need a fast, accurate way to estimate confidence in
their output. In the current paper, we explore using stereo
performance on two different images from a single view as a confidence
measure for a binocular stereo system incorporating that single view.
Although it seems counterintuitive to search for correspondence in two
different images from the same view, such a search gives us precise
quantitative performance data. Correspondences significantly far from
the same location are erroneous because there is little to no motion
between the two images. Using hand-generated ground truth, we
quantitatively compare this new confidence metric with five commonly
used confidence metrics on a uniform basis. We explore the performance
characteristics of each metric under a variety of conditions.

s3.5

Title
Limiting the Search Range of Correlation Stereo Using Silhouettes
Authors
Geoffrey Egnal, GRASP Laboratory, University of Pennsylvania
Max Mintz, GRASP Laboratory, University of Pennsylvania
Kostas Daniilidis, GRASP Laboratory, University of Pennsylvania
Abstract
We combine two basic approaches to 3D reconstruction: silhouette-based
and correspondence-based approaches. The two approaches have
complementary costs and benefits. Silhouette-based approaches deliver
volumetric descriptions which often have very few outliers, but they
cannot reconstruct concave surfaces. Correspondence-based approaches
give surface descriptions with sub-pixel accuracy, but their search
range either allows outliers or falls short of the correct match. We
show that a combination of the two can deliver fine-grained accuracy
with few outliers. Our specific implementation uses the silhouette
reconstruction as prior data to center and bound a stereo search
process. We explore the different performance characteristics of the
three methods qualitatively and quantitatively using real imagery.

s3.6

Title
Towards a fast and reliable dense matching algorithm
Authors
K. Jin and B. Boufama
University of Windsor
Abstract
In this paper, we present a dense matching algorithm which utilizes
the corner and edge features of images to increase the reliability and to speed up the process of dense matching of two uncalibrated images.
The major problem of classical area-based dense matching algorithms is the high computational time resulting from intensive correlation calculations for match candidates. Although some methods have attempted to integrate image feature information in the dense matching of uncalibrated images, most of these methods are not practical and are difficult to implement. This paper aims at designing a hybrid matching algorithm that preserves disparity continuity on continuous object surfaces, while discontinuities at object boundaries are treated differently in the matching process. In particular, both CPU time and the likelihood of mismatches are reduced while the implementation is kept simple and straightforward.

s4.1

Title
Reconnaissance en-ligne de caracteres arabes manuscrits par un reseau de Kohonen
Authors
Neila Mezghani and Amar Mitiche
INRS-Telecommunications

Mohamed Cheriet
Ecole de Technologie Supérieure
Abstract
Neuromimetic networks have proven themselves in pattern classification through their high recognition rates and low computation times. The aim of this study is to build a Kohonen memory for the on-line recognition of handwritten Arabic characters. Experiments were carried out on an on-line database that we designed for this purpose and achieved a recognition rate of 84.43%.

s4.2

Title
Improved Method of Handwritten Digit Recognition Tested on MNIST Database
Authors
Ernst Kussul
UNAM, Centro de Instrumentos

Tatyana Baidyk
UNAM, Centro de Instrumentos
Abstract
The MNIST database is used to compare different methods of handwritten digit recognition. Among the many published recognition rates for different classifiers, our neural classifier held second place [1] (recognition rate 99.21%). We are currently developing improvements to the neural network structure and to the algorithms for handwritten digit recognition. The improved classifier has a recognition rate of 99.37%. This is the best of the known results. In this paper we briefly describe the general structure of our classifier and the latest improvements.

s4.3

Title
Combination of Decisions by Multiple Document Object Locators
Authors
Jung Soh
Visual Information Processing Team
Computer and Software Research Laboratory
Electronics and Telecommunications Research Institute
Daejeon, Korea
Abstract
This paper presents a method for combining multiple
document object locators tuned to different object characteristics,
with the goal of achieving location performance excelling
that of any individual locator.
The method includes
(i) a scheme for consistent representation of locator outputs regardless of output levels,
(ii) the notion of object correspondence and its
application to determining what decisions to combine,
(iii) a mechanism for representing knowledge of locators and
its use for dynamic locator selection,
(iv) functions for combining confidence values of objects.
Results from experiments in postal address block location
using three locators and 1,100 envelope images are presented.

s4.4

Title
Simulating Eye Movement in Reading using Short-term Memory.
Authors
Satoru MORITA,
Faculty of Engineering, Yamaguchi University
Abstract
We propose a computational model of eye movement based on short-term memory. We also apply this model to simulating human
eye movement in reading. To simulate human eye movement, we use foveated vision, in which the resolution is high at the center of the retina and low in its periphery.

s4.5

Title
Road Sign Recognition by One Fixation of Space-Variant Sensor
Authors
D.G. Shaposhnikov, A.B.Kogan Research Institute for Neurocybernetics, Rostov State University

L.N. Podladchikova, A.B.Kogan Research Institute for Neurocybernetics, Rostov State University

A.V. Golovan, A.B.Kogan Research Institute for Neurocybernetics, Rostov State University

N.A. Shevtsova, A.B.Kogan Research Institute for Neurocybernetics, Rostov State University

K. Hong, School of Computing Science, Middlesex University

X.W. Gao, School of Computing Science, Middlesex University
Abstract
A biologically plausible model for traffic sign detection and recognition that is invariant to variable viewing conditions is presented, together with the results of testing the model on real-world British traffic signs. The developed model for sign description and recognition by one fixation of a space-variant sensor simulates some mechanisms of the real visual system, such as space-variant representation of information from the centre (the fovea) to the periphery of the retina, neuronal orientation selectivity, and context encoding of information. After successive procedures of colour segmentation of the initial real-world images, classification according to sign colours and external forms, and determination of the centre of the inner informative sign part, 85% of potential traffic sign images were correctly identified for various weather conditions by one fixation of the developed space-variant sensor.

s4.6

Title
Simple Distances Between Handwritten Signatures
Authors
J.R. Parker
Laboratory for Computer Vision
Department of Computer Science
University of Calgary
Calgary, Alberta, Canada
parker@cpsc.ucalgary.ca
Abstract
When analyzing handwritten signatures using a
computer, a certain amount of variation within any particular
class is to be expected. Successful recognition
demands that this variation be less than that between
two different signatures. This paper describes three simple
ways to compare signatures that do not use any complicated
or derivative feature measurements, each of
which defines a distance between signatures that allows
individual variation.

s4.7

Title
Hybrid system for recognition of handwritten symbols on the base of structural methods and neural networks
Authors
R. Sadykhov, prof., vice-president of Belarussian IAPR
O. Malenko, PhD,
A.N.Klimovich, PhD
M.L.Selinger, PhD
Abstract
The problem of handwritten symbol recognition is investigated. An algorithm for approximation and break elimination has been developed; it simplifies the description of symbols and removes existing errors. A structural recognition method is used that describes the structure of handwritten symbols by sequences of primitives and is robust to the geometrical distortions peculiar to such images. In combination with this description method, a classifier based on comparison with an ideal image is applied. A technique is implemented for analysing and optimizing the choice of feature extraction algorithm and its parameters, based on estimating the clustering quality of the training sample with self-organizing neural networks. An algorithm for computing Legendre moments is presented. A new method for training RBF neural networks is presented. Classification results for binary images (handwritten Arabic numerals) are reported. On the basis of these results, recommendations are given for the choice of the maximal order of Legendre moments and of various classifiers.

s5.1

Title
Simultaneous Computation of Defocus Blur and Apparent Shifts in Spatial Domain
Authors
Francois Deschenes, Universite de Sherbrooke / Ecole des Mines de Paris
Djemel Ziou, Universite de Sherbrooke
Philippe Fuchs, Ecole des Mines de Paris
Abstract
This paper presents an algorithm for a cooperative and simultaneous estimation of depth cues: defocus blur and spatial shifts (stereo disparities, 2D motion, and/or zooming disparities). These cues are estimated from two images of the same scene acquired by a camera evolving in time and/or space and for which the intrinsic parameters are known. This algorithm is based on generalized moment expansion. We show that the more blurred image may be expressed as a function of the partial derivatives of the two images, the blur difference and the horizontal and vertical shifts. Hence, these depth cues can be computed by solving a system of equations. The proposed algorithm is tested using synthetic and real images. The results are dense and accurate. They confirm that defocus blur and spatial shifts can be simultaneously computed at a single scale.

s5.2

Title
Stable Recovery of Shape and Motion from Partially Tracked Feature Points with Fast Nonlinear Optimization
Authors
Akira Amano, Hiroshima City University
Tsuyoshi Migita, Hiroshima City University
Naoki Asada, Hiroshima City University
Abstract
In the shape from motion problem, nonlinear optimization has
practical advantages over linear methods such as
factorization. The advantages are that it can handle partially tracked
feature points, which are unavoidable in real situations, and that it can
handle images with strong perspective. However, the
resulting shape is not guaranteed to be globally optimal, and the
calculation is relatively slow.
In this paper, we propose two effective methods to cope with these
problems. One is an indirect search method that enables stable computation
of the globally optimal solution. The other is the PCG (Preconditioned
Conjugate Gradient) method, which enables 3-9 times faster computation
without loss of accuracy compared to the commonly used Levenberg-Marquardt
method. Experimental results on real scenes have shown the effectiveness
of our method.
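
As background for the PCG method named above, the sketch below shows the standard preconditioned conjugate gradient iteration for a symmetric positive-definite linear system. The abstract does not say how PCG is embedded in the nonlinear optimization or which preconditioner is used; the Jacobi (diagonal) preconditioner, the function name, and the stopping tolerance are assumptions of this sketch.

```python
def pcg(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive-definite A.

    Uses conjugate gradients with a Jacobi (diagonal) preconditioner.
    A is a list of rows and b a list of floats.
    """
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]  # M^-1 for M = diag(A)

    x = [0.0] * n
    r = list(b)                                   # residual b - A x at x = 0
    z = [m * ri for m, ri in zip(inv_diag, r)]    # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [m * ri for m, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x
```

Only matrix-vector products are needed, which is why this family of methods is attractive when the system is large and sparse.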

s5.3

Title
Reconstructing Depth from Spatiotemporal Curves
Authors
Rui Rodrigues, Universidade do Minho
António Ramires Fernandes, Universidade do Minho
Kees van Overveld, Philips Research
Fabian Ernst, Philips Research
Abstract
We present a novel approach for 3D reconstruction based on multiple video frames taken from a static scene. Our solution emerges from the spatiotemporal analysis of video frames. The method is based on a best fitting scheme for spatiotemporal depth curves, which allows us to compute 3D world coordinates of the objects within the scene. As opposed to a large number of current methods, our technique deals with random camera movements in a transparent way, and even performs better in these cases than with pure translation. Robustness against occlusion, noise and aliasing is inherent to the method as well.

s5.4

Title
An Iterative Method for Improving Bas-relief Ambiguity
Authors
Sung-Kee Park, Korea Institute of Science and Technology
Munsang Kim, Korea Institute of Science and Technology
In So Kweon, Korea Advanced Institute of Science and Technology
Abstract
Structure-from-motion methods that assume small motion, such as optical
flow and direct methods, inevitably suffer from the motion ambiguity
between translation and rotation. To resolve it, conventional
methods adopt a single-line methodology: first they try to find
exact corresponding points using only image brightness, and then
they estimate motion parameters and scene depths from those points.
However, because such correspondence methods can have
large errors whose model is difficult to define,
the single-line methods cannot reduce the ambiguity even when they
introduce a robust statistical estimator. Therefore, on the assumption
that corresponding points and motion estimates have to be iteratively
refined, we propose a new method for reducing these ambiguities
with a stereo image sequence.

s5.5

Title
Bayesian Real-Time Optical Flow
Authors
John S. Zelek
School of Engineering, University of Guelph
Abstract
Optical flow can be used to compute motion detection, time to collision,
structure, focus of expansion as well as object segmentation. Unfortunately,
most optical flow techniques do not provide accurate and dense measures
that are useful for these types of computations. In addition, most
techniques are also computationally slow. However, one method proposed
by Camus is able to perform optical flow computations in real time by
capitalizing on redundancies in the computation and on spatial-temporal
sampling trade-offs. It is a simple technique based on simulating
various motions and computing the SD (sum-difference) of patches.
Its drawback is that the produced flow field is inaccurate and arbitrary
in aperture and blank wall situations.
We show that the simulations of various futures act as the
factored samples that produce the
likelihood probabilities that can be used in a particle filtering
framework. Maximizing, minimizing, or computing the expectation
of the likelihood at a particular location does not necessarily
produce the proper flow. We suggest that likelihoods are well behaved
when their variance is small, and that these be propagated first to
address aperture problems and second to address the
extended blank wall problem. We show this propagation with
thresholded likelihood values and speculate on how the likelihood
distributions can be integrated into an algorithm that has its
basis in particle filtering.
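
The idea of scoring simulated motions by patch differences and checking how concentrated the resulting likelihoods are can be sketched as follows. This is a minimal illustration, not the method of the paper: it assumes "SD" is a sum of absolute differences, it turns scores into likelihoods with `exp(-SD / sigma)` (an assumed mapping), and the helper names and test images are hypothetical.

```python
import math


def patch_sd(img0, img1, x, y, dx, dy, half):
    """Sum of absolute differences between the patch centred at (x, y) in
    img0 and the patch displaced by (dx, dy) in img1."""
    total = 0.0
    for j in range(-half, half + 1):
        for i in range(-half, half + 1):
            total += abs(img0[y + j][x + i] - img1[y + j + dy][x + i + dx])
    return total


def motion_likelihoods(img0, img1, x, y, radius=1, half=1, sigma=1.0):
    """Normalised likelihood of every candidate motion within the radius."""
    scores = {}
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            sd = patch_sd(img0, img1, x, y, dx, dy, half)
            scores[(dx, dy)] = math.exp(-sd / sigma)
    z = sum(scores.values())
    return {motion: s / z for motion, s in scores.items()}


def spread(like):
    """Variance of the motion distribution around its mean."""
    mx = sum(p * dx for (dx, dy), p in like.items())
    my = sum(p * dy for (dx, dy), p in like.items())
    return sum(p * ((dx - mx) ** 2 + (dy - my) ** 2)
               for (dx, dy), p in like.items())


def texture(x, y):
    # Hypothetical synthetic texture with no repeated values nearby.
    return (3 * x + 5 * y) % 17


size = 8
textured0 = [[texture(x, y) for x in range(size)] for y in range(size)]
textured1 = [[texture(x - 1, y) for x in range(size)] for y in range(size)]  # moved 1 px right
blank = [[5 for _ in range(size)] for _ in range(size)]

like_textured = motion_likelihoods(textured0, textured1, 4, 4)
like_blank = motion_likelihoods(blank, blank, 4, 4)
best_motion = max(like_textured, key=like_textured.get)
spread_textured = spread(like_textured)
spread_blank = spread(like_blank)
```

On the textured pair the distribution concentrates on one motion and its spread is small, while on the blank wall every candidate is equally likely and the spread is large, which is the situation in which taking a maximum or an expectation gives an arbitrary flow.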

s5.6

Title
A Thin Lens Based Camera Model for Depth Estimation from Blur and Translation by Zooming
Authors
Masashi Baba, Hiroshima City University
Naoki Asada, Hiroshima City University
Ai Oda, Hiroshima City University
Tsuyoshi Migita, Hiroshima City University
Abstract
Depth recovery is a central concern in computer vision, and many methods
have been proposed for monocular depth estimation by zooming as well as
focusing and irising. To date, there have been two distinct approaches to
depth by zooming: one uses motion parallax along the optical axis
with a pinhole camera model, and the other uses image blur with a thin
lens camera model. This paper presents a new camera model that accounts
for both image blur and lens center translation by zooming.
We first discuss the optical properties of zoom lenses, then present a
thin lens based camera model that describes the mutual relationship
between zoom, focus and iris parameters. Using this model with
calibration results, we have performed some experiments with real images
and evaluated the accuracy of the depth information recovered from image
blur and lens center translation. Experimental results have demonstrated
the validity of our camera model and also shown its applicability to the
depth estimation from blur and translation by zooming.

s5.7

Title
Retrieval of the Calibration Matrix from the 3-D Projective Camera Model
Authors
Gamal H. Seedahmed
and Toni Schenk
Photogrammetry Group
Dept. of Civil and Environmental Engineering and Geodetic Science
The Ohio State University
2070 Neil Avenue, Columbus OH 43210-1275 USA
{seedahmed.1 | schenk.2}@osu.edu

Abstract
By relating the projective camera model to the perspective one, the intrinsic camera parameters constitute what is called the calibration matrix. This paper presents two new methods to retrieve the calibration matrix from the projective camera model. In both methods, a collective approach was adopted, using matrix representation. The calibration matrix is retrieved from a quadratic matrix term. The two methods are framed around a correct utilization of Cholesky factorization to decompose the quadratic matrix term. The first method uses an iterative Cholesky factorization to retrieve the calibration matrix from the quadratic matrix term. The second method uses Cholesky factorization to factor the quadratic matrix term, but after its inversion. The basic argument behind the two methods is that the direct use of Cholesky factorization does not reveal the correct decomposition, because the matrix structure is lost in terms of lower-upper ordering. In both methods, a successful retrieval of the calibration matrix was achieved. This paper explains the key ideas behind the two methods, accompanied by a simulated example to demonstrate their validity.
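
The lower-upper ordering issue can be illustrated with a small sketch. It assumes the quadratic term has the form A = K K^T with K upper triangular and a positive diagonal. That form, the numbers in `K_true`, and all function names are assumptions of this sketch, not taken from the paper. A standard Cholesky routine returns a lower-triangular factor, which in general is not K. Factoring the inverse instead gives A^{-1} = K^{-T} K^{-1}, whose lower-triangular Cholesky factor is K^{-T}, so K is the transposed inverse of that factor.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (adequate for 3x3)."""
    n = len(a)
    aug = [[float(v) for v in a[i]] + [1.0 if i == j else 0.0 for j in range(n)]
           for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def cholesky_lower(a):
    """Lower-triangular L with a = L L^T for symmetric positive definite a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def calibration_from_quadratic(a):
    """Recover upper-triangular K from a = K K^T via Cholesky of a^{-1}."""
    L = cholesky_lower(inverse(a))      # L equals K^{-T}
    K = transpose(inverse(L))           # so K = (L^{-1})^T
    scale = K[2][2]                     # normalise so that K[2][2] = 1
    return [[v / scale for v in row] for row in K]


# Hypothetical intrinsics: focal lengths, skew, principal point.
K_true = [[800.0, 2.0, 320.0],
          [0.0, 790.0, 240.0],
          [0.0, 0.0, 1.0]]
A = matmul(K_true, transpose(K_true))
K_est = calibration_from_quadratic(A)
L_direct = cholesky_lower(A)            # lower triangular, not equal to K_true
```

Here `L_direct` is a valid factor of A but has the wrong triangular structure, whereas `K_est` reproduces `K_true` up to rounding.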

s5.8

Title
Camera Calibration with a Viewfinder
Authors
Mohamed Bénallal, École des Mines de Paris, Paris, France
Jean Meunier, Université de Montréal, D.I.R.O.
Abstract
To answer the industrial need for a simple camera calibration procedure, we propose a new method that requires only a simple calibration object composed of a box and two crosses. The box is open at the front, where a large cross made of wires is attached, while another cross is drawn (or attached) at the bottom. The two crosses are perfectly aligned, as in a viewfinder. The viewfinder is first oriented with respect to the camera such that the optical axis of the camera passes through the center of both crosses, allowing the display of a single (superimposed) cross and an immediate reading of the coordinates of the optical axis. Then, using the similar triangles theorem, the focal distance can be easily estimated. In addition, if necessary, the method can determine the orientation of the CCD matrix, if it is not perfectly perpendicular to the optical axis, by solving a simple linear system. This method should be particularly useful for calibration of cameras in situ, such as microscopes or embedded cameras.

s6.1

Title
Person Identification Technique Using Human Iris Recognition
Authors
Christel-Loïc TISSE, AST-Rousset Lab. STMicroelectronics.
Lionel TORRES, LIRMM Universite de Montpellier.
Michel ROBERT, LIRMM Universite de Montpellier.
Abstract
The biometric person authentication technique based on the pattern of the human iris is well suited to any access control system requiring a high level of security. This paper examines a new iris recognition system that implements (i) a combination of the gradient-decomposed Hough transform and integrodifferential operators for iris localization and (ii) the "analytic image" concept (2D Hilbert transform) to extract pertinent information from iris texture. All these image-processing algorithms have been validated on a database of noisy real iris images. The proposed technique is computationally efficient as well as reliable in terms of recognition rates.

s6.2

Title
N-Feature Neural Network Human Face Recognition
Authors
Javad Haddadnia, ECE dep., University of Windsor, Windsor, Ontario, Canada, N9B 3P4
Karim Faez, EE dep., Amirkabir University of Technology, Tehran, Iran, 15914
Majid Ahmadi, ECE dep., University of Windsor, Windsor, Ontario, Canada, N9B 3P4
Abstract
This paper introduces a novel method for human face recognition that employs a set of different kinds of features from the face images with a Radial Basis Function (RBF) neural network, denoted the Hybrid N-Feature Neural Network (HNFNN) human face recognition system. The face image is projected by each of the appropriately selected transform methods in parallel. Experimental results for human face recognition confirm that the proposed method achieves higher classification accuracy than existing techniques.

s6.3

Title
Face Reconstruction From Shading Using Smooth Projected Polygon NN
Authors
Mohamad Ivan Fanany, Dept. of Computer Science, Graduate School of Science and Engineering, Tokyo Institute of Technology
Masayoshi Ohno, Dept. of Computer Science, Graduate School of Science and Engineering, Tokyo Institute of Technology
Itsuo Kumazawa, Dept. of Computer Science, Graduate School of Science and Engineering, Tokyo Institute of Technology
Abstract
In this paper, we present a neural-network learning scheme for face reconstruction. This scheme, which we call the Smooth Projected Polygon Representation Neural Network (SPPRNN), is able to successively refine the polygon vertex parameters of an initial 3D shape based on depth maps of several calibrated images taken from multiple views. The depth maps, which are obtained by applying the Tsai-Shah shape from shading (SFS) algorithm, can be considered as partial 3D shapes of the face to be reconstructed. The reconstruction is finalized by mapping the texture of the face images onto the initial 3D shape. Three issues concerning the effectiveness of this scheme are investigated in this paper. First, how effectively SFS provides partial 3D shapes compared with simply using 2D images. Second, how to generate a smooth projected polygonal model, which is essential in order to approximate the face structure and enhance the convergence rate of the scheme. Third, how an appropriate initial 3D shape should be selected and used in order to improve model resolution and learning stability. Our preliminary results show that, by carefully addressing these three issues, a compact and realistic 3D model of a human (mannequin) face can be obtained.

s6.4

Title
3D Head Models Retrieval Based on Hierarchical Facial Region Similarity
Authors
Horace H S Ip, Image Computing Group, Department of Computer Science, Centre for Innovative Applications of Internet and Multimedia Technologies (AIMtech Centre), City University of Hong Kong
William Y F Wong, Image Computing Group, Department of Computer Science, City University of Hong Kong
Abstract
This paper presents a technique for 3D head model retrieval. The approach combines a 3D shape representation scheme and hierarchical indexing of 3D models based on facial region similarity. The proposed shape similarity measure is based on comparing 3D model shape signatures computed from the Extended Gaussian Images of surface normals. The technique is made highly efficient and scalable by partitioning the 3D head model into distinctive facial regions and building a hierarchical index for the head model database. Our database contains over 1,000 models, and each head model is represented by up to about 3,000 surfaces. Furthermore, we have developed a novel user interface for specifying visual queries and interacting with the retrieval system. By comparing retrieval performance with Eigenheads, we have shown that our approach performs similarly to Eigenheads but is computationally more efficient by several orders of magnitude. This makes our approach a practical solution for large model databases.

s6.5

Title
Intrinsic Filtering of Range Images Using a Physically Based Noise Model
Authors
Pierre Boulanger (1), Olli Jokinen (2), Angelo Beraldin (3)
(1) Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada, T6G 2E8
(2) Institute of Photogrammetry and Remote Sensing, Helsinki University of Technology, P.O. Box 1200, FIN-02015 HUT, Finland
(3) Institute for Information Technology, National Research Council, Canada, Ottawa, Ontario, K1A 0R6
Abstract
This paper presents a new multi-scale range data filtering
technique which produces a scale-space filtering analogous to
Gaussian filtering but has several interesting properties such as
viewpoint invariance and automatic edge preservation. One of the
main contributions of this paper is that it takes into account a
physical model of the sensor to ensure optimum filtering of the
signal. Using this filter, new algorithms can be developed to
detect depth and orientation discontinuities at multiple scales or
to robustly segment range data based on the sign of the Gaussian and mean
curvatures.

s6.6

Title
Pose Error Effects on Range Sensing
Authors
William R. Scott, Dept. of Electrical Engineering, University of Ottawa and National Research Council, Institute for Information Technology, Computational Video Group
Gerhard Roth, National Research Council, Institute for Information Technology, Computational Video Group
Jean-Francois Rivest, Dept. of Electrical Engineering, University of Ottawa
Abstract
Object reconstruction and inspection using a range camera requires a positioning system to configure relative sensor-object geometry in a sequence of poses. Discrepancies between commanded and actual poses can result in serious scanning deficiencies. This paper provides an analytical and experimental characterization of pose error effects for a common type of range camera.

s6.7

Title
Estimating Expansion Rates from Range Data Sequences
Authors
Hagen Spies, University of Heidelberg
John Barron, University of Western Ontario
Abstract
We present a method to compute surface expansion rates from
sequences of range data. Towards this end the 3D velocity field
(range flow) is extracted first and then used in a second step to
estimate the local area expansion. A detailed performance analysis
is presented and the method is applied to two real examples.

s7.1

Title
Detection and Tracking of Eyes for Gaze-camera Control
Authors
Shinjiro Kawato, ATR Media Information Science Laboratories
Nobuji Tetsutani, ATR Media Information Science Laboratories
Abstract
We propose new algorithms to extract and track the positions of eyes
in a real-time video stream. For extraction of eye positions, we
detect blinks based on the differences between successive
images. However, eyelid regions are fairly small. We propose a method
to distinguish them from head movement. For eye position tracking,
we use an updating template based on a "Between-the-Eyes" pattern
instead of the eyes themselves. Eyes are searched based on the
current position of "Between-the-Eyes" and their geometrical
relations to the position in the previous frame. The
"Between-the-Eyes" pattern is easier to locate accurately than eye
patterns. We implemented the system on a PC with a Pentium III
866MHz CPU. The system runs at 30 frames per second and robustly
detects and tracks the eyes.

s7.2

Title
Nouse 'Use Your Nose as a Mouse' - a New Technology for Hands-free Games and Interfaces
Authors
D.O. Gorodnichy, Computational Video Group, IIT, NRC, Ottawa, Canada K1A 0R6
S. Malik, School of Computer Science, Carleton University, Ottawa, Canada, K1S
G. Roth, Computational Video Group, IIT, NRC, Ottawa, Canada K1A 0R6
Abstract
With the invention of fast USB interfaces, the recent increase in computer
power, and the decrease in camera cost, it has become very common to see a camera on top of a
computer monitor. Vision-based games and interfaces, however, are still
not common, despite the recognized benefits vision could
bring: hands-free control, multiple-user interaction, etc. The
reason for this lies in the inability to track human faces in video both
precisely and robustly.
This paper describes a face tracking technique, based on tracking a
convex-shape nose feature, which resolves this problem.
The technique has been successfully applied to interactive
computer games and perceptual user interfaces. These results are presented.

s7.3

Title
Hand Shape Estimation Using Sequence of Multi-Ocular Images Based on Transition Network
Authors
Yasushi HAMADA, Dept. of Computer-Controlled Mechanical Systems, Osaka University
Nobutaka SHIMADA, Dept. of Computer-Controlled Mechanical Systems, Osaka University
Yoshiaki SHIRAI, Dept. of Computer-Controlled Mechanical Systems, Osaka University
Abstract
This paper presents a method of hand posture estimation from
silhouette images taken by multiple cameras. For each image,
we extract a feature vector from the silhouette contour of the
hand. We construct an eigenspace from the feature vectors extracted
from hands in various postures. The feature vectors projected
into the eigenspace are registered as models. The matching criterion
of each image is defined as the distance to the model. The hand
shape is estimated by retrieving the registered model that best matches
the input. For effective matching, we define a shape complexity
for each image to see how well the shape feature is represented.
For a set of input images taken by multiple cameras at each time,
the total matching criterion is evaluated by combining the matching criteria of the set of images using the shape complexities.
For rapid processing, we limit the matching candidates by using
a constraint on the shape change. The possible shape transitions
are represented by a transition network. Because the network is hard
to build, we apply offline learning, where nodes and links are automatically created by showing examples of hand shape sequences.
We show experiments on building the transition networks and on the performance of matching using the network.

s7.4

Title
Robust Face Detection and Japanese Sign Language Hand Posture Recognition for Human-Computer Interaction in an "Intelligent" Room
Authors
Jean-Christophe Terrillon, Arnaud Pilpre, Yoshinori Niwa,
Office of Regional Intensive Research Project (HOIP), Softopia Japan Foundation, 4-1-7 Kagano, Ogaki-City, Gifu 503-8569, Japan
{terrillon, pilpre, niwa}@softopia.pref.gifu.jp
Kazuhiko Yamamoto,
Faculty of Engineering, Gifu University,
1-1 Yanagido, Gifu-City, Gifu 501-1193, Japan
yamamoto@info.gifu-u.ac.jp
Abstract
A system for the detection of human faces and for the classification of hand postures of the Japanese Sign Language in color images inside an "intelligent" room is presented. We first propose to apply a combination of a skin chrominance-based image segmentation with a color vector gradient-based edge detection [1] [2] to efficiently detect faces and hands. Within the framework of a general approach, a statistical model for face detection based on invariant moments [3] [4] is used to discriminate between faces and hands in the segmented images. A novel approach to hand posture recognition based on phase-only correlation [5] is then adopted to classify a subset of static hand postures of the Japanese Sign Language, each posture representing a given phoneme, and also to discriminate between hand postures and the image scene background. Experiments show that the additional use of the color vector gradient significantly improves the correct rate of face detection, and that the phase-only correlation filter yields a high rate of discrimination between different static hand postures as well as between hand postures and the scene background. Ultimately, the system is to contribute to the implementation of meaningful human-machine interactions in a room that we are in the process of establishing, the percept-room, mainly for welfare applications.

s7.5

Title
Generation of Arm-gesture and Facial Expression for Intelligent Avatar Communications in the Internet
Authors
Sang-Woon Kim, School of Computer Science, Carleton University, Canada
Young-Who Lee, Div. of Computer Science and Engineering, Myongji University, Korea
Yoshinao Aoki, Graduate School of Engineering, Hokkaido University, Japan

Abstract
Recently, sign-language communication systems between avatars of different languages have been investigated as a means of overcoming the linguistic barrier. In these systems, an intelligent communication method has been employed, in which sets of animation parameters, such as the joint angles of the gesture, are transmitted instead of the entire real motion pictures.
However, the communication has been based on the gesture only, without considering the facial expression.
In this paper we conduct an experiment on extracting the animation parameters of the facial expression as well as the arm gesture, and generating them on various avatar models.
To extract the parameters to be transmitted, three kinds of key-frame editors are designed using techniques of inverse kinematics and partial differential equations. In generating facial expressions in particular, the movements of the cheeks and the jaws, as well as other facial components, are also implemented.
The simulation results suggest that the method could be a useful means for avatar communication between different languages in Internet cyberspace.

s7.6

Title
Affordable 3D Face Tracking Using Projective Vision
Authors
D.O. Gorodnichy,  S. Malik and G. Roth 
Computational Video Group, National Research Council, Ottawa, Canada K1A 0R6 
Abstract
For humans, viewing a scene with two eyes is clearly more
advantageous than viewing it with one eye. In computer vision, however,
most high-level vision tasks, face tracking being one example,
are still done with one camera only. This is because,
unlike in the human brain, the relationship between the images observed
by two arbitrary video cameras is, in many cases, not known. Recent
advances in projective vision theory, however, have produced a methodology
that allows one to compute this
relationship. The relationship is obtained naturally while observing
the same scene with both cameras, and knowing it not only makes it
possible to track features in 3D, but also makes tracking much more
robust and precise. In this paper, we establish a framework based on
projective vision for tracking faces in 3D using two arbitrary
cameras, and describe a stereo tracking system that uses the
proposed framework to track faces in 3D with the aid of two USB cameras. While being very affordable, our stereotracker exhibits pixel-size precision and is robust to head rotation about all three axes.

s7.7

Title
Extraction of Hand Features for Recognition of Sign Language Words
Authors
Nobuhiko Tanibata, Dept. of Computer-controlled Mechanical Sys., Osaka University.
Nobutaka Shimada, Dept. of Computer-controlled Mechanical Sys., Osaka University.
Yoshiaki Shirai, Dept. of Computer-controlled Mechanical Sys., Osaka University.
Abstract
This paper proposes a method to obtain hand features from sequences of images in which a person is performing Japanese Sign Language (JSL) against a complex background, and to recognize the JSL word.

In the first frame, we find the person's region, then search for the face and hands in order to determine a range of skin color, and search for the elbows to determine the position of each wrist.

In each frame, we track the face and the hands by using the determined skin color, and track the elbows by matching a template of an elbow shape. When the face and hands overlap, they are extracted by matching texture templates of the previous face and hands. Hand features such as the hand direction, the number of fingers, etc. are extracted from the hand regions and the wrist.

In order to recognize JSL words, we use a sequence of the hand features as input to an HMM. We first select the words which reach the final state of the HMM, and then choose the one with the highest probability. We conducted an experiment with real images of a professional JSL interpreter and recognized 65 JSL words successfully.

s7.8

Title
Robust Corner Tracking for Real-time Augmented Reality
Authors
Shahzad Malik, Gerhard Roth, Chris McDonald, Computational Video Group, IIT, NRC, Ottawa, Canada K1A 0R6 
Abstract
Vision-based registration techniques for augmented reality systems have been the subject of intensive research recently due to their potential to accurately align virtual objects with the real world. The downfall of these vision-based approaches, however, is their high computational cost and lack of robustness. This paper describes the implementation of a fast, but accurate, vision-based corner tracker that forms the basis of a pattern-based augmented reality system. The tracker predicts corner positions by computing a homography between known corner positions on a planar pattern and potential planar regions in a video sequence. Local search windows are then placed around these predicted locations in order to find the actual subpixel corner positions. Experimental results show the robustness of the corner tracking system with respect to occlusion, scale, orientation, and lighting.
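
The prediction step can be illustrated with a minimal sketch: estimate a homography from four pattern-to-image correspondences, then map the remaining known pattern corners into the image to obtain the centres of the local search windows. The abstract gives no implementation details, so the four-point direct linear solve, the helper names, and the numbers below are assumptions for illustration.

```python
def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(a)
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def homography_from_4_points(src, dst):
    """Homography H (with H[2][2] fixed to 1) mapping four src points to dst."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = solve(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def project(H, pt):
    """Apply H to a 2D point and divide by the homogeneous coordinate."""
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def search_window(center, half_size):
    """Axis-aligned window (x_min, y_min, x_max, y_max) around a prediction."""
    cx, cy = center
    return (cx - half_size, cy - half_size, cx + half_size, cy + half_size)


# Hypothetical ground-truth mapping, used only to synthesise image points.
H_true = [[120.0, 10.0, 200.0], [-8.0, 110.0, 150.0], [0.05, 0.02, 1.0]]
pattern_outer = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
image_outer = [project(H_true, p) for p in pattern_outer]

H_est = homography_from_4_points(pattern_outer, image_outer)
inner_corner = (0.5, 0.5)                  # a known corner inside the pattern
predicted = project(H_est, inner_corner)
window = search_window(predicted, 5.0)     # region to refine to subpixel accuracy
```

With more than four correspondences or noisy detections, a least-squares estimate would normally replace the exact four-point solve.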

s8.1

Title
Algorithme Génétique et Critère de la Trace pour l'Optimisation du Vecteur Attribut : Application à la Classification Supervisée des Images de Textures
Authors
M. NASRI
M. EL HITMY
Abstract
Parameter selection is a very delicate step in classification. In this paper we present a new method based on a genetic approach that optimizes the choice of parameters by minimizing a cost function. The cost function is chosen according to the trace criterion. The approach is validated on texture images. The proposed algorithm converges quickly to the optimal solution.

s8.2

Title
Geometrical and Topological Informations for Image Segmentation with Monte Carlo Markov Chain Implementation
Authors
P. Bourdon, Laboratoire SIC - Université de Poitiers - FRANCE
O. Alata, Laboratoire SIC - Université de Poitiers - FRANCE
G. Damiand, Laboratoire SIC - Université de Poitiers - FRANCE
C. Olivier, Laboratoire SIC - Université de Poitiers - FRANCE
Y. Bertrand, Laboratoire SIC - Université de Poitiers - FRANCE
Abstract
Image segmentation methods based on the Markovian assumption consist
in optimizing a Gibbs energy function that depends on the observation field
and the segmented field. This energy function can be represented as a
sum of potentials defined on cliques, which are subsets of the grid of
sites. The Potts model is the one most commonly used to represent the
segmented field. However, this model expresses only a potential on
the classes of nearest-neighbor pixels.
In this paper, we propose integrating global information, such as
the size of a region, into
the local potentials of the Gibbs energy. To extract this
information, we use a
representation model well known in geometric modeling: the topological
map.
Results on synthetic and natural images are provided, showing
improvements in the obtained segmented fields.

s8.3

Title
Anisotropic Diffusion by a Recursive Linear Convolving Method: Application to Space-time Segmentation and Pattern Recognition.
Authors
Ph. D. Santiago Venegas Martinez (Student: Universite Paris V)
Ph. D. Juan Manuel Rendon Mancha (Student: Universite Paris V)
Ph. D. Georges Stamon (Professor: Universite Paris V)
Abstract
This paper presents a recursive linear convolving method to perform
anisotropic diffusion in images. The proposed method, based on linear filtering, yields a useful evolving interface for
boundary propagation. Its novelty is that there is no need
to estimate local and global properties of the
propagating boundary beforehand, which makes the algorithm fast. These
properties can nevertheless be obtained directly from the evolving interface in our
proposed method. Since the proposed method, formulated in continuous space, can be implemented efficiently and robustly in discrete space, we propose practical applications such as
space-time segmentation and pattern recognition.

s8.4

Title
Histogram Characterization of Colored Textures Using One-Dimensional Moments and Chromaticity Diagram
Authors
Jacques BROCHARD, Majdi KHOUDEIR
Laboratoire IRCOM-SIC, UMR-CNRS 6615
Bvd Marie et Pierre Curie, BP 30179
86962 Futuroscope Chasseneuil Cedex, France
Abstract
We develop a new histogram characterization method for colored textures in order to recognize and classify them. The approach is based jointly on the chromaticity diagram and one-dimensional geometric moments. In the chromaticity diagram, we calculate the 1D distributions of wavelengths and purity factors. The color signature of an image is then built from 1D geometric moments computed on these two histograms and on the gray-level histogram. Texture classification on images from the "marbleandgranite.com" database confirms the validity of this approach.

s8.5

Title
Granulometry Using Transformation Based Techniques
Authors
Andrzej Zadorozny, Hong Zhang, Martin Jagersand
Affiliation:
Department of Computing Science
University of Alberta
Edmonton, Alberta Canada T6G 2E1
Abstract
In this paper, we present our study on using transformation-based
techniques for performing granulometry analysis. We
examine specifically the problem of particle size analysis
in oil sand images. In contrast to conventional methods
of size analysis, we avoid the difficult step of image segmentation
and derive the size distribution of the particles in the
images directly by the transformation techniques of Fourier
analysis and scale-space decomposition. We have tested our
techniques on both simulated artificial data and real video
images and demonstrated the feasibility of the proposed approaches.

s8.6

Title
A New Fast and Robust Circle Extraction Algorithm
Authors
Euijin KIM, Miki HASEYAME, and Hideo KITAJIMA
School of Engineering Hokkaido University
N-13, W-8, Kita-ku, Sapporo 060-0812, Japan
Abstract
This paper presents a fast algorithm that is capable of extracting circles from complicated and heavily corrupted images. The algorithm uses a least-squares fitting algorithm for arc segments. The arcs are segmented by using the short straight lines which are extracted by a fast line extraction algorithm. The arc segments are used to yield accurate circle parameters. Tests performed on synthetic and real-world images show that the algorithm quickly and accurately extracts circles from complicated and heavily corrupted images.
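
The least-squares step for an arc segment can be sketched as follows. The abstract does not state which fitting formulation is used, so this example assumes the common algebraic form x^2 + y^2 + D x + E y + F = 0 solved through its normal equations. The function names and sample arc are hypothetical.

```python
import math


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(a)
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_circle(points):
    """Algebraic least-squares circle fit; returns (cx, cy, radius)."""
    ata = [[0.0] * 3 for _ in range(3)]
    atb = [0.0] * 3
    for x, y in points:
        row = (x, y, 1.0)
        rhs = -(x * x + y * y)
        for i in range(3):
            atb[i] += row[i] * rhs
            for j in range(3):
                ata[i][j] += row[i] * row[j]
    d, e, f = solve(ata, atb)
    cx, cy = -d / 2.0, -e / 2.0
    return cx, cy, math.sqrt(cx * cx + cy * cy - f)


# Hypothetical arc: part of a circle with centre (3, -2) and radius 5.
arc = [(3.0 + 5.0 * math.cos(t / 10.0), -2.0 + 5.0 * math.sin(t / 10.0))
       for t in range(2, 15)]
cx, cy, r = fit_circle(arc)
```

Because the fit is linear in D, E, and F, it needs no iteration, which suits a method whose stated aim is speed. Very short or noisy arcs can bias this algebraic form.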

In.1

Title
Geometry and Statistics of Visual Space-Time
Authors
Cornelia Fermueller, Patrick Baker, Yiannis Aloimonos
Computer Vision Laboratory and Center for Automation Research at the University of Maryland, College Park, MD 20742-3275, USA.
Abstract
Although the fundamental ideas underlying research efforts in the field of computer vision have not radically changed in the past two decades, there has been a transformation in the way work in this field is conducted. This is primarily due to the emergence of a number of tools, of both a practical and a theoretical nature. One such tool, celebrated throughout the nineties, is the geometry of visual space-time. It is known under a variety of headings, such as multiple view geometry, structure from motion, and model building. It is a mathematical theory relating multiple views (images) of a scene taken at different viewpoints to three-dimensional models of the (possibly dynamic) scene. This mathematical theory gave rise to algorithms that take as input images (or video) and provide as output a model of the scene. Such algorithms are one of the biggest successes of the field and they have many applications in other disciplines, such as graphics. One of the difficulties, however, is that the current tools cannot yet be fully automated, and they do not provide very accurate results. More research is required for automation and high precision. During the past few years we have investigated a number of basic questions underlying the structure from motion problem. Our investigations resulted in a small number of principles that characterize the problem. These principles, which give rise to automatic procedures and point to new avenues for studying the next level of the structure from motion problem, are the subject of this paper.

In.2

Title
TLA Based Face Tracking
Authors
Matthew Turk, Changbo Hu, Rogerio Feris, Farshid Lashkari, Andy Beall
Computer Science Department and Psychology Department, University of California, Santa Barbara, CA 93106
Abstract
Human face tracking (HFT) is one of several technologies useful in vision-based interaction (VBI), which is one of several technologies useful in the broader area of perceptual user interfaces (PUI). In this paper we motivate our interests in PUI and VBI, and describe our recent efforts in various aspects of face tracking in the Interaction Lab at UCSB. The HFT methods (GWN, EHT, and CFD), in the context of VBI and PUI, are part of an overall “TLA approach” to face tracking.

In.3

Title
OpenCV: Examples of Use and New Applications in Stereo, Recognition and Tracking
Authors
Gary R. Bradski, Ph.D., Manager, Vision, Graphics and Pattern Recognition Group, Intel Labs, Intel. Email: gary.bradski@intel.com
Abstract
The Open Source Computer Vision Library (OpenCV) is an open source computer vision library, free for both research and commercial use. It was started by Intel and is now an open community effort, with Intel contractors responsible for code, documentation and bug fix integration as well as official builds. OpenCV is written in C/C++ but will automatically load the optimized assembly language dynamic libraries included with the code. It now also has a Matlab interface for convenient use with research code. See http://www.intel.com/research/mrl/research/opencv/ for information on how to get the code and/or join the user group. In this talk, we give an overview of the library content, give usage examples in C and Matlab, and then present some new work supporting stereo vision, face recognition, face feature tracking and audio-visual speech recognition.
Bio
Gary graduated from U.C. Berkeley and, after seven years of traveling, working and consulting, went back to graduate school to get a Ph.D. from the Cognitive and Neural Systems Department at Boston University (http://cns-web.bu.edu/), studying mathematical models of biological memory and vision. Through a sequence of events he still cannot fathom, he ended up as a quantitative analyst developing interest rate option pricing models on the options trading floor of First Union National Bank. He happily returned to vision and pattern recognition research a few years later at Intel Labs, where he initiated OpenCV among other efforts. He is currently manager of the Vision, Graphics and Pattern Recognition group at Intel Labs. He has a wife and two children, takes advantage of the great hiking around Silicon Valley, loves volleyball when he's not injured, and runs to stay sane (perhaps not quite fast enough).

Copyright © 2002  Vision Interface