Résumés des Articles / Paper Abstracts
s1.1
-
- Title
- An Adaptive-Sampling Algorithm for Object Representation
- Authors
- Robert Alterson, Dept. of Computer Science, York University
Minas Spetsakis, Dept. of Computer Science, York University
- Abstract
- We present a novel adaptive-sampling algorithm for spectral signature generation. This algorithm is designed to increase inter-object
discrimination and reduce feature-vector dimensionality. Our algorithm is applied to a
Gabor-feature based multi-resolutional object
detection and recognition scheme. In this context we study and analyze the detection and identification of unknown objects in a complex
background. Iterative, off-line optimization methods are employed to reduce computational
demands during the learning phase. Our representation scheme takes into account all items in a given object
library. It selects sample-point sets that maximize the inter-object distance. Thus, the presented method increases identification
robustness and can reduce the size of signature vectors.
s1.2
-
- Title
- Region-Based Image Retrieval using Wavelet Transform
- Authors
- Nobuo Suematsu, Faculty of Information Sciences, Hiroshima City University
Yoshihiro Ishida, Fuji-film Software Corp.
Akira Hayashi, Faculty of Information Sciences, Hiroshima City University
Toshihiko Kanbara, Faculty of Information Sciences, Hiroshima City University
- Abstract
- Content-based image retrieval, which provides convenient ways to
retrieve images from large image databases, has been studied actively.
While many previous image retrieval techniques do not look at
regions in an image, region-based image retrieval techniques have
been gaining attention recently. We propose a region-based image
retrieval method which performs image segmentation and indexing using
texture features computed from wavelet coefficients. The proposed
method has advantages in texture feature extraction and hierarchical
image segmentation over the previous region-based techniques using
wavelet transform.
-
s1.3
-
- Title
- Semantics Retrieval by Content and Context of Image Regions
- Authors
- Wei Wang, Yuqing Song and Aidong Zhang
Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY 14260, USA
- Abstract
- We propose a novel approach for semantics retrieval from images in
multimedia databases. In our approach, we use color-texture
classification to generate the codebook which is used to segment
images into regions. The content of a region describes the lower-level
features of the region, including color and texture. The context of
regions in an image describes their relationships in the image. The
content and context of image regions provide a way for semantics
retrieval. On top of semantics retrieval, high-level
(semantics-based) querying and query-by-example are supported. The
experimental results demonstrate that our approach outperforms the
traditional CBIR approaches.
-
s1.4
-
- Title
- Visual Abstraction of Wildlife Footage using Gaussian Mixture Models
- Authors
- David Gibson, University of Bristol
Neill Campbell, University of Bristol
Barry Thomas, University of Bristol
- Abstract
- In this paper, we present a novel approach for clip-based key frame
extraction. Our framework allows both clips with subtle changes as well
as clips containing rapid shot changes, fades and dissolves to be well
approximated. We show that creating key frame video abstractions can be
achieved by transforming each frame of a video sequence into an
eigenspace and then clustering this space using Gaussian Mixture Models
(GMMs). An iterative process computes a GMM configuration that best
clusters the data based on a maximum likelihood threshold. The images
nearest to the centre of each GMM component are selected as key
frames. Unlike previous work, this technique relies on global video clip
properties and results show that the key frames extracted give a very
good representation of the overall clip content. We show that, by using
a single threshold, an operator can easily control the number of
representative key frames generated. We also demonstrate that clustering
in eigen-time space improves the video abstractions in a quantifiable
manner and we demonstrate the application of this technique on a
database of 307 clips of wildlife footage containing dissolves, shot
changes, fades, pans, zooms and a wide range of animal behaviours.
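The final selection step described above (choosing the frame nearest to each GMM component centre) can be sketched as follows. This is only a minimal illustration, not the authors' implementation: it assumes the frames have already been projected into an eigenspace and that the component centres have already been estimated; the function name `select_key_frames` and the toy data are hypothetical.

```python
import math

def select_key_frames(frame_features, component_centres):
    """Return, for each component centre, the index of the nearest frame.

    frame_features: list of equal-length feature vectors, one per frame
                    (assumed here to be eigenspace coordinates).
    component_centres: list of GMM component means in the same space.
    """
    key_frames = []
    for centre in component_centres:
        nearest = min(
            range(len(frame_features)),
            key=lambda i: math.dist(frame_features[i], centre),
        )
        key_frames.append(nearest)
    return key_frames

# Toy example: six frames in a 2-D eigenspace forming two loose clusters.
frames = [(0.0, 0.3), (0.1, 0.1), (0.3, 0.0), (5.0, 5.4), (5.1, 5.0), (4.6, 5.0)]
centres = [(0.1, 0.15), (5.0, 5.0)]
print(select_key_frames(frames, centres))  # prints [1, 4]
```

With more components, more key frames are returned, which mirrors how the number of representative frames follows from the chosen GMM configuration.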
-
s1.5
-
- Title
- Motion Estimation by Object-Matching for Real-time Object-Based Video
Representation
- Authors
- Aishy Amer: INRS-Télécommunications
Amar Mitiche: INRS-Télécommunications
Eric Dubois: Université d'Ottawa
- Abstract
- Motion estimation plays a key role in many video applications, such as
frame-rate video conversion, video retrieval, video surveillance, and video
compression.
The key issue in these applications is to define
appropriate representations that can efficiently support motion estimation
with the required accuracy. In this paper, a low-complexity object
motion estimation technique is proposed that is designed to fit the needs of
content-based video representation for video surveillance and retrieval
applications. In these applications, representing object motion in a
way that is meaningful for high-level interpretation, such as event detection
and classification, takes precedence over precision of estimation. The proposed method relies
on the estimation of the displacements of the minimum bounding box (MBB)
sides of an object. Two motion estimation steps are proposed: (1) an initial
coarse estimation that finds a single displacement for an object using the four
sides of the MBB between two successive images, and (2) the detection of
non-translational motion and its estimation. The result is the detection of
the type of object motion and the subsequent estimation of one or more
motion values per object depending on the detected motion type. Special
consideration is given to object motion in interlaced video and at image
margins. Various simulations show that the proposed method provides a
response in real-time and gives good estimates to use for object tracking,
event detection, and high-level video representation.
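The two-step idea (a coarse displacement from the four MBB sides, then a check for non-translational motion) might look roughly like the sketch below. The abstract does not give the decision rule, so the agreement test between opposite sides, the `tolerance` value, and the function names are all assumptions made for illustration.

```python
def mbb_side_displacements(prev_box, curr_box):
    """Displacement of each MBB side between two successive frames.

    Boxes are (left, top, right, bottom) in pixel coordinates.
    """
    return tuple(c - p for p, c in zip(prev_box, curr_box))

def estimate_object_motion(prev_box, curr_box, tolerance=2):
    """Coarse motion estimate from MBB side displacements.

    Returns ("translation", (dx, dy)) when opposite sides move together
    (within `tolerance` pixels, an illustrative assumption); otherwise
    returns ("non-translational", per-side displacements).
    """
    d_left, d_top, d_right, d_bottom = mbb_side_displacements(prev_box, curr_box)
    horizontal_agrees = abs(d_left - d_right) <= tolerance
    vertical_agrees = abs(d_top - d_bottom) <= tolerance
    if horizontal_agrees and vertical_agrees:
        # One displacement per object: average the agreeing sides.
        return "translation", ((d_left + d_right) / 2, (d_top + d_bottom) / 2)
    return "non-translational", (d_left, d_top, d_right, d_bottom)

# A box shifted by (4, 1) pixels, then a box that widens horizontally.
print(estimate_object_motion((10, 20, 50, 80), (14, 21, 54, 81)))
print(estimate_object_motion((10, 20, 50, 80), (5, 20, 55, 80)))
```

Working on four numbers per object rather than on pixel blocks is what keeps this kind of estimate cheap, at the cost of precision.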
-
s1.6
-
- Title
- Hierarchical Indexing Images Using Weighted Low Dimensional Texture
Features
- Authors
- Mira Park, University of NSW, Australia
Jesse S. Jin, University of Sydney, Australia
Laurence S. Wilson, CSIRO, Australia
- Abstract
- This paper introduces a new method to analyse the image texture and to
index the image database. We present a new strategy to reduce the
computational time to extract image features with high retrieval accuracy.
We also propose a method to reduce the image feature dimension, so that any
robust indexing method can be used. By weighting the extracted image
features, a system may perceive the image consistently with human perception.
We use two spaces to keep the key images and the candidate images for
efficient indexing of the image database.
-
s1.7
-
- Title
- Generation of Images of Historical Documents by Composition of their
Components
- Authors
- Carlos A.B. Mello, UFPE
Rafael D. Lins, UFPE
- Abstract
- This paper describes a system for processing of historical documents.
First, documents are decomposed into their components (such as paper
texture, colours, typewritten parts, handwritten parts, pictures, etc). This
decomposition process allows for efficient storage, indexing and network
transmission, by factoring out common components. Document retrieval requires
re-assembling the document, synthesising an image visually close to
the original document. The information needed to build the final image
occupies, on average, 2 Kbytes, yielding a very efficient compression
scheme.
-
s2.1
-
- Title
- Motion Segmentation and Tracking
- Authors
- King Yuen Wong, Dept. of Computer Science, York University
Minas E. Spetsakis, Dept. of Computer Science, York University
- Abstract
- This paper presents a novel motion segmentation and tracking
algorithm. The tracking is done by fitting successively more elaborate
models of optical flow on the tracked region, and the new tracked
region is computed by a fast sum of squared differences statistic on
the previous images, properly aligned. The method can track objects in
image sequences with a moving background, taken by a hand-held camera,
tolerates up to 30 pixels of interframe motion, and takes 0.3 seconds per
frame pair of size 320 x 240 pixels on a 500 MHz Sun Blade 100
workstation.
-
s2.2
-
- Title
- Automated Visual Surveillance Using Hidden Markov Models
- Authors
- Vinod Nair, Center for Intelligent Machines, McGill University, Montreal PQ H3A 2A7
James J. Clark, Center for Intelligent Machines, McGill University, Montreal PQ H3A 2A7
- Abstract
- This paper describes an automated visual surveillance system that
detects suspicious human activity in a scene. The system is designed to: 1)
detect and track people in the scene, 2) recognize the "normal" activities
in the scene, and 3) detect anomalous activity by finding sufficiently large
deviations from the normal activity patterns. The stochastic time-sequence
recognition framework of the Hidden Markov Model (HMM) forms the basis of
activity recognition and anomaly detection. We have implemented the system
to monitor an office corridor in real-time using a Pentium III machine
running Windows 2000. The results show that the system correctly classifies
examples of normal activities in the corridor and identifies a mock break-in
attempt as suspicious activity.
-
s2.3
-
- Title
- Real-Time Tracking for Visual Interface Applications in Cluttered and
Occluding Situations
- Authors
- David J. Bullock, University of Guelph
John S. Zelek, University of Guelph
- Abstract
- Visual interface systems require object tracking techniques with
real-time performance for ubiquitous interaction. A probabilistic framework
for a visual tracking system which robustly tracks targets in real-time
using color and motion cues is presented. The algorithm is based on particle
filtering techniques of the ICondensation filter. An innovation of the paper
is the use of motion cues to guide the propagation of particle samples which
are being evaluated using color cues. This results in a probabilistic blob
tracking method which is shown to greatly outperform conventional blob
trackers in the presence of occlusion and clutter. The technique is
applied to the task of video annotation using a hand-held marking device.
-
s2.4
-
- Title
- Multi-object Motion Pattern Classification for Visual Surveillance and
Sports Video Retrieval
- Authors
- Akira Hayashi, Faculty of Information Sciences, Hiroshima City University
Ryuji Nakasima, Faculty of Information Sciences, Hiroshima City University
Toshihiko Kanbara, Faculty of Information Sciences, Hiroshima City University
Nobuo Suematsu, Faculty of Information Sciences, Hiroshima City University
- Abstract
- We present a scene activity pattern learning and recognition method
for visual surveillance and video retrieval. While scene activity
patterns are represented in terms of object trajectories in previous
work, they are represented in terms of the instantaneous motions of
multiple objects in each image in our method. In order to deal with a
variable number of objects in a scene, we propose to use moment
statistics as features. Our approach is based on clustering, a form of
unsupervised learning, and needs little human intervention.
Furthermore, the probabilistic model based clustering makes it easy to
detect abnormality.
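To illustrate why moment statistics cope with a variable number of objects, the sketch below maps any number of per-object motion vectors to a feature vector of fixed length. The abstract does not say which moments are used, so the choice of first and second moments of velocity, and the name `moment_features`, are assumptions.

```python
def moment_features(motions):
    """Fixed-length feature vector from a variable number of object motions.

    motions: list of (vx, vy) instantaneous velocities, one per object.
    Returns [mean_vx, mean_vy, var_vx, var_vy, cov_vxvy]; all zeros for an
    empty scene (an illustrative convention).
    """
    n = len(motions)
    if n == 0:
        return [0.0] * 5
    mean_vx = sum(vx for vx, _ in motions) / n
    mean_vy = sum(vy for _, vy in motions) / n
    var_vx = sum((vx - mean_vx) ** 2 for vx, _ in motions) / n
    var_vy = sum((vy - mean_vy) ** 2 for _, vy in motions) / n
    cov = sum((vx - mean_vx) * (vy - mean_vy) for vx, vy in motions) / n
    return [mean_vx, mean_vy, var_vx, var_vy, cov]

# Two objects and five objects both yield a 5-element vector.
print(moment_features([(1, 0), (3, 0)]))
print(len(moment_features([(1, 1), (2, 0), (0, 2), (1, -1), (3, 3)])))
```

Because every image yields a vector of the same length, a standard clustering method can be applied to the sequence of per-image features.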
-
s2.5
-
- Title
- A Novel Probability Model for Background Maintenance and Subtraction
- Authors
- Dongsheng Wang, National Laboratory of Pattern Recognition, Chinese Academy of Science
Tao Feng, Microsoft Research Asia
Harry Shum, Microsoft Research Asia
Songde Ma, National Laboratory of Pattern Recognition, Chinese Academy of Science
- Abstract
- "Background maintenance and subtraction" is a common element of many
computer vision applications. This paper introduces a novel background
model. The model has two components and processes the video
sequence at the pixel level and the frame level alternately. Its advantage
is that it can capture both the temporal and the spatial context of the
video sequence. At the pixel level, any probability model for the pixel process can
be used; at the frame level, we use a Markov Random Field (MRF). Then, for a
particular condition, video surveillance on a freeway, we propose a new
pixel-level model, an adaptive HMM. For the HMM, both offline and online learning
algorithms are discussed. For the MRF, we discuss the Belief Propagation algorithm.
In our experiments on freeway video surveillance, the model
solves the problems encountered, namely bootstrapping and gradual change of
illumination, and it can detect both moving vehicles and shadows.
Finally, we give our views on background maintenance and subtraction
derived from our practice.
-
s2.6
-
- Title
- Robot Navigation Using Panoramic Landmark Tracking
- Authors
- Mark Fiala, Department of Computing Science, University of Alberta
Dr. Anup Basu, Department of Computing Science, University of Alberta
- Abstract
- A vision based navigation system is presented for determining a mobile
robot's position and orientation using panoramic imagery. Omni-directional
sensors are useful in obtaining a 360 degree field of view, permitting
objects in the vicinity of a robot to be imaged simultaneously. Recognizing
landmarks in a panoramic image from an a priori model of distinct features
in an environment allows a robot's location information to be updated. A
system is shown for tracking vertex and line features for omni-directional
cameras constructed with catadioptric (containing both mirrors and lenses)
optics. With the aid of the Panoramic Hough Transform, line features can be
tracked without restricting the mirror geometry to that which satisfies the
single viewpoint criterion. Two paradigms for landmark tracking are explored,
with experiments shown on synthetic and real images. A working
implementation on a mobile robot is shown.
-
s2.7
-
- Title
- Wavelet Based Resolution Enhancement of Omnidirectional Image
- Authors
- Peng Qimin
Jia Yunde, Computer Science Department, Beijing Institute of Technology
- Abstract
- This paper proposes an approach for resolution enhancement of
omnidirectional images based on wavelet transform. The resolution
enhancement of an image is achieved by using local extrema
extrapolation of wavelet coefficients under a degradation model. The
fusion operation is applied to the coefficients of registered pixels in
the enhanced images of an image sequence. A fine resolution
enhancement image reconstructed via inverse wavelet transform is
discussed. The experimental results show the proposed approach is
feasible and efficient for resolution enhancement of omnidirectional
images.
-
s2.8
-
- Title
- The Detection of Obstacles Using Features by the Horizon View Camera
- Authors
- Ayami Iwata, Gifu University
Kunihito Kato, Gifu University
Kazuhiko Yamamoto, Gifu University
- Abstract
- In this paper, we propose a new camera system called Horizon View Camera
(HVC). The HVC is a system in which the optical axis of a camera is directed
at the horizon with a mirror so that the obtained image contains objects on the
ground without including the ground itself. Therefore, by using the HVC
system, separating objects from the ground becomes very easy. Moreover,
there are many other useful features in the HVC system. In order to improve
the processing speed and accuracy, we propose a new idea whereby the
detection of objects becomes easier and the results are more
accurate.
-
s3.1
-
- Title
- An Empirical Study of Some Feature Matching Strategies
- Authors
- Etienne Vincent, S.I.T.E., University of Ottawa
Robert Laganiere, S.I.T.E., University of Ottawa
- Abstract
- Several algorithms are proposed in the literature to solve the difficult
problem of feature point correspondence between images. These methods make
use of different properties of image point pairs in order to improve
the quality of the matching. This paper proposes an empirical evaluation
of their performance, and presents some new matching constraints. The
validation process used here determines the number of good matches and
the proportion of good matches in a given match set.
-
s3.2
-
- Title
- A Fast Area-Based Stereo Matching Algorithm
- Authors
- Luigi Di Stefano, DEIS, University of Bologna
Massimiliano Marchionni, DEIS, University of Bologna
Stefano Mattoccia, DEIS, University of Bologna
Giovanni Neri, DEIS, University of Bologna
- Abstract
- This paper presents an area-based stereo algorithm suitable for real-time
applications. The core of the algorithm relies on the uniqueness constraint
and on a matching process that allows for rejecting previous matches as soon
as more reliable ones are found. In this manner unreliable disparity
measurements are detected and discarded. The proposed approach is compared
with the matching process based on the left-right consistency constraint,
the latter being the basic method for detecting unreliable matches in many
area-based stereo algorithms. The algorithm has been carefully optimised to
obtain a very fast implementation on a Personal Computer. This paper
describes the computational optimisation strategy, which is based on a very
effective incremental calculation scheme yielding massive elimination of
redundant calculations. Finally, we provide experimental results obtained on
stereo pairs with ground-truth as well as computation-time measurements; we
compare these data with those obtained using a well-known, fast, area-based
algorithm relying on the left-right consistency constraint.
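For reference, the left-right consistency constraint used as the baseline above can be sketched on a single scanline as follows. This illustrates the generic check, not the authors' uniqueness-based algorithm; the disparity sign convention, the `max_diff` threshold, and the use of `None` for rejected pixels are assumptions.

```python
def left_right_consistency(disp_left, disp_right, max_diff=1):
    """Invalidate disparities that fail the left-right consistency check.

    disp_left[x]  : disparity of left-image pixel x (its match is at x - d
                    in the right image).
    disp_right[x] : disparity of right-image pixel x (its match is at x + d
                    in the left image).
    Returns a copy of disp_left with unreliable entries replaced by None.
    """
    checked = []
    for x, d in enumerate(disp_left):
        xr = x - d  # position of the match in the right scanline
        if 0 <= xr < len(disp_right) and abs(disp_right[xr] - d) <= max_diff:
            checked.append(d)
        else:
            checked.append(None)
    return checked

# One scanline: the last two left-image disparities disagree with the
# right-image disparities at their matched positions and are rejected.
print(left_right_consistency([0, 1, 1, 3, 2], [0, 1, 0, 2, 2]))
# prints [0, 1, 1, None, None]
```

Note that this check needs disparities computed in both directions, which is the cost that a single-pass uniqueness-based scheme seeks to avoid.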
-
s3.3
-
- Title
- Three-dimensional structure calculation: achieving accuracy without
calibration
- Authors
- B. Boufama, School of Computer Science, University of Windsor, Windsor, ON, Canada N9B 3P4
- Abstract
- This paper addresses the problem of computing the camera motion and the
three-dimensional structure of a scene using two uncalibrated images as
inputs. The camera motion is calculated by estimating the essential
matrix and using approximate values, easily available, for the intrinsic
parameters. The classical eight-point algorithm to calculate the
essential matrix is known to be very sensitive to pixel-noise even when the
intrinsic parameters are perfectly known. This paper shows that by using the
normalized eight-point algorithm, aimed at calculating the fundamental
matrix, the pixel-noise sensitivity is reduced significantly. More
importantly, we show that the intrinsic parameters do not have to be
accurately known in order to get very good quality reconstruction. In
particular, we have investigated and compared the effect of errors on the
intrinsic parameters together with pixel-noise on the calculated
motion/structure, when using the straightforward eight-point algorithm
and its normalized version respectively.
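The normalization step that distinguishes the normalized eight-point algorithm from the straightforward one can be sketched as below. This follows the commonly used isotropic form (centroid moved to the origin, mean distance scaled to sqrt(2)); whether the paper uses exactly this variant is an assumption, and the function name is hypothetical.

```python
import math

def normalize_points(points):
    """Isotropic normalization of 2-D image points.

    Translates the points so their centroid is at the origin and scales
    them so the mean distance from the origin is sqrt(2).
    Returns (normalized_points, T), where T is the 3x3 similarity
    transform (nested lists) mapping (x, y, 1) to normalized coordinates.
    Assumes the points are not all identical (mean distance > 0).
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mean_dist = sum(math.hypot(x - cx, y - cy) for x, y in points) / n
    s = math.sqrt(2) / mean_dist
    T = [[s, 0.0, -s * cx],
         [0.0, s, -s * cy],
         [0.0, 0.0, 1.0]]
    normalized = [(s * (x - cx), s * (y - cy)) for x, y in points]
    return normalized, T

# Four corners of a square in pixel coordinates map to roughly (+-1, +-1).
pts, T = normalize_points([(100, 200), (300, 200), (300, 400), (100, 400)])
print(pts)
```

The matrix estimated from normalized points in both images is then mapped back to pixel coordinates using the two transforms, so the normalization changes only the numerical conditioning of the estimation.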
-
s3.4
-
- Title
- A Stereo Confidence Metric Using Single View Imagery
- Authors
- Geoffrey Egnal, GRASP Laboratory, University of Pennsylvania
Max Mintz, GRASP Laboratory, University of Pennsylvania
Richard P. Wildes, Centre for Vision Research, York University
- Abstract
- Although stereo vision research has progressed remarkably, stereo
systems still need a fast, accurate way to estimate confidence in
their output. In the current paper, we explore using stereo
performance on two different images from a single view as a confidence
measure for a binocular stereo system incorporating that single view.
Although it seems counterintuitive to search for correspondence in two
different images from the same view, such a search gives us precise
quantitative performance data. Correspondences significantly far from
the same location are erroneous because there is little to no motion
between the two images. Using hand-generated ground truth, we
quantitatively compare this new confidence metric with five commonly
used confidence metrics on a uniform basis. We explore the performance
characteristics of each metric under a variety of conditions.
-
s3.5
-
- Title
- Limiting the Search Range of Correlation Stereo Using Silhouettes
- Authors
- Geoffrey Egnal, GRASP Laboratory, University of Pennsylvania
Max Mintz, GRASP Laboratory, University of Pennsylvania
Kostas Daniilidis, GRASP Laboratory, University of Pennsylvania
- Abstract
- We combine two basic approaches to 3D reconstruction: silhouette-based
and correspondence-based approaches. The two approaches have
complementary costs and benefits. Silhouette-based approaches deliver
volumetric descriptions which often have very few outliers, but they
cannot reconstruct concave surfaces. Correspondence-based approaches
give surface descriptions with sub-pixel accuracy, but their search
range either allows outliers or falls short of the correct match. We
show that a combination of the two can deliver fine-grained accuracy
with few outliers. Our specific implementation uses the silhouette
reconstruction as prior data to center and bound a stereo search
process. We explore the different performance characteristics of the
three methods qualitatively and quantitatively using real
imagery.
-
s3.6
-
- Title
- Towards a fast and reliable dense matching algorithm
- Authors
- K. Jin and B. Boufama
University of Windsor
- Abstract
- In this paper, we present a dense matching algorithm which utilizes
the corner and edge features of images to increase the reliability and
to speed up the process of dense matching of two uncalibrated images.
The major problem of classical area-based dense matching algorithms is
the high computational time resulting from intensive correlation
calculations for match candidates. Although some methods have attempted to
integrate image feature information in the dense matching of uncalibrated
images, most of these methods are not practical and are difficult to
implement. This paper aims at designing a hybrid matching algorithm that
preserves disparity continuity on continuous object surfaces, while
discontinuities at object boundaries are treated differently in the matching
process. In particular, both CPU time and the likelihood of mismatches are
reduced while the implementation is kept simple and straightforward.
-
s4.1
-
- Title
- Reconnaissance en-ligne de caractères arabes manuscrits par un réseau de Kohonen (On-line Recognition of Handwritten Arabic Characters with a Kohonen Network)
- Authors
- Neila Mezghani and Amar Mitiche, INRS-Telecommunications
Mohamed Cheriet, Ecole de Technologie Supérieure
- Abstract
- Neuromimetic networks have proven themselves in pattern classification
through their high recognition rates and low computation times. The aim of
this study is to build a Kohonen memory for the on-line recognition of
handwritten Arabic characters. Experiments were carried out on an on-line
database that we designed for this purpose and achieved a recognition rate
of 84.43%.
s4.2
-
- Title
- Improved Method of Handwritten Digit Recognition Tested on MNIST
Database
- Authors
- Ernst Kussul, UNAM, Centro de Instrumentos
Tatyana Baidyk, UNAM, Centro de Instrumentos
- Abstract
- The MNIST database serves as a benchmark for comparing methods of handwritten
digit recognition. Among the many published recognition rates for different
classifiers, our neural classifier ranked second [1]
(recognition rate 99.21%). We are currently developing improvements to the neural
network structure and the handwritten digit recognition algorithms. The improved
classifier has a recognition rate of 99.37%, the best of the
known results. In this paper we briefly describe the general structure of our
classifier and the latest improvements.
-
s4.3
-
- Title
- Combination of Decisions by Multiple Document Object Locators
- Authors
- Jung Soh, Visual Information Processing Team, Computer and Software Research Laboratory, Electronics and Telecommunications Research Institute, Daejeon, Korea
- Abstract
- This paper presents a method for combining multiple document object
locators tuned to different object characteristics, with the goal of
achieving location performance exceeding that of any individual locator.
The method includes (i) a scheme for consistent representation of
locator outputs regardless of output levels, (ii) the notion of object
correspondence and its application to determining what decisions to
combine, (iii) a mechanism for representing knowledge of locators and
its use for dynamic locator selection, and (iv) functions for combining
confidence values of objects. Results from experiments in postal address
block location using three locators and 1,100 envelope images are
presented.
-
s4.4
-
- Title
- Simulating Eye Movement in Reading using Short-term Memory.
- Authors
- Satoru MORITA, Faculty of Engineering, Yamaguchi University
- Abstract
- We propose a computational model of eye movement based on
short-term memory. We also apply this model to simulating human
eye movement in reading. To simulate human eye movement, we use foveated
vision, in which the resolution is high at the centre of the retina and low
at its periphery.
-
s4.5
-
- Title
- Road Sign Recognition by One Fixation of Space-Variant Sensor
- Authors
- D.G. Shaposhnikov, A.B. Kogan Research Institute for Neurocybernetics, Rostov State University
L.N. Podladchikova, A.B. Kogan Research Institute for Neurocybernetics, Rostov State University
A.V. Golovan, A.B. Kogan Research Institute for Neurocybernetics, Rostov State University
N.A. Shevtsova, A.B. Kogan Research Institute for Neurocybernetics, Rostov State University
K. Hong, School of Computing Science, Middlesex University
X.W. Gao, School of Computing Science, Middlesex University
- Abstract
- A biologically plausible model for traffic sign detection and recognition
that is invariant to variable viewing conditions is presented, together with
the results of testing the model on real-world British traffic signs.
The developed model for sign description and recognition by one
fixation of a space-variant sensor simulates some mechanisms of the real
visual system such as space-variant representation of information from the
centre (the fovea) to the periphery of the retina, neuronal orientation
selectivity, and context encoding of information. After successive
procedures of colour segmentation of initial real-world images,
classification according to sign colours and external forms, and
determination of the centre of the inner informative sign part, 85% of
potential traffic sign images were correctly identified for various weather
conditions by one fixation of the developed space-variant sensor.
-
s4.6
-
- Title
- Simple Distances Between Handwritten Signatures
- Authors
- J.R. Parker, Laboratory for Computer Vision, Department of Computer Science, University of Calgary, Calgary, Alberta, Canada
parker@cpsc.ucalgary.ca
- Abstract
- When analyzing handwritten signatures using a computer, a certain
amount of variation within any particular class is to be expected.
Successful recognition demands that this variation be less than that
between two different signatures. This paper describes three simple
ways to compare signatures that do not use any complicated or
derivative feature measurements, each of which defines a distance
between signatures that allows individual variation.
-
s4.7
-
- Title
- Hybrid system for recognition of handwritten symbols on the base of
structural methods and neural networks
- Authors
- R. Sadykhov, prof., vice-president of Belarussian IAPR
O. Malenko, PhD
A.N. Klimovich, PhD
M.L. Selinger, PhD
- Abstract
- The problem of handwritten symbol recognition has been investigated. An
algorithm for approximation and elimination of breaks has been developed. This
approach makes it possible to simplify the description of symbols and remove
errors. A structural recognition method has been used that describes the
structure of handwritten symbols by means of primitive sequences and is
robust to the geometrical distortions typical of such images. In
combination with this description method, a classifier
based on comparison with an ideal image has been applied. A technique for
analysing and optimizing the choice of feature extraction algorithm
and its parameters, based on estimating the clustering quality of the
training sample with self-organizing neural networks, has been
implemented. An algorithm for computing Legendre moments is presented. A new
method for training an RBF neural network is presented. Classification
results for binary images (handwritten Arabic numerals) are presented. On
the basis of the classification results, recommendations are given for choosing
the maximal order of Legendre moments and among various classifiers.
-
s5.1
-
- Title
- Simultaneous Computation of Defocus Blur and Apparent Shifts in Spatial
Domain
- Authors
- Francois Deschenes, Universite de Sherbrooke / Ecole des Mines de Paris
Djemel Ziou, Universite de Sherbrooke
Philippe Fuchs, Ecole des Mines de Paris
- Abstract
- This paper presents an algorithm for a cooperative and simultaneous
estimation of depth cues: defocus blur and spatial shifts (stereo
disparities, 2D motion, and/or zooming disparities). These cues are
estimated from two images of the same scene acquired by a camera evolving in
time and/or space and for which the intrinsic parameters are known. This
algorithm is based on generalized moment expansion. We show that the more
blurred image may be expressed as a function of the partial derivatives of
the two images, the blur difference and the horizontal and vertical shifts.
Hence, these depth cues can be computed by solving a system of equations.
The proposed algorithm is tested using synthetic and real images. The
results are dense and accurate. They confirm that defocus blur and spatial
shifts can be simultaneously computed at a single scale.
-
s5.2
-
- Title
- Stable Recovery of Shape and Motion from Partially Tracked Feature
Points with Fast Nonlinear Optimization
- Authors
- Akira Amano, Hiroshima City University
Tsuyoshi Migita, Hiroshima City University
Naoki Asada, Hiroshima City University
- Abstract
- In the shape from motion problem, nonlinear optimization methods have
practical advantages over linear methods such as
factorization: they can handle partially tracked
feature points, which are unavoidable in real situations, and they can
handle strongly perspective images. However, the
resulting shape is not guaranteed to be globally optimal, and the
calculation is relatively slow. In this paper, we propose two
effective methods to cope with these problems. One is an indirect search
method, which enables stable computation of the globally optimal solution. The
other is the PCG (Preconditioned Conjugate Gradient) method, which
enables 3-9 times faster computation without loss of accuracy compared
to the generally used Levenberg-Marquardt method. Experimental results on
real scenes have shown the effectiveness of our method.
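As background on the PCG method named above, the generic preconditioned conjugate gradient iteration for a symmetric positive-definite linear system is sketched below. The abstract does not state which preconditioner is used or how the iteration is embedded in the shape-from-motion problem, so the Jacobi (diagonal) preconditioner, the dense-matrix representation, and the function name `pcg` are assumptions made purely for illustration.

```python
def pcg(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive-definite A.

    Uses conjugate gradients with a Jacobi (diagonal) preconditioner.
    A is a dense matrix given as nested lists; b is a list.
    """
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]  # inverse of the preconditioner

    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    z = [m * ri for m, ri in zip(inv_diag, r)]    # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:                # converged
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [m * ri for m, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x

# Small SPD system whose exact solution is x = (1/11, 7/11).
print(pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0]))
```

Each iteration needs only matrix-vector products, which is why methods of this family can be faster than approaches that factor a full matrix at every step.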
-
s5.3
-
- Title
- Reconstructing Depth from Spatiotemporal Curves
- Authors
- Rui Rodrigues, Universidade do Minho
António Ramires Fernandes, Universidade do Minho
Kees van Overveld, Philips Research
Fabian Ernst, Philips Research
- Abstract
- We present a novel approach for 3D reconstruction based on multiple
video frames taken from a static scene. Our solution emerges from the
spatiotemporal analysis of video frames. The method is based on a best
fitting scheme for spatiotemporal depth curves, which allows us to compute
3D world coordinates of the objects within the scene. As opposed to a large
number of current methods, our technique deals with random camera movements
in a transparent way, and even performs better in these cases than with pure
translation. Robustness against occlusion, noise and aliasing is inherent to
the method as well.
-
s5.4
-
- Title
- An Iterative Method for Improving Bas-relief Ambiguity
- Authors
- Sung-Kee Park, Korea Institute of Science and Technology
Munsang Kim, Korea Institute of Science and Technology
In So Kweon, Korea Advanced Institute of Science and Technology
- Abstract
- Structure from motion under a small-motion assumption, as in optical
flow and direct methods, inevitably suffers from the motion ambiguity
between translation and rotation. To resolve it, conventional methods
adopt a single-line methodology: first they try to find exact corresponding
points using only image brightness, and then they estimate motion
parameters and scene depths from those correspondences. However, because
existing correspondence methods can have large errors whose error model is
difficult to define, single-line methods cannot reduce the ambiguity even
when they introduce a robust statistical estimator.
Therefore, on the assumption that corresponding points and motion
estimates have to be refined iteratively, we propose a new method for
reducing this ambiguity using a stereo image sequence.
-
s5.5
-
- Title
- Bayesian Real-Time Optical Flow
- Authors
- John S. Zelek
School of Engineering, University of Guelph
- Abstract
- Optical flow can be used to compute motion detection, time to collision,
structure, focus of expansion as well as object segmentation.
Unfortunately, most optical flow techniques do not provide accurate and
dense measures that are useful for these types of computations. In
addition, most techniques are also computationally slow. However, one
method proposed by Camus is able to perform optical flow computations in
real time by capitalizing on redundancies in the computation and on
spatial-temporal sampling trade-offs. It is a simple technique based on
simulating various motions and computing the SD (sum-difference) of
patches. Its problem is that the produced field is inaccurate and
arbitrary in aperture and blank wall situations. We show
that the simulated futures are the factored samples that
produce the likelihood probabilities that can be used in a particle
filtering framework. Maximizing or minimizing the likelihood, or computing
its expectation, at a particular location does not
necessarily produce the proper flow. We suggest that likelihoods are
well behaved when their variance is small, and these are propagated
first to address aperture problems and second to address the
extended blank wall problem. We show this propagation with
thresholded likelihood values and speculate on how the likelihood
distributions can be integrated into an algorithm that has its basis
in particle filtering.
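The patch-matching step that the abstract attributes to Camus can be sketched as follows. This is a minimal illustration and not the author's implementation: it assumes that the SD measure is a sum of absolute differences, it uses tiny grayscale frames stored as nested lists, and the names `patch_sad` and `sad_table` are hypothetical. It keeps the score of every simulated motion rather than only the best one, since the abstract argues that taking the minimum alone does not necessarily give the proper flow.

```python
def patch_sad(prev, curr, x, y, dx, dy, r):
    """Sum of absolute differences between the patch centred at (x, y) in
    `prev` and the patch centred at (x + dx, y + dy) in `curr`."""
    total = 0
    for j in range(-r, r + 1):
        for i in range(-r, r + 1):
            total += abs(prev[y + j][x + i] - curr[y + dy + j][x + dx + i])
    return total


def sad_table(prev, curr, x, y, r=1, max_shift=1):
    """Score every simulated motion (dx, dy) within +/- max_shift pixels."""
    return {
        (dx, dy): patch_sad(prev, curr, x, y, dx, dy, r)
        for dy in range(-max_shift, max_shift + 1)
        for dx in range(-max_shift, max_shift + 1)
    }


# Made-up test data: a bright 2x2 blob that moves one pixel to the right.
prev_frame = [[0] * 7 for _ in range(7)]
curr_frame = [[0] * 7 for _ in range(7)]
for yy in (3, 4):
    for xx in (2, 3):
        prev_frame[yy][xx] = 9
        curr_frame[yy][xx + 1] = 9

scores = sad_table(prev_frame, curr_frame, x=3, y=3)
best_motion = min(scores, key=scores.get)  # motion with the lowest score
```

On a uniform region every entry of `scores` would be equal, which is the blank wall situation: the table carries no preference for any motion, so picking its minimum is arbitrary.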
-
s5.6
-
- Title
- A Thin Lens Based Camera Model for Depth Estimation from Blur and
Translation by Zooming
- Authors
- Masashi Baba, Hiroshima City University
Naoki Asada, Hiroshima City University
Ai Oda, Hiroshima City University
Tsuyoshi Migita, Hiroshima City University
- Abstract
- Depth recovery is a central concern in computer vision, and many methods
have been proposed for monocular depth estimation by zooming as well as by
focusing and irising. Two distinct approaches to depth by zooming have
been taken: one is from motion parallax along the optical axis
using a pinhole camera model, and the other from image blur using a thin
lens camera model. This paper presents a new camera model that accounts
for both effects of image blur and lens center translation by zooming.
We first discuss the optical properties of zoom lenses, then present a
thin lens based camera model that describes the mutual relationship
between zoom, focus and iris parameters. Using this model with
calibration results, we have performed some experiments with real images
and evaluated the accuracy of the depth information recovered from image
blur and lens center translation. Experimental results have demonstrated
the validity of our camera model and also shown its applicability to the
depth estimation from blur and translation by zooming.
-
s5.7
-
- Title
- Retrieval of the Calibration Matrix from the 3-D Projective Camera Model
- Authors
- Gamal H. Seedahmed and Toni Schenk
Photogrammetry Group, Dept. of Civil and Environmental Engineering and Geodetic Science
The Ohio State University, 2070 Neil Avenue, Columbus OH 43210-1275, USA
{seedahmed.1 | schenk.2}@osu.edu
- Abstract
- By relating the projective camera model to the perspective one, the
intrinsic camera parameters constitute what is called the calibration
matrix. This paper presents two new methods to retrieve the calibration
matrix from the projective camera model. In both methods, a collective
approach is adopted, using matrix representation. The calibration matrix
is retrieved from a quadratic matrix term. The two methods are framed
around a correct use of Cholesky factorization to decompose the quadratic
matrix term. The first method uses an iterative Cholesky factorization to
retrieve the calibration matrix from the quadratic matrix term. The second
method uses Cholesky factorization to factor the quadratic matrix term
after its inversion. The basic argument behind the two methods is that the
direct use of Cholesky factorization does not reveal the correct
decomposition due to the missing matrix structure in terms of lower-upper
ordering. Both methods successfully retrieve the calibration matrix.
This paper explains the key ideas behind the two methods and
accompanies them with a simulated example to demonstrate their validity.
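The lower-upper ordering issue can be illustrated with a small numerical sketch. It assumes that the quadratic matrix term has the form A = K K^T with K upper triangular, which is the usual structure of a calibration matrix but is not stated in the abstract; the function names are hypothetical and the values in `K_true` are made up. A standard Cholesky routine returns a lower-triangular L with A = L L^T, which is not K. Factoring the inverse instead gives A^-1 = K^-T K^-1, whose lower-triangular Cholesky factor is K^-T, so K is recovered by inverting and transposing that factor. This follows the idea of the second method only; the iterative first method is not shown.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def cholesky_lower(a):
    """Lower-triangular L with a = L L^T (a symmetric positive definite)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(s) if i == j else s / l[j][j]
    return l


def invert_lower(l):
    """Invert a lower-triangular matrix column by column (forward substitution)."""
    n = len(l)
    inv = [[0.0] * n for _ in range(n)]
    for c in range(n):
        for i in range(c, n):
            rhs = 1.0 if i == c else 0.0
            s = rhs - sum(l[i][k] * inv[k][c] for k in range(c, i))
            inv[i][c] = s / l[i][i]
    return inv


def recover_upper_factor(a):
    """Upper-triangular K with a = K K^T, via Cholesky of the inverse of a."""
    l1_inv = invert_lower(cholesky_lower(a))
    a_inv = matmul(transpose(l1_inv), l1_inv)   # a^-1 = L1^-T L1^-1
    l2 = cholesky_lower(a_inv)                  # l2 equals K^-T
    return transpose(invert_lower(l2))          # K = (l2^-1)^T


# Hypothetical calibration matrix: focal lengths and principal point.
K_true = [[800.0, 0.0, 320.0],
          [0.0, 780.0, 240.0],
          [0.0, 0.0, 1.0]]
A = matmul(K_true, transpose(K_true))

L_direct = cholesky_lower(A)       # lower triangular, so it cannot equal K_true
K_rec = recover_upper_factor(A)    # upper triangular, matches K_true
```

Both `L_direct` and `K_rec` reproduce A when multiplied by their own transpose; only the triangular ordering differs, which is why the direct factor cannot be read as a calibration matrix.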
-
s5.8
-
- Title
- Camera Calibration with a Viewfinder
- Authors
- Mohamed Bénallal, École des Mines de Paris, Paris, France
Jean Meunier, Université de Montréal, D.I.R.O.
- Abstract
- To answer the industrial need for a simple camera
calibration procedure, we propose a new method that
requires only a simple calibration object composed of
a box and two crosses. The box is opened in the front
where a large cross, made of wires, is attached while
another is drawn (or attached) at the bottom. Both
crosses are perfectly aligned similarly to a viewfinder.
The viewfinder is first oriented with respect to the camera
such that the optical axis of the camera passes through the
center of both crosses, allowing the display of a single
(superimposed) cross and an immediate reading of the
coordinates of the optical axis. Then, using the similar
triangles theorem, the focal distance can be easily
estimated. In addition, if necessary, the method can
determine the orientation of the CCD matrix if it is not
perfectly perpendicular to the optical axis by solving a
simple linear system. This method should be particularly
useful for calibration of cameras in situ, such as
microscopes or embedded cameras.
-
s6.1
-
- Title
- Person Identification Technique Using Human Iris Recognition
- Authors
- Christel-Loïc TISSE, AST-Rousset Lab. STMicroelectronics.
Lionel TORRES, LIRMM Universite de Montpellier.
Michel ROBERT, LIRMM Universite de Montpellier.
- Abstract
- The biometric person authentication technique based on the pattern of
the human iris is well suited to be applied to any access control system
requiring a high level of security. This paper examines a new iris
recognition system that implements (i) gradient decomposed Hough transform /
integrodifferential operators combination for iris localization and (ii)
the "analytic image" concept (2D Hilbert transform) to extract pertinent
information from iris texture. All these image-processing algorithms have
been validated on a database of noisy real iris images. The proposed innovative
technique is computationally effective as well as reliable in terms of
recognition rates.
-
s6.2
-
- Title
- N-Feature Neural Network Human Face Recognition
- Authors
- Javad Haddadnia, ECE dep., University of Windsor, Windsor, Ontario, Canada, N9B 3P4
Karim Faez, EE dep., Amirkabir University of Technology, Tehran, Iran, 15914
Majid Ahmadi, ECE dep., University of Windsor, Windsor, Ontario, Canada, N9B 3P4
- Abstract
- This paper introduces a novel method for human face recognition that
combines a set of different kinds of features from face images with a Radial
Basis Function (RBF) neural network, denoted the Hybrid N-Feature Neural
Network (HNFNN) human face recognition system. The face image is projected
by each of the appropriately selected transform methods in parallel. Experimental
results for human face recognition confirm that the proposed method lends
itself to higher classification accuracy relative to existing
techniques.
-
s6.3
-
- Title
- Face Reconstruction From Shading Using Smooth Projected Polygon NN
- Authors
- Mohamad Ivan Fanany, Dept. of Computer Science, Graduate School of Science and Engineering, Tokyo Institute of Technology
Masayoshi Ohno, Dept. of Computer Science, Graduate School of Science and Engineering, Tokyo Institute of Technology
Itsuo Kumazawa, Dept. of Computer Science, Graduate School of Science and Engineering, Tokyo Institute of Technology
- Abstract
- In this paper, we present a neural-network learning scheme for face
reconstruction. This scheme, which we call the Smooth Projected Polygon
Representation Neural Network (SPPRNN), successively refines the
polygon vertex parameters of an initial 3D shape based on depth maps of
several calibrated images taken from multiple views. The depth maps, which
are obtained with the Tsai-Shah shape from shading (SFS) algorithm, can
be considered partial 3D shapes of the face to be reconstructed. The
reconstruction is finalized by mapping the texture of the face images onto
the initial 3D shape. We investigate three issues concerning the
effectiveness of this scheme. First, how effectively SFS provides partial
3D shapes compared with simply using 2D images.
Second, how to generate a smooth projected polygonal model, which is
essential in order to approximate the face structure and improve the
convergence rate of this scheme. Third, how an appropriate initial 3D
shape should be selected and used in order to improve model resolution and
learning stability. By carefully addressing these three issues, our
preliminary results show that a compact and realistic 3D model of a human
(mannequin) face can be obtained.
-
s6.4
-
- Title
- 3D Head Models Retrieval Based on Hierarchical Facial Region Similarity
- Authors
- Horace H S Ip, Image Computing Group, Department of Computer Science, Centre for Innovative Applications of Internet and Multimedia Technologies (AIMtech Centre), City University of Hong Kong
William Y F Wong, Image Computing Group, Department of Computer Science, City University of Hong Kong
- Abstract
- This paper presents a technique for 3D head model retrieval. The
approach combines a 3D shape representation scheme and hierarchical indexing
of 3D models based on facial region similarity. The proposed shape
similarity measure is based on comparing 3D model shape signatures computed
from the Extended Gaussian Images of surface normals. The technique is made
highly efficient and scalable by partitioning the 3D head model into
distinctive facial regions and building a hierarchical index for the head
model database. Our database contains over 1,000 models, and each
head model is represented with up to about 3,000 surfaces. Furthermore, we
have developed a novel user interface for specifying visual queries and
interacting with the retrieval system. Comparing retrieval performance
with Eigenheads, we show that our approach performs similarly
but is computationally more efficient by several orders of
magnitude. This makes our approach a practical solution for large model
databases.
s6.5
-
- Title
- Intrinsic Filtering of Range Images Using a Physically Based Noise Model
- Authors
- Pierre Boulanger, Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada, T6G 2E8
Olli Jokinen, Institute of Photogrammetry and Remote Sensing, Helsinki University of Technology, P.O. Box 1200, FIN-02015 HUT, Finland
Angelo Beraldin, Institute for Information Technology, National Research Council, Canada, Ottawa, Ontario, K1A 0R6
- Abstract
- This paper presents a new multi-scale range data filtering
technique which produces a scale-space filtering analogous to
Gaussian filtering but with several interesting properties such as
viewpoint invariance and automatic edge preservation. One of the
main contributions of this paper is that it takes into account a
physical model of the sensor to ensure optimum filtering of the
signal. Using this filter, new algorithms can be developed to detect
depth and orientation discontinuities at multiple scales, or to segment
range data robustly based on the sign of the Gaussian and mean curvatures.
-
s6.6
-
- Title
- Pose Error Effects on Range Sensing
- Authors
- William R. Scott, Dept. of Electrical Engineering, University of Ottawa and National Research Council, Institute for Information Technology, Computational Video Group
Gerhard Roth, National Research Council, Institute for Information Technology, Computational Video Group
Jean-Francois Rivest, Dept. of Electrical Engineering, University of Ottawa
- Abstract
- Object reconstruction and inspection using a range camera requires a
positioning system to configure relative sensor-object geometry in a
sequence of poses. Discrepancies between commanded and actual poses can
result in serious scanning deficiencies. This paper provides an analytical
and experimental characterization of pose error effects for a common type of
range camera.
-
s6.7
-
- Title
- Estimating Expansion Rates from Range Data Sequences
- Authors
- Hagen Spies, University of Heidelberg
John Barron, University of Western Ontario
- Abstract
- We present a method to compute surface expansion rates from
sequences of range data. Towards this end the 3D velocity field
(range flow) is extracted first and then used in a second step to
estimate the local area expansion. A detailed performance analysis
is presented and the method is applied to two real examples.
-
s7.1
-
- Title
- Detection and Tracking of Eyes for Gaze-camera Control
- Authors
- Shinjiro Kawato, ATR Media Information Science Laboratories
Nobuji Tetsutani, ATR Media Information Science Laboratories
- Abstract
- We propose new algorithms to extract and track the positions of eyes
in a real-time video stream. For extraction of eye positions, we
detect blinks based on the differences between successive images.
However, eyelid regions are fairly small. We propose a method to
distinguish them from head movement. For eye position tracking, we use
an updating template based on a "Between-the-Eyes" pattern instead of
the eyes themselves. Eyes are searched for based on the current position of
"Between-the-Eyes" and their geometrical relations to the position in
the previous frame. The "Between-the-Eyes" pattern is easier to locate
accurately than eye patterns. We implemented the system on a PC with a
Pentium III 866MHz CPU. The system runs at 30 frames per second and
robustly detects and tracks the eyes.
s7.2
-
- Title
- Nouse 'Use Your Nose as a Mouse' - a New Technology for
Hands-free Games and Interfaces
- Authors
- D.O. Gorodnichy, Computational Video Group, IIT, NRC, Ottawa, Canada K1A 0R6
S. Malik, School of Computer Science, Carleton University, Ottawa, Canada, K1S
G. Roth, Computational Video Group, IIT, NRC, Ottawa, Canada K1A 0R6
- Abstract
- With the invention of fast USB interfaces, the recent increase in
computer power, and the decrease in camera cost, it has become very common
to see a camera on top of a computer monitor. Vision-based games and
interfaces, however, are still not common, despite the recognized
benefits vision could bring: hands-free control, multiple-user
interaction, etc. The reason for this lies in the inability to track human
faces in video both precisely and robustly. This paper describes a
face tracking technique based on tracking a convex-shape nose feature
which resolves this problem. The technique has been successfully applied
to interactive computer games and perceptual user interfaces. These
results are presented.
s7.3
-
- Title
- Hand Shape Estimation Using Sequence of Multi-Ocular Images Based on
Transition Network
- Authors
- Yasushi HAMADA, Dept. of Computer-Controlled Mechanical Systems, Osaka University
Nobutaka SHIMADA, Dept. of Computer-Controlled Mechanical Systems, Osaka University
Yoshiaki SHIRAI, Dept. of Computer-Controlled Mechanical Systems, Osaka University
- Abstract
- This paper presents a method of hand posture estimation from
silhouette images taken by multiple cameras. For each image, we
extract a feature vector from the silhouette contour of the hand. We
construct an eigenspace from the feature vectors extracted from hands
in various postures. The feature vectors projected into the eigenspace
are registered as models. The matching criterion for each image is
defined as the distance to the model. The hand shape is estimated by
retrieving the registered model that matches the input well. For effective
matching, we define a shape complexity for each image to see how well
the shape feature is represented. For a set of input images taken by
multiple cameras at each time, the total matching criterion is evaluated
by combining the matching criteria of the set of images using the shape
complexities. For rapid processing, we limit the matching candidates by
using a constraint on the shape change. The possible shape transitions
are represented by a transition network. Because the network is hard
to build, we apply offline learning, where nodes and links are
automatically created by showing examples of hand shape sequences. We
present experiments on building the transition networks and on the
performance of matching using the network.
-
s7.4
-
- Title
- Robust Face Detection and Japanese Sign Language Hand Posture
Recognition for Human-Computer Interaction in an "Intelligent" Room
- Authors
- Jean-Christophe Terrillon, Arnaud Pilpre, Yoshinori Niwa, Office of Regional Intensive Research Project (HOIP), Softopia Japan Foundation, 4-1-7 Kagano, Ogaki-City, Gifu 503-8569, Japan, {terrillon, pilpre, niwa}@softopia.pref.gifu.jp
Kazuhiko Yamamoto, Faculty of Engineering, Gifu University, 1-1 Yanagido, Gifu-City, Gifu 501-1193, Japan, yamamoto@info.gifu-u.ac.jp
- Abstract
- A system for the detection of human faces and for the classification of
hand postures of the Japanese Sign Language in color images inside an
"intelligent" room is presented. We first propose to apply a combination of
a skin chrominance-based image segmentation with a color vector
gradient-based edge detection [1] [2] to efficiently detect faces and hands.
Within the framework of a general approach, a statistical model for face
detection based on invariant moments [3] [4] is used to discriminate between
faces and hands in the segmented images. A novel approach to hand posture
recognition based on phase-only correlation [5] is then adopted to classify
a subset of static hand postures of the Japanese Sign Language, each posture
representing a given phoneme, and also to discriminate between hand postures
and the image scene background. Experiments show that the additional use of
the color vector gradient significantly improves the correct rate of face
detection, and that the phase-only correlation filter yields a high rate of
discrimination between different static hand postures as well as between
hand postures and the scene background. Ultimately, the system is to
contribute to the implementation of meaningful human-machine interactions in
a room that we are in the process of establishing, the percept-room, mainly
for welfare applications.
s7.5
-
- Title
- Generation of Arm-gesture and Facial Expression for Intelligent Avatar
Communications in the Internet
- Authors
- Sang-Woon Kim, School of Computer Science, Carleton University, Canada
Young-Who Lee, Div. of Computer Science and Engineering, Myongji University, Korea
Yoshinao Aoki, Graduate School of Engineering, Hokkaido University, Japan
- Abstract
- Recently, sign-language communication systems between avatars of
different languages have been investigated as a means of overcoming the
linguistic barrier. These systems employ an intelligent communication
method in which sets of animation parameters, such as the joint
angles of a gesture, are transmitted instead of the entire real
motion pictures. However, the communication has been based on
gesture only, without considering facial expression. In this paper we
conduct an experiment on extracting the animation parameters of the facial
expression as well as the arm gesture, and generating them on various avatar
models. To extract the parameters to be transmitted, three kinds of
key-frame editors are designed using techniques of inverse kinematics and
partial differential equations. In generating facial expressions in
particular, the movements of the cheeks and the jaws as well as other
facial components are also implemented. The simulation results suggest
that the method could be a useful means for avatar communication between
different languages in Internet cyberspace.
-
s7.6
-
- Title
- Affordable 3D Face Tracking Using Projective Vision
- Authors
- D.O. Gorodnichy, S. Malik and G. Roth
Computational Video Group, National Research Council, Ottawa, Canada K1A
0R6
- Abstract
- For humans, viewing a scene with two eyes is clearly more
advantageous than viewing it with one eye. In computer vision, however,
most high-level vision tasks, an example of which is face tracking,
are still done with one camera only. This is due to the fact that,
unlike in human brains, the relationship between the images observed
by two arbitrary video cameras, in many cases, is not known. Recent
advances in projective vision theory however have produced the
methodology which allows one to compute this relationship. This
relationship is naturally obtained while observing the same scene with
both cameras and knowing this relationship not only makes it possible to
track features in 3D, but also makes tracking much more robust and
precise. In this paper, we establish a framework based on projective
vision for tracking faces in 3D using two arbitrary cameras, and
describe a stereo tracking system, which uses the proposed framework to
track faces in 3D with the aid of two USB cameras. While being very
affordable, our stereotracker exhibits pixel-size precision and is robust to
head rotation about all three axes.
-
s7.7
-
- Title
- Extraction of Hand Features for Recognition of Sign Language Words
- Authors
- Nobuhiko Tanibata, Dept. of Computer-controlled Mechanical Sys., Osaka University.
Nobutaka Shimada, Dept. of Computer-controlled Mechanical Sys., Osaka University.
Yoshiaki Shirai, Dept. of Computer-controlled Mechanical Sys., Osaka University.
- Abstract
- This paper proposes a method to obtain hand features from sequences of
images in which a person performs Japanese Sign Language (JSL) against a
complex background, and to recognize the JSL words.
In the first frame, we find the person's region, then search for the face
and hands in order to determine a range of skin color, and search for the
elbows to determine the position of the wrist.
In each frame, we track the face and hands by
using the determined skin color and track the elbows by matching a template
of the elbow shape. When the face and hands overlap, they are extracted by
matching texture templates of the previous face and hands. Hand features
such as the hand direction, the number of fingers, etc. are extracted from
the hand regions and the wrist.
In order to recognize JSL words, we
use the sequence of hand features as input to an HMM. We first select
the words which reach the final state of the HMM, and then choose the one
with the highest probability. We conducted an experiment with real images
of a professional JSL interpreter and recognized 65 JSL words
successfully.
-
s7.8
-
- Title
- Robust Corner Tracking for Real-time Augmented Reality
- Authors
- Shahzad Malik, Gerhard Roth, Chris McDonald, Computational Video Group, IIT, NRC, Ottawa, Canada K1A 0R6
- Abstract
- Vision-based registration techniques for augmented reality
systems have been the subject of intensive research recently
due to their potential to accurately align virtual objects
with the real world. The downfall of these vision-based
approaches, however, is their high computational cost and
lack of robustness.
This paper describes the implementation of a fast, but
accurate, vision-based corner tracker that forms the basis
of a pattern-based augmented reality system. The tracker
predicts corner positions by computing a homography
between known corner positions on a planar pattern and
potential planar regions in a video sequence. Local search
windows are then placed around these predicted locations
in order to find the actual subpixel corner positions.
Experimental results show the robustness of the corner
tracking system with respect to occlusion, scale,
orientation, and lighting.
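The prediction and search-window steps can be sketched as follows. This is only an illustration under stated assumptions: the homography `H` is taken as already estimated (its estimation is not shown), its values and the pattern corners are made up, and the function names `project` and `search_window` are hypothetical. The sub-pixel corner search inside each window is left out.

```python
def project(h, point):
    """Map a pattern point into the image with a 3x3 homography h,
    including the division by the homogeneous coordinate."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)


def search_window(center, half_size, width, height):
    """Square window (left, top, right, bottom) around a predicted corner,
    clipped to the image bounds."""
    cx, cy = int(round(center[0])), int(round(center[1]))
    left = max(0, cx - half_size)
    top = max(0, cy - half_size)
    right = min(width - 1, cx + half_size)
    bottom = min(height - 1, cy + half_size)
    return (left, top, right, bottom)


# Hypothetical homography with a perspective term in the last row.
H = [[2.0, 0.0, 100.0],
     [0.0, 2.0, 50.0],
     [0.1, 0.0, 1.0]]

# Known corner positions on the planar pattern (pattern units).
pattern_corners = [(0, 0), (10, 0), (10, 10), (0, 10)]

predicted = [project(H, c) for c in pattern_corners]
windows = [search_window(p, 5, 640, 480) for p in predicted]
```

Restricting the corner search to these small windows instead of the full frame is what keeps the per-frame cost low in a scheme of this kind.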
s8.1
-
- Title
- Algorithme Génétique et Critère de la Trace pour l'Optimisation du Vecteur Attribut : Application à la Classification Supervisée des Images de Textures
- Authors
- M. NASRI
M. EL HITMY
- Abstract
- Parameter selection is a very delicate procedure in classification. In
this article we present a new method based on a genetic approach that
optimizes the choice of parameters by minimizing a cost function. The cost
function is chosen according to the Trace criterion. This approach is
validated on texture images. The proposed algorithm converges quickly to
the optimal solution.
-
s8.2
-
- Title
- Geometrical and Topological Informations for Image Segmentation with
Monte Carlo Markov Chain Implementation
- Authors
- P. Bourdon, Laboratoire SIC - Université de Poitiers - FRANCE
O. Alata, Laboratoire SIC - Université de Poitiers - FRANCE
G. Damiand, Laboratoire SIC - Université de Poitiers - FRANCE
C. Olivier, Laboratoire SIC - Université de Poitiers - FRANCE
Y. Bertrand, Laboratoire SIC - Université de Poitiers - FRANCE
- Abstract
- The image segmentation methods based on Markovian assumption consist
in optimizing a Gibbs energy function which depends on the observation
field and the segmented field. This energy function can be represented
as a sum of potentials defined on cliques which are subsets of the grid
of sites. The Potts model is the most commonly used to represent the
segmented field. However, this model expresses only a potential on
the classes of nearest-neighbor pixels. In this paper, we propose
the integration of global information, such as the size of a region, into
the local potentials of the Gibbs energy. To extract this
information, we use a representation model well known in geometric
modeling: the topological map. Results on synthetic and natural
images are provided showing improvements in the obtained segmented
fields.
-
s8.3
-
- Title
- Anisotropic Diffusion by a Recursive Linear Convolving Method:
Application to Space-time Segmentation and Pattern Recognition.
- Authors
- Ph. D. Santiago Venegas Martinez (Student: Universite Paris V)
Ph. D. Juan Manuel Rendon Mancha (Student: Universite Paris V)
Ph. D. Georges Stamon (Professor: Universite Paris V)
- Abstract
- This paper presents a recursive linear convolving method to perform
anisotropic diffusion in images. The proposed method, based on a linear
filtering technique, gives a useful evolving interface for boundary
propagation. The novelty of the approach is that there is no need to
estimate local and global properties of the propagating boundary
beforehand, which makes the algorithm fast. These properties can
nevertheless be obtained directly from the evolving interface in our
proposed method. Since the proposed method, formulated in continuous
space, can be implemented efficiently and robustly in discrete space, we
propose practical applications such as space-time segmentation and pattern
recognition.
-
s8.4
-
- Title
- Histogram Characterization of Colored Textures Using One-Dimensional
Moments and Chromaticity Diagram
- Authors
- Jacques BROCHARD, Majdi KHOUDEIR
Laboratoire IRCOM-SIC, UMR-CNRS
6615 Bvd Marie et Pierre Curie, BP 30179 86962 Futuroscope
Chasseneuil Cedex, France
- Abstract
- We develop a new histogram characterization method for colored textures,
in order to recognize and classify them. This approach is
based jointly on the chromaticity diagram and one-dimensional
geometric moments. In the chromaticity diagram, we calculate the 1D
distributions of wavelengths and purity factors. The color signature of an
image is then formed from 1D geometric moments computed on these two
histograms and on the gray-level one. Texture classification on images from
the "marbleandgranite.com" database is performed and confirms the validity
of this approach.
-
s8.5
-
- Title
- Granulometry Using Transformation Based Techniques
- Authors
- Andrzej Zadorozny, Hong Zhang, Martin Jagersand
Affiliation:
Department of Computing Science University of Alberta Edmonton,
Alberta Canada T6G 2E1
- Abstract
- In this paper, we present our study on using transformation-based
techniques for performing granulometry analysis. We examine
specifically the problem of particle size analysis in oil sand
images. In contrast to conventional methods of size analysis, we avoid
the difficult step of image segmentation and derive the size
distribution of the particles in the images directly by the
transformation techniques of Fourier analysis and scale-space
decomposition. We have tested our techniques on both simulated
artificial data and real video images and demonstrated the feasibility
of the proposed approaches.
-
s8.6
-
- Title
- A New Fast and Robust Circle Extraction Algorithm
- Authors
- Euijin KIM, Miki HASEYAME, and Hideo KITAJIMA
School of Engineering
Hokkaido University N-13, W-8, Kita-ku, Sapporo 060-0812, Japan
- Abstract
- This paper presents a fast algorithm that is capable of extracting
circles from complicated and heavily corrupted images. The algorithm uses a
least-squares fitting algorithm for arc segments. The arcs are segmented by
using the short straight lines which are extracted by a fast line extraction
algorithm. The arc segments are used to yield accurate circle
parameters. Tests performed on synthetic and real-world images show that the
algorithm quickly and accurately extracts circles from complicated and
heavily corrupted images.
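The abstract does not say which least-squares formulation is used for the arc segments. As one possible illustration, the sketch below uses a simple algebraic fit that solves for D, E, F in x^2 + y^2 + D x + E y + F = 0 through the normal equations; this choice, the function names `solve3` and `fit_circle`, and the sample arc are all assumptions and not the authors' algorithm.

```python
import math


def solve3(m, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))) / a[i][i]
    return x


def fit_circle(points):
    """Algebraic least-squares circle fit; returns (cx, cy, radius)."""
    sxx = sxy = syy = sx = sy = sxz = syz = sz = 0.0
    for x, y in points:
        z = x * x + y * y
        sxx += x * x
        sxy += x * y
        syy += y * y
        sx += x
        sy += y
        sxz += x * z
        syz += y * z
        sz += z
    # Normal equations for minimising sum (z + D x + E y + F)^2.
    m = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(points))]]
    d, e, f = solve3(m, [-sxz, -syz, -sz])
    cx, cy = -d / 2.0, -e / 2.0
    return cx, cy, math.sqrt(cx * cx + cy * cy - f)


# Made-up arc segment: a quarter of a circle centred at (3, -2), radius 5.
arc_points = [(3.0 + 5.0 * math.cos(math.radians(a)),
               -2.0 + 5.0 * math.sin(math.radians(a)))
              for a in range(0, 91, 10)]
cx, cy, r = fit_circle(arc_points)
```

Because the fit reduces to one small linear system per arc, it needs no iteration, which suits a method that aims to be fast; an algebraic fit of this kind can be biased on short, noisy arcs, so it should be read as a sketch only.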
-
In.1
-
- Title
- Geometry and Statistics of Visual Space-Time
- Authors
- Cornelia Fermueller, Patrick Baker, Yiannis Aloimonos
Computer Vision Laboratory and Center for
Automation Research at the University of Maryland, College Park, MD
20742-3275, USA.
- Abstract
- Although the fundamental ideas underlying research efforts
in the field of computer vision have not radically
changed in the past two decades, there has been a transformation
in the way work in this field is conducted. This
is primarily due to the emergence of a number of tools, of
both a practical and a theoretical nature. One such tool,
celebrated throughout the nineties, is the geometry of visual
space-time. It is known under a variety of headings, such as
multiple view geometry, structure from motion, and model
building. It is a mathematical theory relating multiple
views (images) of a scene taken at different viewpoints to
three-dimensional models of the (possibly dynamic) scene.
This mathematical theory gave rise to algorithms that take
as input images (or video) and provide as output a model of
the scene. Such algorithms are one of the biggest successes
of the field and they have many applications in other disciplines,
such as graphics. One of the difficulties, however,
is that the current tools cannot yet be fully automated, and
they do not provide very accurate results. More research is
required for automation and high precision. During the past
few years we have investigated a number of basic questions
underlying the structure from motion problem. Our investigations
resulted in a small number of principles that characterize
the problem. These principles, which give rise to
automatic procedures and point to new avenues for studying
the next level of the structure from motion problem, are
the subject of this paper.
In.2
-
- Title
- TLA Based Face Tracking
- Authors
- Matthew Turk, Changbo Hu, Rogerio Feris,
Farshid Lashkari, Andy Beall
Computer Science Department ‡ Psychology Department
University of California, Santa Barbara, CA 93106
- Abstract
- Human face tracking (HFT) is one of several technologies
useful in vision-based interaction (VBI), which is one of
several technologies useful in the broader area of
perceptual user interfaces (PUI). In this paper we
motivate our interests in PUI and VBI, and describe our
recent efforts in various aspects of face tracking in the
Interaction Lab at UCSB. The HFT methods (GWN, EHT,
and CFD), in the context of VBI and PUI, are part of an
overall “TLA approach” to face tracking.
In.3
-
- Title
- OpenCV: Examples of Use and New Applications in Stereo, Recognition and Tracking
- Authors
- Gary R. Bradski, Ph.D., Manager, Vision, Graphics and Pattern Recognition Group,
Intel Labs, Intel. Email: gary.bradski@intel.com
- Abstract
- The Open Source Computer Vision Library (OpenCV) is an open-source computer
vision library, free for both research and commercial use. It was started by
Intel and is now an open community effort, with Intel contractors responsible
for code, documentation, and bug-fix integration as well as official builds.
OpenCV is written in C/C++ but will automatically load the optimized
assembly-language dynamic libraries included with the code. It now also has a
Matlab interface for convenient use with research code. See
http://www.intel.com/research/mrl/research/opencv/ for information on
how to get the code and/or join the user group.
In this talk, we give an overview of the library content, show usage examples
in C and Matlab, and then present some new work supporting stereo vision,
face recognition, face feature tracking, and audio-visual speech
recognition.
- Bio
- Gary graduated from U.C. Berkeley and after seven years of
traveling, working and consulting went back to graduate school to get a
Ph.D. from the Cognitive and Neural Systems Department at Boston University
(http://cns-web.bu.edu/) studying mathematical models of biological memory
and vision. Through a sequence of events he still cannot fathom, he ended
up as a quantitative analyst developing interest rate option pricing models
on the options trading floor of First Union National Bank. He happily
returned to vision and pattern recognition research a few years later in
Intel Labs where he initiated OpenCV among other efforts. He is currently
manager of the Vision, Graphics and Pattern Recognition group at Intel Labs.
He has a wife and two children, takes advantage of the great hiking around
Silicon Valley, loves volleyball when he's not injured, and runs to stay sane
(perhaps not quite fast enough).