13th Conference on Computer and Robot Vision

Victoria, British Columbia. June 1-3, 2016.

Welcome to the home page for CRV 2016, which will be held at the University of Victoria in Victoria, British Columbia.

CRV is an annual conference hosted in Canada and co-located with Graphics Interface (GI) and Artificial Intelligence (AI). A single registration covers attendance at all three conferences. Please see the AI/GI/CRV general conference website for more information.


Congratulations to the 2016 award winners.
    • CIPPRS Research Excellence Award: Allan Jepson

      Professor Allan Jepson is recognized for his fundamental contributions to computer vision research. A Professor of Computer Science at the University of Toronto, Jepson has published a broad range of influential papers on optical flow estimation, image segmentation, visual tracking, perceptual grouping, and structure-from-motion. His accomplishments include the first application of phase-based analysis to optical flow and binocular stereopsis. His contributions to visual tracking include landmark papers on Eigen-Tracking and online adaptive appearance models for visual tracking. Prof. Jepson was an Associate of the Canadian Institute for Advanced Research from 1986 to 1989 and from 2004 to 2009. Over his career he has supervised an impressive number of graduate students and post-doctoral fellows, many of whom are now leading computer vision researchers in their own right.

    • CIPPRS Service Excellence Award: Philippe Giguere

      Prof. Philippe Giguère of Université Laval is recognized for his many contributions to the organization of the Computer and Robot Vision Conference, in 2011 in St. John's, Newfoundland, and in 2012 in Toronto, Ontario. Prof. Giguère joined Laval in 2010 and has continued to be an active member of the robotic vision community.

    • Best Robotic Vision Paper Award: Light at the End of the Tunnel: High-Speed, Lidar-Based Train Localization in Challenging Underground Environments
      Tyler Daoust, Francois Pomerleau and Timothy Barfoot
    • Best Computer Vision Paper Award: Learning Neural Networks with Ranking-based Losses for Action Retrieval
      Md Atiqur Rahman and Yang Wang
    • CIPPRS 2015 Doctoral Dissertation Award: Prior Knowledge for Targeted Object Segmentation in Medical Images
      Masoud S. Nosrati, Simon Fraser University
    • CIPPRS 2015 Doctoral Dissertation Award (Honorable Mention): The Geometry of Cardiac Myofibers
      Emmanuel Piuze, McGill University


    • May 31, 2016: CRV 2016 proceedings are available at http://conferences.computer.org/crv/2016.
      The username and password needed to access this site are provided in your conference package.
    • Apr 29, 2016: CRV 2016 program is now available on the web. Please check the program tab.
    • Apr 28, 2016: Please check the submission tab for instructions about oral and poster presentations.
    • Apr 6, 2016: Paper authors can request a visa invitation letter to attend CRV 2016 via CMT.
    • Apr 6, 2016: Camera ready preparation instructions have been emailed to the authors of accepted papers.
    • Apr 1, 2016: Paper decisions are now available via CMT. The camera ready copies are due by 10 April 2016, midnight.
    • Feb 19, 2016: Submission deadline has been extended to midnight, Wednesday, March 2nd, 2016. We look forward to receiving your contributions.
    • Feb 1, 2016: The paper submission system is now ready.
    • August 27, 2015: Call for papers is now available.
    • August 27, 2015: Keynote speakers announced: Prof. Jim Little, UBC, and Prof. Laurent Itti, USC, will be giving the keynotes at CRV 2016. For more information, please check out the Keynotes tab.
    • July 7, 2015: The CRV 2016 conference dates have been announced.

Important Dates:

Paper submission: February 21, 2016, 11:59 PM PST (extended to March 2, 2016, 11:59 PM PST)
Acceptance/rejection notification: March 31, 2016
Revised camera-ready papers due: April 10, 2016
Early registration deadline: April 19, 2016
Conference: June 1-3, 2016

Conference History

In 2004, the 17th Vision Interface conference was renamed the 1st Canadian Conference on Computer and Robot Vision. In 2011, the name was shortened to Conference on Computer and Robot Vision.

CRV is sponsored by the Canadian Image Processing and Pattern Recognition Society (CIPPRS).


CRV 2016 Program

The higher-level view of the joint conference program, which also includes the AI and GI meetings, is available here.


DAY 1: Wednesday, June 1, 2016

8:30-9:00 AM -- Joint Conference Welcome

9:00-9:50 AM -- Symposium: Learning

  • Richard Zemel, Univ. of Toronto
    "Learning to Generate Images and Their Descriptions"
  • David Meger, McGill Univ.
    "Learning and Adapting Robotic Behaviours"

9:50-10:35 AM -- Oral Presentations: Learning

  • Learning Neural Networks with Ranking-based Losses for Action Retrieval
    Md Atiqur Rahman and Yang Wang
  • Generalizing Generative Models: Application to Image Super-resolution
    Yanyan Mu, Roussos Dimitrakopou and Frank Ferrie
  • Dense Image Labeling Using Deep Convolutional Neural Networks
    Md Amirul Islam, Neil D B Bruce and Yang Wang

10:35-11:00 AM -- Break

11:00-12:00 PM -- KEYNOTE

Jim Little, University of British Columbia
Title: Pose and Action in a Sports Context

12:00-12:30 PM -- Oral Presentations: Keynote

  • Real-Time Human Motion Capture with Multiple Depth Cameras
    Alireza Shafaei and Jim Little
  • Robust Non-Saliency Guided Watermarking
    Ahmed Gawish, Paul Fieguth, Christian Scharfenberger, Alexander Wong, David Clausi and Hongbo Bi

12:30-2:00 PM -- Lunch

2:00-2:50 PM -- Symposium: Robotic Vision

  • Jonathan Kelly, Univ. of Toronto
    "Are All Features Created Equal? Machine Learning for Predictive Noise Modelling"
  • Hong Zhang, Univ. of Alberta
    "Consensus Constraint - A Method for Pruning Outliers in Keypoint Matching"

2:50-3:35 PM -- Oral Presentations: Robotic Vision

  • Sensor Planning for 3D Visual Search with Task Constraint
    Amir Rasouli and John K. Tsotsos
  • Obstacle Detection for Image-Guided Surface Water Navigation
    Tanmana Sadhu, Alexandra Branzan Albu, Maia Hoeberechts, Eduard Wisernig and Brian Wyvill
  • Indoor Place Recognition System for Localization of Mobile Robots
    Raghavender Sahdev and John K. Tsotsos

3:35-4:00 PM -- Break

4:00-4:50 PM -- Symposium: 3D Vision

  • Michael Greenspan, Queen's Univ. (CANCELLED)
    "Two Approaches to Object Recognition in 3D Data, Without Descriptors"
  • Martin Jagersand, Univ. of Alberta
    "Local 3D vision using several different multiview geometry constraints"

4:50-5:35 PM -- Oral Presentations: 3D Vision

  • 3D Texture Synthesis for Realistic Architectural Modeling
    Felix Labrie-Larrivee, Denis Laurendeau and Jean-Francois Lalonde
  • Dense and Occlusion-Robust Multi-View Stereo for Unstructured Videos
    Jian Wei, Benjamin Resch and Hendrik Lensch
  • Aligning 3D local data of leapfrog locations along elongated structures
    Somayeh Hesabi and Denis Laurendeau

6:00-9:00 PM -- RECEPTION

DAY 2: Thursday, June 2, 2016

9:00-9:50 AM -- Symposium: Robotic Perception

  • Michael Jenkin, York Univ.
    "Sensing for autonomous systems"
  • Alan Mackworth, Univ. of British Columbia
    "Robot Perception for Shared Control of Powered Wheelchairs"

9:50-10:35 AM -- Oral Presentations: Robotic Perception

  • Modular Decomposition and Analysis of Registration based Trackers
    Abhineet Singh, Ankush Roy, Xi Zhang and Martin Jagersand
  • Light at the End of the Tunnel: High-Speed, Lidar-Based Train Localization in Challenging Underground Environments
    Tyler Daoust, Francois Pomerleau and Timothy Barfoot
  • Subsea fauna enumeration using vision-based marine robots
    Karim Koreitem, Yogesh Girdhar, Gregory Dudek, Jesus Pineda, Hanumant Singh and Walter Cho

10:35-11:00 AM -- Break

11:00-11:50 AM -- Symposium: New Applications

  • Alexandra Albu, Univ. of Victoria
    "Computer Vision for Environmental Monitoring"
  • James J. Clark, McGill Univ.
    "Color Sensing and Display at Low Light Levels"

11:50 AM-12:35 PM -- Oral Presentations: New Applications

  • Metric Feature Indexing for Interactive Multimedia Search
    Ben Miller and Scott McCloskey
  • Video Object Segmentation for Content-Aware Video Compression
    Lu Sun and Jochen Lang
  • Registration of Modern and Historic Imagery for Timescape Creation
    Heider Ali and Anthony Whitehead

12:35-2:00 PM -- Lunch

2:00-3:00 PM -- KEYNOTE

Laurent Itti, University of Southern California
Title: Computational modeling of visual attention and object recognition in complex environments

3:00-3:30 PM -- Oral Presentations: Keynote

  • Automating Node Pruning for LiDAR-based Topometric Maps in the context of Teach-and-Repeat
    David Landry and Philippe Giguere
  • Hierarchical Grouping Approach for Fast Approximate RGB-D Scene Flow
    Francis Li, Alexander Wong and John Zelek

3:30-4:00 PM -- Break

4:00-5:35 PM -- POSTER SESSION (the list of posters is available below)

5:35-6:00 PM -- CIPPRS AGM (MAC A144)

6:00-8:00 PM -- Banquet (University Club)

DAY 3: Friday, June 3, 2016

9:00-9:50 AM -- Symposium: Grasping

  • Bryan Tripp, Univ. of Waterloo
    "Learning to grasp"
  • Philippe Giguère, Université Laval
    "Grasping: Towards Data-Driven and Representation-Learning Approaches"

9:50-10:35 AM -- Oral Presentations: Grasping

  • Hand-Object Interaction and Precise Localization in Transitive Action Recognition
    Amir Rosenfeld and Shimon Ullman
  • Streaming Monte Carlo Pose Estimation for Autonomous Object Modeling
    Christian Rink and Simon Kriegel
  • Practical Considerations of Uncalibrated Visual Servoing
    Oscar Ramirez and Martin Jagersand

10:35-11:00 AM -- Break

11:00-12:00 PM -- PhD Dissertation Award Winners

  • Prior Knowledge for Targeted Object Segmentation in Medical Images
    Masoud S. Nosrati, Simon Fraser University
    CIPPRS 2015 Doctoral Dissertation Winner
  • The Geometry of Cardiac Myofibers
    Emmanuel Piuze, McGill University
    CIPPRS 2015 Doctoral Dissertation Honourable Mention

12:00-12:30 PM -- Oral Presentations: PhD Awards

  • Performance Assessment of Predictive Lane Boundary Detection for Non-uniformly Illuminated Roadway Driving Assistance
    Avishek Parajuli, Mehmet Celenk and Bryan Riley
  • Data-driven Probabilistic Occlusion Mask to Promote Visual Tracking
    Kourosh Meshgi, Shin-ichi Maeda, Shigeyuki Oba and Shin Ishii

12:30-2:00 PM -- Lunch

2:00-2:50 PM -- Symposium: Medical Imaging

  • Tal Arbel, McGill Univ.
    "Iterative Hierarchical Probabilistic Graphical Model for the Detection and Segmentation of Multiple Sclerosis Lesions in Brain MRI"
  • Mehran Ebrahimi, Univ. of Ontario Inst. of Tech.
    "Inverse Problems in Medical Image Processing"

2:50-3:35 PM -- Oral Presentations: Medical Imaging

  • A Comparative Study of Sparseness Measures for Segmenting Textures
    Melissa Cote and Alexandra Branzan Albu
  • Image Restoration via Deep-Structured Stochastically Fully-Connected Conditional Random Fields (DSFCRFs) for Very Low-Light Conditions
    Audrey Chung, Mohammad Javad Shafiee and Alexander Wong
  • Time-Frequency Domain Analysis via Pulselets for Non-Contact Heart Rate Estimation from Remotely Acquired Photoplethysmograms
    Brendan Chwyl, Audrey Chung, Robert Amelard, Jason Deglint, David Clausi and Alexander Wong

3:35-4:00 PM -- Break

4:00-4:50 PM -- Symposium: Human Robot Interaction

  • Richard Vaughan, Simon Fraser Univ.
    "Multi-Human, Multi-Robot Interaction at Various Scales"
  • Elizabeth Croft, Univ. of British Columbia
    "Up close and personal with human-robot collaboration"

4:50-5:35 PM -- Oral Presentations: Human Robot Interaction

  • Integration of Uncertainty in the Analysis of Dyadic Human Activities
    Maryam Ziaeefard and Robert Bergevin
  • Tiny People Finder: Long-Range Outdoor HRI By Periodicity Detection
    Jake Bruce, Valiallah (Mani) Monajjemi, Jens Wawerla and Richard Vaughan

List of Posters (Thu. June 2, 4:00-5:35 PM)

Improving Random Forests by correlation-enhancing projections and sample-based sparse discriminant selection
Marcus Wallenberg and Per-Erik Forssén

Video-rate Panorama for Free-viewpoint TV
Pierre Boulanger and Usman Aziz

Towards real-time detection, tracking, and classification of natural video using biological motion
Laura Ray and Tianshun Miao

A Phase-Entrained Particle Filter for Audio-Locomotion Synchronization
Peyman Manikashani and Jeffrey E. Boyd

Recognizing people and their activities in surveillance video: technology state of readiness and roadmap
Dmitry Gorodnichy and David Bissessar

Mobile-target Tracking via Highly-maneuverable VTOL UAVs with EO Vision
Alex Ramirez-Serrano

Performance Evaluation of Bottom-Up Saliency Models for Object Proposal Generation
Anton Knaub, Vikram Narayan and Markus Adameck

An Automated Mitosis Recognition Technique for Breast Cancer Biopsy using Statistical Color, and Shape Based Features
Tahir Mahmood, Sheikh Ziauddin and Ahmad R. Shahid

What is a Good Model for Depth from Defocus?
Fahim Mannan and Michael Langer

Blur Calibration for Depth from Defocus
Fahim Mannan and Michael Langer

Generation of Spatial-temporal Panoramas with a Single Moving Camera
Xin Gao, Bingqing Yu, Liqiang Ding and James Clark

TIGGER: A Texture-Illumination Guided Global Energy Response Model for Illumination Robust Object Saliency
Sara Greenberg, Audrey Chung, Brendan Chwyl and Alexander Wong

Computer Vision-Based Detection of Violent Individual Actions Witnessed by Crowds
Kawthar Moria, Alexandra Branzan Albu and Kui Wu

Keypoint Recognition with Normalized Color Histograms
Clark Olson and Siqi Zhang

A descriptor and voting scheme for fast 3D self-localization in man-made environments
Marvin Gordon, Marcus Hebel and Michael Arens

Fusing Iris, Palmprint and Fingerprint in a Multi-Biometric Recognition System
Habibeh Naderi, Behrouz Haji Soleimani, Stan Matwin, Babak Nadjar Araabi and Hamid Soltanian-Zadeh

Human Action Recognition by Fusing the Outputs of Individual Classifiers
Eman Mohammadi Nejad, Q.M. Jonathan Wu and Mehrdad Saif

Crack detection in as-cast steel using laser triangulation and machine learning
Joshua Veitch-Michaelis, Yu Tao, Benjamin Crutchley, Jonathan Storey, David Walton and Jan-Peter Muller

Design of a Saccading and Accommodating Robot Vision System
Scott Huber, Ben Selby and Bryan Tripp

Change Point Estimation of the Hippocampal Volumes in Alzheimer's Disease
Xiaoying Tang, Marilyn Albert, Michael Miller and Laurent Younes

Development of a New Dactylology and Writing Support System Especially for Blinds
Gourav Modanwal

Evaluation of Shape Description Metrics applied to Human Silhouette Tracking
Olivier St-Martin Cormier and Frank Ferrie

smarttalk: A Learning-based Framework for Natural Human-Robot Interaction
Junaed Sattar and Cameron Fabbri

KinectScenes: Robust Real-time RGB-D Fusion for Animal Behavioural Monitoring
Jeyavithushan Jeyaloganathan and John Zelek

Synthetic Viewpoint Prediction
Andy Hess, Nilanjan Ray and Hong Zhang

Urban safety prediction using context and object information via double-column convolutional neural network
Hyeon-Woo Kang and Hang-Bong Kang

Depth Image Superpixels for Foliage Modeling
Daniel Morris, Saif Imran, Jin Chen and David Kramer

Reference Image based Color Correction for Multi-Camera Panoramic High Resolution Video
Radhakrishna Dasari, Dongqing Zhang and Changwen Chen

Multi-Player Detection with articulated Mixtures-of-Parts Representation Constrained by Global Appearance
Yang Fang, Long Wei Shi and Sik Jo

Road Segmentation in Street View Images Using Texture Information
David Abou Chacra and John Zelek

Point Pair Feature based Object Detection for Random Bin Picking
Wim Abbeloos and Toon Goedemé

Anisotropic interpolation of sparse images
Amir Abbas Haji Abolhassani, Roussos Dimitrakopou and Frank Ferrie

Efficient Terrain Driven Coral Coverage Using Gaussian Processes for Mosaic Synthesis
Sandeep Manjanna, Nikhil Kakodkar, Malika Meghjani and Gregory Dudek

Texture-Aware SLAM Using Stereo Imagery And Inertial Information
Travis Manderson, Florian Shkurti and Gregory Dudek

Submission Instructions

Presentations and posters

  • Each oral presentation is allotted 15 minutes: 12 minutes for the talk plus 3 minutes for questions.
  • Each symposium talk is allotted 25 minutes: 21 minutes for the talk plus 4 minutes for discussion.
  • The dimensions of a poster should not exceed 4 feet by 4 feet.

Camera Ready Preparation

Apr 6, 2016: Camera ready preparation instructions have been emailed to the authors of accepted papers.

Please prepare camera ready copies using the templates provided below. The maximum page limit is 8 pages (including references).

Paper Submission - Done!

The paper submission deadline for CRV 2016 has been extended to Wednesday, March 2, 2016, 11:59 PM PST.

Please refer to the Call For Papers for information on the goals and scope of CRV.

Please submit your papers via the Microsoft CMT system: https://cmt.research.microsoft.com/CRV2016/Default.aspx.

The CRV review process is single-blind: authors are not required to anonymize submissions. Submissions must be between 4 and 8 pages (two-column) long. Submissions shorter than 6 pages will most likely be considered for poster sessions only. Use the following templates to prepare your CRV submissions:

CRV 2016 Co-Chairs

  • Faisal Qureshi, University of Ontario Institute of Technology
  • Steven L. Waslander, University of Waterloo

CRV 2016 Program Committee

  • Mohand Said Allili, Université du Québec en Outaouais, Canada
  • Robert Allison, York University, Canada
  • Alexander Andreopoulos, IBM Research, Canada
  • John Barron, University of Western Ontario, Canada
  • Steven Beauchemin, University of Western Ontario, Canada
  • Robert Bergevin, Université Laval, Canada
  • Guillaume-Alexandre Bilodeau, École Polytechnique Montréal, Canada
  • Pierre Boulanger, University of Alberta, Canada
  • Jeffrey Boyd, University of Calgary, Canada
  • Marcus Brubaker, University of Toronto, Canada
  • Gustavo Carneiro, University of Adelaide, Australia
  • James Clark, McGill University, Canada
  • David Clausi, University of Waterloo, Canada
  • Jack Collier, DRDC Suffield, Canada
  • Kosta Derpanis, Ryerson University, Canada
  • James Elder, York University, Canada
  • Mark Eramian, University of Saskatchewan, Canada
  • Paul Fieguth, University of Waterloo, Canada
  • Brian Funt, Simon Fraser University, Canada
  • Philippe Giguère, Laval University, Canada
  • Yogesh Girdhar, Woods Hole Oceanographic Institute, USA
  • Minglun Gong, Memorial University of Newfoundland, Canada
  • Michael Greenspan, Queen's University, Canada
  • Jessy Hoey, University of Waterloo, Canada
  • Andrew Hogue, University of Ontario Institute of Technology, Canada
  • Randy Hoover, South Dakota School of Mines and Technology, USA
  • Martin Jagersand, University of Alberta, Canada
  • Michael Jenkin, York University, Canada
  • Hao Jiang, Boston College, USA
  • Pierre-Marc Jodoin, Université de Sherbrooke, Canada
  • Jonathan Kelly, University of Toronto, Canada
  • Dana Kulic, University of Waterloo, Canada
  • Robert Laganière, University of Ottawa, Canada
  • Jean-Francois Lalonde, Laval University, Canada
  • Tian Lan, Stanford University, USA
  • Jochen Lang, University of Ottawa, Canada
  • Michael Langer, McGill University, Canada
  • Cathy Laporte, ETS Montreal, Canada
  • Denis Laurendeau, Laval University, Canada
  • Howard Li, University of New Brunswick, Canada
  • Jim Little, University of British Columbia, Canada
  • Shahzad Malik, University of Toronto, Canada
  • Scott McCloskey, Honeywell Labs, USA
  • David Meger, McGill University, Canada
  • Jean Meunier, Universite de Montreal, Canada
  • Max Mignotte, Universite de Montreal, Canada
  • Gregor Miller, University of British Columbia, Canada
  • Greg Mori, Simon Fraser University, Canada
  • Christopher Pal, École Polytechnique Montréal, Canada
  • Pierre Payeur, University of Ottawa, Canada
  • Cédric Pradalier, Georgia Tech. Lorraine, France
  • Yiannis Rekleitis, University of South Carolina, USA
  • Sébastien Roy, Université de Montréal, Canada
  • Junaed Sattar, Clarkson University, USA
  • Christian Scharfenberger, University of Waterloo, Canada
  • Hicham Sekkati, University of Waterloo, Canada
  • Kaleem Siddiqi, McGill University, Canada
  • Gunho Sohn, York University, Canada
  • Minas Spetsakis, York University, Canada
  • Uwe Stilla, Technische Universitaet Muenchen, Germany
  • Graham Taylor, University of Guelph, Canada
  • John Tsotsos, York University, Canada
  • Olga Veksler, University of Western Ontario, Canada
  • Ruisheng Wang, University of Calgary, Canada
  • Yang Wang, University of Manitoba, Canada
  • Alexander Wong, University of Waterloo, Canada
  • Robert Woodham, University of British Columbia, Canada
  • Yijun Xiao, University of Edinburgh, United Kingdom
  • Alper Yilmaz, Ohio State University, USA
  • John Zelek, University of Waterloo, Canada
  • Hong Zhang, University of Alberta, Canada

CIPPRS Executive

  • President: Gregory Dudek, McGill University
  • Treasurer: John Barron, Western University
  • Secretary: Jim Little, University of British Columbia

Keynote Speakers

We will have two Keynote speakers in 2016:

Laurent Itti, University of Southern California

Laurent Itti

Title: Computational modeling of visual attention and object recognition in complex environments


Visual attention and eye movements in primates have been widely shown to be guided by a combination of stimulus-dependent or 'bottom-up' cues, as well as task-dependent or 'top-down' cues. Both the bottom-up and top-down aspects of attention and eye movements have been modeled computationally. Yet, it is not until recent work, which I will describe, that bottom-up models have been strictly put to the test, predicting significantly above chance the eye movement patterns, functional neuroimaging activation patterns, or most recently neural activity in the superior colliculus of monkeys inspecting complex dynamic scenes. In recent developments, models that increasingly attempt to capture top-down aspects have been proposed. In one system which I will describe, neuromorphic algorithms of bottom-up visual attention are employed to predict, in a task-independent manner, which elements in a video scene might more strongly attract attention and gaze. These bottom-up predictions have more recently been combined with top-down predictions, which allowed the system to learn from examples (recorded eye movements and actions of humans engaged in 3D video games, including flight combat, driving, first-person, or running a hot-dog stand that serves hungry customers) how to prioritize particular locations of interest given the task. Pushing deeper into real-time, joint online analysis of video and eye movements using neuromorphic models, we have recently been able to predict future gaze locations and intentions of future actions when a player is engaged in a task. Finally, employing deep neural networks, we show how neuroscience-inspired algorithms can also achieve state-of-the-art results in the domain of object recognition, especially over a new dataset collected in our lab and comprising ~22M images of small objects filmed on a turntable, with available pose information that can be used to enhance training of the object recognition model.


Dr. Laurent Itti is a Full Professor of Computer Science, Psychology, and Neuroscience at the University of Southern California. Dr. Itti's research interests are in biologically-inspired computational vision, in particular in the domains of visual attention, scene understanding, control of eye movements, and surprise. This basic research has technological applications to, among others, video compression, target detection, and robotics. His work on visual attention and saliency has been widely adopted, with an explosion of applications not only in neuroscience and psychology, but also in machine vision, surveillance, defense, transportation, medical diagnosis, design and advertising. Itti has co-authored over 150 publications in peer-reviewed journals, books and conferences, three patents, and several open-source neuromorphic vision software toolkits.

Jim Little, University of British Columbia

Jim Little

Title: Pose and Action in a Sports Context


Understanding human action is critical to human collaboration with robots, but it is also increasingly important for surveillance, monitoring, and situation understanding. To make understanding more tractable, we study sports broadcasts where the types of actions are limited and depend on roles, locations, and situations. To understand the game we need at least to track the players, identify them, determine their actions, and connect their actions into the situation of the game. I will recap our work on processing visual and textual information in hockey and basketball broadcasts. Typically we learn the connection between sequences of images and events in the game. But acquiring ground truth regarding the state of the players and game events is challenging in itself. To find ground truth we make use of motion capture data with associated image sequences. In order to understand actions we explore the link between the pose of the players and their actions, using static and dynamic information. We show how one can learn the connection between the motion of players in image sequences and their 3D pose by using a representation that is reliably computed from image sequences and can be synthesized from the motion capture (mocap) data. This allows us to retrieve 3D pose sequences from the database. Moreover, we can do this reliably despite variations in speed of movement in the images and the mocap database. Further, one can use the same mocap database to synthesize motion descriptors from mocap across varying viewpoints. This permits us to recognize motions from previously unseen viewpoints.


Dr. Little is a Full Professor of Computer Science at the University of British Columbia. He investigates both mobile robotics and computer vision, and in both areas works on understanding human activity. Mobile robots must recognize the activities of the humans they help. He has pioneered work on recognizing persons by their gait, based on characteristics of timing and location of motion (over 500 citations), and his work on tracking people using boosted detections and particle filters has over 1000 citations. His continuing interests centre on motion, actions, activities and situations, specifically in sports videos. His recent work has addressed recognition of pose over time, identification of players in sports videos, and use of motion capture data for providing models for action recognition from unseen views.


CRV 2016 will feature eight exciting symposia on subtopics related to computer and robot vision.

Robotic Perception

  • Michael Jenkin, York Univ.

    "Sensing for autonomous systems"

    Entities that move must somehow transduce the vast range of sensor data available to them in order to build representations of self and place that are sufficient to enable the entity to perform complex tasks in their environment. What can machine algorithms learn from biological solutions to these problems? This talk will review some recent results in multi-cue integration in the biological domain and explore how biological solutions to the problems of determining self-orientation and ego motion estimation from multiple sensors can be applied to the development of algorithms for mobile robots.

  • Alan Mackworth, Univ. of British Columbia

    "Robot Perception for Shared Control of Powered Wheelchairs"

    Given our aging population there is a great need to develop appropriate technology to aid older adults. We know that quality of life depends on mobility. However, older adults often lack the strength required for manual wheelchair use, indicating a requirement for powered wheelchairs. But mobility impairments in older adults are often accompanied by co-morbidities such as dementia. The need for a significant level of cognitive capacity to safely operate current powered wheelchair technology tends to exclude those users with cognitive impairments. Given improvements in sensor systems, with lower cost, better accuracy, lower power and smaller size, increases in computing power and enhancements in robotic perception, it is now possible to design safer semi-autonomous powered wheelchairs with shared user control. In this talk three experiments in designing and implementing such systems, and testing them in realistic settings, will be described. This talk describes work carried out jointly with Pooja Viswanathan, Bikram Adhikari, Ian Mitchell and James Little.

Robotic Vision

  • Jonathan Kelly, Univ. of Toronto

    "Are All Features Created Equal? Machine Learning for Predictive Noise Modelling"

    Many computer vision and robotics algorithms make strong assumptions about uncertainty, for example, that all measurements are equally informative, and rely on the validity of these assumptions to produce accurate and consistent state estimates. However, in practice, certain environments or robot actions may influence sensor performance in predictable ways that are difficult or impossible to capture by manually tuning uncertainty parameters. In this talk, I will describe a fast nonparametric Bayesian inference method to more accurately model sensor uncertainty. The method intelligently weights observations based on a predicted measure of informativeness, using a learned model of information content from training data collected under relevant operating conditions. Learning can be carried out both with and without knowledge of the motion used to generate the sensor measurements (i.e., ground truth). I will present example results from a series of stereo visual odometry experiments, which demonstrate significant improvements in localization accuracy when compared to standard noise modelling techniques.
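The core idea above, weighting measurements by a predicted measure of informativeness rather than treating them uniformly, can be illustrated in a few lines. The sketch below is purely illustrative, not the authors' actual nonparametric Bayesian method; it assumes the per-measurement weights come from some learned model of sensor reliability.

```python
import numpy as np

def weighted_estimate(measurements, predicted_information):
    """Fuse scalar measurements of the same quantity, weighting each by a
    predicted information value (e.g. produced by a learned model of
    sensor reliability). With equal weights this reduces to the ordinary
    mean; down-weighting a measurement predicted to be uninformative
    keeps it from corrupting the estimate."""
    y = np.asarray(measurements, dtype=float)
    w = np.asarray(predicted_information, dtype=float)
    return float(np.sum(w * y) / np.sum(w))
```

For example, fusing measurements [1.0, 1.0, 5.0] with a near-zero weight on the third yields an estimate close to 1.0, whereas equal weights give the plain average and let the unreliable measurement pull the result away.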

  • Hong Zhang, Univ. of Alberta

    "Consensus Constraint - A Method for Pruning Outliers in Keypoint Matching"

    In this talk, I will describe a simple yet effective outlier pruning method for keypoint matching that is able to perform well under significant illumination changes, for applications including visual loop closure detection in robotics. We contend and verify experimentally that a major difficulty in matching keypoints when illumination varies significantly between two images is the low inlier ratio among the putative matches. The low inlier ratio in turn causes failure in the subsequent RANSAC algorithm since the correct camera motion has as much support as many of the incorrect ones. By assuming a weak perspective camera model and planar camera motion, we derive a simple constraint on correctly matched keypoints in terms of the flow vectors between the two images. We then use this constraint to prune the putative matches to boost the inlier ratio significantly thereby giving the subsequent RANSAC algorithm a chance to succeed. We validate our proposed method on multiple datasets, to show convincingly that it can deal with illumination change effectively in many computer vision and robotics applications where our assumptions hold true, with a superior performance to state-of-the-art keypoint matching algorithms.
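As a rough illustration of this style of pruning (not the authors' actual derived constraint), one can keep only the putative matches whose flow vector agrees with the consensus flow in direction and magnitude, and hand only the survivors to RANSAC. The function name, thresholds, and use of the median as the consensus are all assumptions of this sketch.

```python
import numpy as np

def prune_by_flow_consensus(pts1, pts2, angle_tol=0.3, mag_tol=0.5):
    """Keep putative matches whose flow vector (pts2 - pts1) agrees with
    the consensus (median) flow in both direction and relative magnitude.
    pts1, pts2: (N, 2) arrays of matched keypoint coordinates.
    Returns a boolean mask over the N putative matches."""
    flow = pts2 - pts1                       # flow vector of each match
    angles = np.arctan2(flow[:, 1], flow[:, 0])
    mags = np.linalg.norm(flow, axis=1)
    med_angle = np.median(angles)
    med_mag = np.median(mags)
    # Wrap angular differences into [-pi, pi] before thresholding.
    d_angle = np.abs((angles - med_angle + np.pi) % (2 * np.pi) - np.pi)
    keep = (d_angle < angle_tol) & \
           (np.abs(mags - med_mag) < mag_tol * max(med_mag, 1e-9))
    return keep
```

The point of such a pre-filter is exactly the one made in the abstract: by discarding grossly inconsistent flow vectors first, the inlier ratio seen by the subsequent RANSAC stage rises, giving the correct motion hypothesis a chance to dominate.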


  • Bryan Tripp, Univ. of Waterloo

    "Learning to grasp"

    An important issue for autonomous robots in homes and other unstructured environments is visually guided grasping of objects with complex shapes. This is a difficult problem, which involves decisions about which part of the object to grasp and how to configure the gripper relative to that point. This talk will focus on biologically-inspired approaches. We will review the neural systems that control grasping in primates, and discuss how to structure analogous deep convolutional networks. Training such networks requires large labelled datasets, and we will present a large database of simulated grasps that uses mesh warping to grow indefinitely, as well as challenges in generalization from simulations to physical robots.

  • Philippe Giguere, Univ. of Laval

    "Grasping: Towards Data-Driven and Representation-Learning Approaches"

    Despite over a decade of work, grasping remains a very challenging problem. Early attempts were based on simulators and models such as GraspIt!, where the goal was to identify so-called grasping points. However, these did not take into account the geometry of simple end effectors such as parallel-plate grippers. To address this, the grasping rectangle was introduced by Jiang et al. in 2011. This representation has two main advantages: first, a parallel-gripper configuration (orientation, opening) can be derived directly from such a rectangle. More importantly, it reframed grasping as the more traditional vision problem of finding bounding boxes in images. We will present several solutions that have been proposed for this new paradigm, with a particular emphasis on learned-feature approaches such as deep neural networks and convolutional neural networks. Finally, we will present our own solution based on sparse coding, which showed competitive results against these other state-of-the-art approaches.
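    The appeal of the grasping rectangle is that the gripper configuration falls out of its geometry. The toy sketch below assumes an illustrative corner-ordering convention (the first edge spans the gripper plates); it is not Jiang et al.'s exact parameterization.

    ```python
    import math

    def gripper_config(rect):
        """Derive a parallel-plate gripper configuration from a grasping
        rectangle given as four corners (c0, c1, c2, c3).

        Convention assumed here: edge c0 -> c1 spans the two plates, so its
        length is the gripper opening and its angle is the orientation."""
        c0, c1, c2, c3 = rect
        cx = sum(p[0] for p in rect) / 4.0        # grasp center
        cy = sum(p[1] for p in rect) / 4.0
        opening = math.dist(c0, c1)               # distance between the plates
        theta = math.atan2(c1[1] - c0[1], c1[0] - c0[0])  # plate-spanning angle
        return (cx, cy), theta, opening

    # Axis-aligned rectangle: grasp center (5, 2), opening 10 along x.
    center, theta, opening = gripper_config([(0, 0), (10, 0), (10, 4), (0, 4)])
    print(center, theta, opening)
    ```

    Detecting such rectangles in an image is then an ordinary bounding-box problem, which is exactly the reframing the abstract highlights.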


  • Richard Zemel, Univ. of Toronto

    "Learning to Generate Images and Their Descriptions"

    Recent advances in computer vision, natural language processing, and related areas have led to a renewed interest in artificial intelligence applications spanning multiple domains. In particular, the generation of natural, human-like captions for images has seen an extraordinary increase in interest. I will describe approaches that combine state-of-the-art computer vision techniques and language models to produce descriptions of visual content of surprisingly high quality. Related methods have also led to significant progress in generating images. The limitations of current approaches and the challenges that lie ahead will both be emphasized.

  • David Meger, McGill Univ.

    "Learning and Adapting Robotic Behaviours"

    There is a great need for robots that can operate in the world's most challenging and dynamic environments, such as oceans and rivers, but the conditions often limit direct human intervention for programming and tuning. I will describe recent work that employs Reinforcement Learning (RL) techniques to allow a swimming robot to acquire and adapt movement skills without input from human engineers. A limiting factor in the success of RL for live robots is the number of trials and the computation required to produce effective policies. We overcome these challenges by employing a model-based policy search method, which is known to achieve excellent data efficiency. An efficient parametrization for swimming controllers, together with learning multiple tasks in parallel on cloud-computing resources, allows us to partially overcome computational limitations. Our results show that swimming gaits can be learned for the physical Aqua hexapod robot, with as few as 8 trials sufficient to learn a swimming task from scratch.

Medical Imaging

  • Tal Arbel, McGill Univ.

    "Iterative Hierarchical Probabilistic Graphical Model for the Detection and Segmentation of Multiple Sclerosis Lesions in Brain MRI"

    Probabilistic graphical models have been shown to be effective in a wide variety of segmentation tasks in the context of computer vision and medical image analysis. However, segmentation of pathologies can present a series of additional challenges to standard techniques. In the context of lesion detection and segmentation in brain images of patients with Multiple Sclerosis (MS), for example, challenges are numerous: lesions can be subtle, heterogeneous, vary in size and can be very small, often have ill-defined borders, with intensity distributions that overlap those of healthy tissues and vary depending on location within the brain. In this talk, recent work on multi-level, probabilistic graphical models based on Markov Random Fields (MRF) will be described to accurately detect and segment lesions and healthy tissues in brain images of patients with MS. Robustness and accuracy of the methods are illustrated through extensive experimentation on very large, proprietary datasets of real, patient brain MRI acquired during multicenter clinical trials. Recent work on the successful adaptation of the method to the problem of brain tumour detection and segmentation into sub-classes will also be discussed.
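    As a minimal illustration of MRF-based labelling (a toy, not the hierarchical model described above), the sketch below runs Iterated Conditional Modes on a binary label grid, trading a data term against an Ising-style smoothness term over 4-neighbourhoods:

    ```python
    def icm_denoise(labels, beta=1.0, unary=0.9, n_iters=5):
        """Iterated Conditional Modes on a binary grid: each pixel takes the
        label that best agrees with its noisy observation (unary term) and
        its 4-neighbours (pairwise smoothness term)."""
        h, w = len(labels), len(labels[0])
        obs = [row[:] for row in labels]      # noisy observations
        cur = [row[:] for row in labels]      # current labelling
        for _ in range(n_iters):
            for i in range(h):
                for j in range(w):
                    best, best_e = cur[i][j], float("inf")
                    for lab in (0, 1):
                        e = unary * (lab != obs[i][j])           # data term
                        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                            ni, nj = i + di, j + dj
                            if 0 <= ni < h and 0 <= nj < w:
                                e += beta * (lab != cur[ni][nj])  # smoothness
                        if e < best_e:
                            best, best_e = lab, e
                    cur[i][j] = best
        return cur

    # A 5x5 block of 1s with one flipped pixel; the MRF prior restores it.
    noisy = [[1] * 5 for _ in range(5)]
    noisy[2][2] = 0
    clean = icm_denoise(noisy)
    print(clean[2][2])
    ```

    Real lesion segmentation adds intensity likelihoods, multiple tissue classes, and the multi-level structure the talk describes; the energy trade-off, however, is the same.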

  • Mehran Ebrahimi, Univ. of Ontario Inst. of Tech.

    "Inverse Problems in Medical Image Processing"


    In many practical problems in the applied sciences, the features of most interest cannot be observed directly, but must be inferred from other, observable quantities. The problem of recovering an unknown object from such observed quantities is called an inverse problem. Many classical problems, including image reconstruction from samples, denoising, deblurring, segmentation, and registration, can be modelled as inverse problems.

    Generally, many real-world inverse problems are ill-posed, mainly because a unique solution may not exist. The procedure of providing acceptable unique solutions to such problems is known as regularization. Indeed, much of the progress in image processing in the past few decades has been due to advances in the formulation and practice of regularization. This, coupled with progress in optimization and numerical analysis, has yielded much improvement in computational methods for solving inverse imaging problems.
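    A classical instance of such regularization is Tikhonov-regularized least squares, min_x ||Ax - b||^2 + lam ||x||^2, solved via the normal equations (A^T A + lam I) x = A^T b. The toy 2x2 sketch below shows how a small lam stabilizes a nearly singular problem where noise would otherwise be wildly amplified:

    ```python
    def tikhonov_solve(A, b, lam):
        """Solve min ||Ax - b||^2 + lam ||x||^2 for a 2-variable problem via
        the normal equations (A^T A + lam I) x = A^T b (toy illustration)."""
        m = len(A)
        # Normal-equation matrix M = A^T A + lam I and right-hand side r = A^T b.
        M = [[sum(A[k][i] * A[k][j] for k in range(m)) + (lam if i == j else 0.0)
              for j in range(2)] for i in range(2)]
        r = [sum(A[k][i] * b[k] for k in range(m)) for i in range(2)]
        det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
        return [(M[1][1] * r[0] - M[0][1] * r[1]) / det,
                (M[0][0] * r[1] - M[1][0] * r[0]) / det]

    # Nearly singular forward operator: tiny data noise is hugely amplified
    # without regularization, and tamed by a small lam.
    A = [[1.0, 1.0], [1.0, 1.0001]]
    b_noisy = [2.0, 2.01]      # the noise-free data 2.0001 would give x = (1, 1)
    x_unreg = tikhonov_solve(A, b_noisy, 0.0)
    x_reg = tikhonov_solve(A, b_noisy, 1e-3)
    print(x_unreg, x_reg)
    ```

    The unregularized solution blows up to roughly (-98, 100), while the regularized one stays near the true (1, 1): the essence of trading a little bias for stability.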

    In this talk, we will review general theoretical and computational aspects of inverse theory. Furthermore, we will revisit a number of inverse problems including image registration and present some recent research ideas.

New Applications

  • Alexandra Albu, Univ. of Victoria

    "Computer Vision for Environmental Monitoring"

    Environmental monitoring encompasses a wide range of activities. For instance, conservation biologists are interested in monitoring species by taking photographs of individuals and tracking those individuals over time. Ecologists are interested in tracking the evolution of natural habitats, and in identifying which environmental factors (such as human presence or ambient noise) trigger significant changes in the behavior of certain species. Since a vast majority of the data gathered during environmental observations consists of images and videos, computer vision can play a major role in increasing the efficiency of data analysis and understanding. Computer vision algorithms for environmental monitoring applications face two main challenges. The first is the unstructured nature of the data, i.e. the large variability in the quality and content of acquired imagery. Images are typically acquired using a variety of sensors (mostly handheld), against different backgrounds and from different viewpoints. The second challenge is the expert nature of the knowledge that needs to be embedded in the algorithms. Untrained humans are not able to perform the tasks of ecologists, biologists, and geologists. This means that the design of computer vision algorithms for environmental applications needs to be approached from an interdisciplinary viewpoint. My talk will present several success stories of computer vision algorithms designed for specific tasks of environmental monitoring, such as species identification, species abundance estimation, identification of individual animals, animal behavior analysis, and change detection in mountain habitats. It will also outline several directions of future work in environmental monitoring that will hopefully inspire and motivate the audience to contribute to this emerging area of interdisciplinary research.

  • James J. Clark, McGill Univ.

    "Color Sensing and Display at Low Light Levels"

    The measurement and display of color information becomes challenging at low light levels. At low light levels sensor noise becomes significant, creating difficulties in estimating color quantities such as hue and saturation. On the display side, especially in mobile devices, low brightness levels are often desired in order to minimize power consumption. The human visual system perceives color differently at low light levels than at high light levels, and display devices operating at low brightness should account for this. This talk will cover recent developments in modeling low-light color perception in humans, and the application of these models to intelligent display technology.

Human Robot Interaction

  • Richard Vaughan, Simon Fraser Univ.

    "Multi-Human, Multi-Robot Interaction at Various Scales"

    Most HRI work is one-to-one face-to-face over a table. I'll describe my group's work on the rest of the problem, with multiple robots and people, indoors and outdoors, at ranges from 1m to 50m, on ground robots and UAVs. We have focused particularly on having humans and robots achieve mutual attention as a prerequisite for subsequent one-on-one interaction, and how uninstrumented people can create and modify robot teams. I'll also suggest that sometimes robots should not do as they are told, for the user's own good.

  • Elizabeth Croft, Univ. of British Columbia

    "Up close and personal with human-robot collaboration"

    Advances in robot control, sensing and intelligence are rapidly expanding the potential for close-proximity human-robot collaborative work. In many different contexts, from manufacturing assembly to home care settings, a robot's potential strength, precision and process knowledge can productively complement human perception, dexterity and intelligence to produce a highly coupled, coactive, human-robot team. Such interactions, however, require task-appropriate communication cues that allow each party to quickly share intentions and expectations around the task. These basic communication cues allow dyads, human-human or human-robot, to successfully and robustly pass objects, share spaces, avoid collisions and take turns. These are some of the basic building blocks of good, safe, and friendly collaboration regardless of one's humanity. In this talk we will discuss approaches to identifying, characterizing, and implementing communicative cues and validating their impact in human-robot interaction scenarios.

3D Vision

  • Michael Greenspan, Queen's Univ. (CANCELLED)

    "Two Approaches to Object Recognition in 3D Data, Without Descriptors"

    The availability of consumer-level range sensors has motivated the advancement of processing methods more traditional to 2D intensity data, such as object recognition. When using 2D data, the standard approach to object recognition is based upon establishing correspondences among highly descriptive features. With 3D range data, there is an alternative that relies only on the distributions of the points themselves. This talk presents two recent advances to descriptor-free object recognition and registration. The first approach is a generalization of the 4PCS method that benefits from taking 3D structure into account through the selection of non-planar four point bases. The second method identifies virtual interest points from the intersections of geometric primitives extracted in the underlying data sets. These virtual interest points are both repeatable and sparse enough to facilitate successful efficient RANSAC-style matching.
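    The notion of a virtual interest point can be illustrated with its simplest case: three planes extracted from the data intersect in a single repeatable point. The sketch below is an illustrative toy using Cramer's rule, not the method's implementation:

    ```python
    def plane_intersection(p1, p2, p3):
        """Virtual interest point: intersection of three planes, each given as
        (a, b, c, d) with ax + by + cz = d. Returns None when the planes are
        (near-)degenerate, e.g. two of them parallel."""
        rows = [list(p1[:3]), list(p2[:3]), list(p3[:3])]
        d = [p1[3], p2[3], p3[3]]

        def det3(m):
            return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                  - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                  + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

        D = det3(rows)
        if abs(D) < 1e-12:
            return None  # no stable intersection point

        def replace_col(c):  # Cramer's rule: swap column c for the rhs d
            return [[d[i] if j == c else rows[i][j] for j in range(3)]
                    for i in range(3)]

        return tuple(det3(replace_col(c)) / D for c in range(3))

    # Three walls of a box meet at the corner (1, 2, 3).
    corner = plane_intersection((1, 0, 0, 1), (0, 1, 0, 2), (0, 0, 1, 3))
    print(corner)
    ```

    Because such points are induced by large, stably detectable primitives rather than local texture, they are sparse and repeatable, which is what makes RANSAC-style matching over them efficient.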

  • Martin Jagersand, Univ. of Alberta

    "Local 3D vision using several different multiview geometry constraints"


    Most published 3D vision research reconstructs 3D data of a single form in a global coordinate system. Visual SLAM typically reconstructs a sparse map of landmark feature locations. Multiview stereo in computer vision typically reconstructs a dense point cloud. Popular active sensors (e.g., the Kinect) provide dense depth data.

    By contrast, often only specific measurements are needed. A human surveyor makes just the needed measurements using geometric principles very similar to those of computer vision. Likewise, in many applications we need to be aware of only a small subset of the environment; e.g., a robot needs to know imminent collision risks and motion goals. This information can be extracted from a global 3D reconstruction, but it can also be modeled and measured individually.

    In this lecture I will cover basic multiview geometry in Euclidean, affine, and projective formulations. I will then show how to use this, instead of a dense 3D reconstruction in a global coordinate system, to create geometric measurements of different types. These are generally relative rather than expressed with respect to a global map, are independent of each other, and can often be more accurate than a single global coordinate representation. Measurements based on different geometries can be combined. I will show their use in applications: for example, improving the convergence of video tracking by constraining several individual 2D trackers to obey multiview geometry constraints, or defining target alignment goals and obstacle avoidance for robots. I will show videos of several application examples.

Links to Previous Conferences

This page archives historical content from past CRV meetings. A second source for some of this information is maintained at the CIPPRS website.

Photo Credit:
Description : 2009-0605-Victoria-Harbor-PAN
Credit : © Bobak Ha'Eri - Own work. Licensed under CC BY 3.0 via Wikimedia Commons