University of Texas
ECE

A Syntax for Image Understanding

Part of Seminar Series: ECE Distinguished Lecture Series

Date: Friday, April 3, 2009
Time: 1 p.m.
Location: ENS 637

Dr. Narendra Ahuja
Professor
University of Illinois at Urbana-Champaign

Abstract

Can we define a relatively general-purpose image representation that would serve as the syntax for the diverse needs of image understanding? What makes a good image syntax? How do we evaluate it? In this talk, we present partial answers to these and related questions. The syntax we present is called the Connected Segmentation Tree (CST), defined in terms of image regions, or segments. It captures the recursive embedding of all regions, their geometric and photometric properties, and their spatial layout. We describe the derivation of the CST from images. We discuss its invariance to changes in imaging conditions (e.g., lighting, scale, orientation), and its ability to isolate and simplify inference of semantics, as would be expected from any syntax. We present our evaluation of the CST through its performance on the following basic recognition problems.

As the first problem, we wish to discover a priori unknown themes that may characterize a given, random or strategically chosen, set of images. If objects from certain categories occur frequently in the set, we say that the categories constitute the theme. No specific categories are specified by the user; indeed, they are not even known to the user a priori. Whether, how many, or where instances of any categories appear in a specific image is also not known. To this end, we develop answers to the following basic questions. What is an object category? Is human supervision necessary to communicate the nature of categories to a computer vision system and, if so, to what extent? What properties should be used to define a good category representation? We define an object category as consisting of (2D) subimages that have similar photometric, geometric and topological properties. We pose the following subproblems: (1) Discovering whether any categories occur in the image set. (2) Building a compact model that captures the intrinsic nature of the categories. (3) Learning the relationships among the different categories, thus building a taxonomy of all discovered categories. (4) Using the learned taxonomy to recognize all occurrences of all categories in previously unseen images. (5) Segmenting each occurrence. (6) Explaining and articulating the reasons for recognition. We present solutions to (1-6) that are almost completely unsupervised.

The general nature of (1-6) helps extend their solutions to detecting themes of other kinds. As the second problem, we present one such extension, that of identifying and extracting stochastically repeating parts of visual textures, commonly called texture elements. We evaluate the performance of CST here through the quality of detected elements in real-world textures.

Speaker Biography

Narendra Ahuja received the B.E. degree with honors in electronics engineering from the Birla Institute of Technology and Science, Pilani, India, in 1972, the M.E. degree with distinction in electrical communication engineering from the Indian Institute of Science, Bangalore, India, in 1974, and the Ph.D. degree in computer science from the University of Maryland, College Park, USA, in 1979. From 1974 to 1975 he was Scientific Officer in the Department of Electronics, Government of India, New Delhi. From 1975 to 1979 he was at the Computer Vision Laboratory, University of Maryland, College Park. Since 1979 he has been with the University of Illinois at Urbana-Champaign, where he is currently Donald Biggar Willett Professor in the Department of Electrical and Computer Engineering, the Beckman Institute, and the Coordinated Science Laboratory.

His current research is focused on extraction and representation of spatial structure in images and video; integrated use of multiple image-based sources for scene representation and recognition; versatile sensors for computer vision; and applications including visual communication, image manipulation, and information retrieval.

He received the 1999 Emanuel R. Piore Award of the IEEE, the 1998 Technology Achievement Award of the International Society for Optical Engineering, the 2008 TA Stewart-Dyer/Frederick Harvey Trevithick Prize of the Institution of Mechanical Engineers, and a 2008 Open Innovation Research Award from Hewlett-Packard. He was selected as Associate for 1998-99 and 2006-07 and Beckman Associate for 1990-91 in the University of Illinois Center for Advanced Study. He received the Distinguished Alumnus Award from the University of Maryland Department of Computer Science (2008), the Best Paper Award from IEEE Transactions on Multimedia (2006), the University Scholar Award (1985), the Presidential Young Investigator Award (1984), a National Scholarship (1967-72), and the President's Merit Award (1966). He has co-authored the books Pattern Models (Wiley, 1983), Motion and Structure from Image Sequences (Springer-Verlag, 1992), and Face and Gesture Recognition (Kluwer, 2001), and co-edited the book Advances in Image Understanding (IEEE Press, 1996). He is a fellow of the IEEE, the American Association for Artificial Intelligence, the International Association for Pattern Recognition, the Association for Computing Machinery, the American Association for the Advancement of Science, and the International Society for Optical Engineering. He is on the editorial boards of the journals IEEE Transactions on Pattern Analysis and Machine Intelligence; Computer Vision, Graphics, and Image Processing; Journal of Mathematical Imaging and Vision; Journal of Pattern Analysis and Applications; Int. Journal of Imaging Systems and Technology; Journal of Information Science and Technology; and IEE Japan Transactions on Electrical and Electronic Engineering; and is a guest coeditor of the Artificial Intelligence Journal's special issue on vision. He was the Founding Director of the International Institute of Information Technology, Hyderabad, where he continues to serve as Director International.