A Guide to Convolutional Neural Networks for Computer Vision

Salman Khan, Hossein Rahmani, Syed Afaq Ali Shah, Mohammed Bennamoun
ISBN: 9781681730219 | PDF ISBN: 9781681730226
Hardcover ISBN: 9781681732787
Copyright © 2018 | 210 Pages | Publication Date: February, 2018

Computer vision has become increasingly important and effective in recent years due to its wide-ranging applications in areas as diverse as smart surveillance and monitoring, health and medicine, sports and recreation, robotics, drones, and self-driving cars. Visual recognition tasks, such as image classification, localization, and detection, are the core building blocks of many of these applications, and recent developments in Convolutional Neural Networks (CNNs) have led to outstanding, state-of-the-art performance in these visual recognition tasks and systems. As a result, CNNs now form the crux of deep learning algorithms in computer vision.

This self-contained guide will benefit those who seek both to understand the theory behind CNNs and to gain hands-on experience in applying CNNs to computer vision. It provides a comprehensive introduction to CNNs, starting with the essential concepts behind neural networks: training, regularization, and optimization of CNNs. The book also discusses a wide range of loss functions, network layers, and popular CNN architectures, reviews the different techniques for the evaluation of CNNs, and presents some popular CNN tools and libraries that are commonly used in computer vision. Further, this text describes and discusses case studies related to the application of CNNs in computer vision, including image classification, object detection, semantic segmentation, scene understanding, and image generation.

This book is ideal for undergraduate and graduate students, since no prior background knowledge in the field is required to follow the material. It is equally suited to new researchers, developers, engineers, and practitioners who are interested in gaining a quick understanding of CNN models.

Table of Contents

Features and Classifiers
Neural Networks Basics
Convolutional Neural Network
CNN Learning
Examples of CNN Architectures
Applications of CNNs in Computer Vision
Deep Learning Tools and Libraries
Authors’ Biographies

About the Author(s)

Salman Khan, Data61-CSIRO and Australian National University
Salman Khan received a B.E. in Electrical Engineering from the National University of Sciences and Technology (NUST) in 2012 with high distinction, and a Ph.D. from The University of Western Australia (UWA) in 2016. His Ph.D. thesis received an Honorable Mention on the Dean’s List Award. In 2015, he was a visiting researcher with National ICT Australia, Canberra Research Laboratories. He is currently a Research Scientist with Data61, Commonwealth Scientific and Industrial Research Organization (CSIRO), and has been an Adjunct Lecturer with Australian National University (ANU) since 2016. He was awarded several prestigious scholarships, such as the International Postgraduate Research Scholarship (IPRS) for his Ph.D. and the Fulbright Scholarship for his M.S. He has served as a program committee member for several leading computer vision and robotics conferences such as IEEE CVPR, ICCV, ICRA, WACV, and ACCV. His research interests include computer vision, pattern recognition, and machine learning.

Hossein Rahmani, The University of Western Australia
Hossein Rahmani received his B.Sc. in Computer Software Engineering in 2004 from Isfahan University of Technology, Isfahan, Iran, and his M.Sc. in Software Engineering in 2010 from Shahid Beheshti University, Tehran, Iran. He completed his Ph.D. at The University of Western Australia in 2016. He has published several papers in top conferences and journals such as CVPR, ICCV, ECCV, and TPAMI. He is currently a Research Fellow in the School of Computer Science and Software Engineering at The University of Western Australia. He has served as a reviewer for several leading computer vision conferences and journals such as IEEE TPAMI and CVPR. His research interests include computer vision, action recognition, 3D shape analysis, and machine learning.

Syed Afaq Ali Shah, The University of Western Australia
Syed Afaq Ali Shah received his B.Sc. and M.Sc. degrees in Electrical Engineering from the University of Engineering and Technology (UET) Peshawar, in 2003 and 2010, respectively. He obtained his Ph.D. from The University of Western Australia in the area of computer vision and machine learning in 2016. He is currently working as a Research Associate in the School of Computer Science and Software Engineering at The University of Western Australia, Crawley, Australia. He has been awarded the “Start Something Prize for Research Impact through Enterprise” for a 3D facial analysis project funded by the Australian Research Council. He has served as a program committee member for ACIVS 2017. His research interests include deep learning, computer vision, and pattern recognition.

Mohammed Bennamoun, The University of Western Australia
Mohammed Bennamoun received his M.Sc. from Queen’s University, Kingston, Canada, in the area of Control Theory, and his Ph.D. from Queen’s/QUT in Brisbane, Australia, in the area of Computer Vision. He lectured in robotics at Queen’s, and then joined QUT in 1993 as an associate lecturer. He is currently a Winthrop Professor. He served as the Head of the School of Computer Science and Software Engineering at The University of Western Australia (UWA) for five years (February 2007 to March 2012). He served as the Director of a University Centre at QUT, the Space Centre for Satellite Navigation, from 1998 to 2002. He served as a member of the Australian Research Council (ARC) College of Experts from 2013 to 2015. He was an Erasmus Mundus Scholar and Visiting Professor in 2006 at the University of Edinburgh.

He was also a visiting professor at CNRS (Centre National de la Recherche Scientifique) and Telecom Lille1, France in 2009, the Helsinki University of Technology in 2006, and the University of Bourgogne and Paris 13 in France in 2002-2003. He is the co-author of the books Object Recognition: Fundamentals and Case Studies and Ontology Learning and Knowledge Discovery Using the Web, the latter published in 2011. Mohammed has published over 100 journal papers and over 250 conference papers, and secured highly competitive national grants from the ARC, government, and other funding bodies. Some of these grants were in collaboration with industry partners (through the ARC Linkage Project scheme) to solve real research problems for industry, including Swimming Australia, the West Australian Institute of Sport, a textile company (Beaulieu Pacific), and AAM-GeoScan.
He worked on research problems and collaborated (through joint publications, grants, and supervision of Ph.D. students) with researchers from different disciplines including animal biology, speech processing, biomechanics, ophthalmology, dentistry, linguistics, robotics, photogrammetry, and radiology. He has collaborated with researchers from within Australia (e.g., CSIRO), as well as internationally (e.g., Germany, France, Finland, U.S.). He won several awards, including the Best Supervisor of the Year Award at QUT in 1998, an award for teaching excellence (research supervision), and the Vice-Chancellor’s Award for Research Mentorship in 2016. He also received an award for research supervision at UWA in 2008. He has served as a guest editor for a couple of special issues in international journals, such as the International Journal of Pattern Recognition and Artificial Intelligence (IJPRAI). He was selected to give conference tutorials at the European Conference on Computer Vision (ECCV), the International Conference on Acoustics, Speech and Signal Processing (IEEE ICASSP), the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016), Interspeech (2014), and a course at the International Summer School on Deep Learning (DeepLearn2017).
He has organized several special sessions for conferences, including a special session for the IEEE International Conference on Image Processing (IEEE ICIP). He was on the program committee of many conferences, e.g., 3D Digital Imaging and Modeling (3DIM) and the International Conference on Computer Vision. He also contributed to the organization of many local and international conferences. His areas of interest include control theory, robotics, obstacle avoidance, object recognition, machine/deep learning, signal/image processing, and computer vision (particularly 3D).
