Cover: Computational Models for Cognitive Vision, by Hiranmay Ghosh

IEEE Press
445 Hoes Lane
Piscataway, NJ 08854

IEEE Press Editorial Board
Ekram Hossain, Editor in Chief

Jón Atli Benediktsson, David Alan Grier, Elya B. Joffe,
Xiaoou Li, Peter Lian, Andreas Molisch,
Saeid Nahavandi, Jeffrey Reed, Diomidis Spinellis,
Sarah Spurgeon, Ahmet Murat Tekalp

Computational Models for Cognitive Vision

 

Hiranmay Ghosh

 

Ex-Advisor, TCS Research

 

 

 

 

 


About the Author


Hiranmay Ghosh is a researcher in Computer Vision, Artificial Intelligence, Machine Learning, and Cognitive Computing. He received his Ph.D. degree from the Electrical Engineering Department of IIT Delhi and his B.Tech. degree in Radiophysics and Electronics from the University of Calcutta.

Hiranmay was a research adviser with Tata Consultancy Services. He has been associated with R&D and engineering activities for more than 40 years in industry and in autonomous research laboratories. He has been invited to teach at the Indian Institute of Technology Delhi and the National Institute of Technology Karnataka as Adjunct Faculty. He is also a co-author of the book Multimedia Ontology: Representation & Applications.

He is a Senior Member of IEEE, Life Member of IUPRAI, and a Member of ACM.

Acknowledgments

This book is an outcome of my studies in cognitive vision and presents an overview of the subject. At the outset, I am indebted to all the researchers who have toiled to make the subject grow. While I have cited much of their research, I could not do justice to all of it within the finite bounds of this book. Further, my thanks go to the management of arXiv, ACM, IEEE, ResearchGate, and other digital repositories, without which access to the research papers would not have been possible.

Writing the book provided me with an opportunity to interact with several researchers in the field. My special thanks go to the researchers who engaged in discussions, forwarded their research material, and happily consented to my request to reproduce it in the book. They include Aude Oliva, Calden Wloka, Charles Kemp, David Vernon, Guanbin Li, Guilin Liu, Harald Haarmann, J.K. Aggarwal, Jason Yosinski, John Tsotsos, Marcin Andrychowicz, Nicholas Wade, Pritty Patel-Grosz, Roberto Cipolla, Rudra Poudel, S.M. Reza Soroushmehr, Sangho Park, Stan Franklin, Sumit Chopra, Ulrich Engelke, and V. Badrinarayan. Further, many illustrations in the book have been borrowed from the Wikimedia Commons library; I thank their authors for creating those diagrams and licensing them for general use, and the Wikimedia management for providing a platform for sharing them.

I thank the management of the TCS Research team and my colleagues, especially K. Ananth Krishnan, P. Balamurali, and Hrishikesh Sharma, for providing me with an environment and support for conducting my studies. I thank Prof. V.S. Subrahmanian, Prof. Ramesh Jain, and Prof. Santanu Chaudhury for encouraging me to write the book.

My thanks go to the management and the production team of IEEE Press and John Wiley & Sons for taking up the publication of the book. Special mention goes to Mary Hatcher, Victoria Bradshaw, Louis V. Manoharan, and Gayathree Sekar for providing me with the necessary support at various stages of authoring and production. Finally, I thank my spouse Sharmila for encouraging and supporting me at every stage of my research and the authoring of this book, and for selecting an appropriate cover for it.

–Hiranmay Ghosh

Preface

As I started my research on cognitive vision a few years back, it became apparent that there was no up-to-date textbook on the subject. There has been tremendous research on cognitive vision in recent years, and rich material lies scattered in myriad articles and other scientific publications. This was my prime motivation for compiling the material into a coherent narrative in the form of this book, which may initiate a reader into the various aspects of the subject.

As I proceeded with my research, I realized that cognitive vision is still an immature technology. It is struggling to attain its ambitious goal of achieving the versatility and capabilities of the human vision system. It was also evident that the scope of cognitive vision is ill-defined. There is not just one single way to emulate human vision, and researchers have trodden diverse paths. The gamut of research appears like islands in an ocean, a good part of which is yet to be traversed. This posed a formidable difficulty in organizing the book in a linear and cohesive manner. The sequence that I finally settled on is one of many possible alternatives.

This book does not represent my contribution to the subject, but collates the work of many researchers to create a coherent narrative with wide coverage. It is primarily intended for academic as well as industry researchers who want to explore the arena of cognitive vision and apply it to real-life problems. The goal of the book is to demystify many of the mysteries that the human visual system holds, and to provide computational models of them that can be realized in artificial vision systems. Since cognitive vision is a vast subject, it has not been possible to cover it exhaustively in the finite expanse of this book. To overcome this shortcoming, I have tried to provide as many references as possible for readers to explore the subject further. I have consciously given preference to surveys and reviews that provide many more pointers to the rich research on the individual topics. Further, since cognitive vision is a fast-growing research area, I have tried to cover as much recent research as possible, without compromising on the classical texts on the subject. Nevertheless, these citations are not exhaustive, but provide just a sample of the major research directions.

Use of this book as course material on the subject is also envisaged. However, my suggestion would be to restrict the number of topics and to expand on them. In particular, the realization of cognitive vision through deep learning in emergent architectures, which is briefly reviewed in Chapter 8, can be a subject in itself and be dealt with independently.

I hope that the book will be beneficial to the academic as well as the industrial community for a significant period of time to come.

Hiranmay Ghosh

Acronyms

AB-RNN attention-based recurrent neural network
AGC automatic gain control
AGI artificial general intelligence
AI artificial intelligence
AIM attention-based information maximization
ANN artificial neural network
AUC-ROC area under the curve – receiver operating characteristics
AVC advanced video coding
BN Bayesian network
BNN Bayesian neural network
BRNN bi-directional recurrent neural network
CAM content addressable memory
CIE international commission on illumination
CNN convolutional neural network
CP Cognitive Program
CR consequential region
CRF conditional random field
CSM current situational model
DBN dynamic Bayesian network
DL deep learning
DL description logics
DNN deep neural network
DoG difference of Gaussians
FCNN fully convolutional neural network
FDM fixation density map
FoL first-order logic
FR full reference (visual quality assessment)
GAN generative adversarial network
GNN graph neural network
GPU graphics processing unit
GW global workspace
HBM hierarchical Bayesian model
HBN hierarchical Bayesian network
HDR high dynamic range
HMM hidden Markov model
HRI human–robot interaction
HSE human social environment
HSV hue-saturation-value
HVS human vision system
ICD Indian classical dance
ICH intangible cultural heritage
KLD Kullback–Leibler divergence
LDA latent Dirichlet allocation
LDR low dynamic range
LIDA Learning Intelligent Distribution Agent
LoG Laplacian of Gaussian
LOTH language of thought hypothesis
LSTM long short-term memory
LTM long-term memory
MCMC Markov chain Monte Carlo
MEBN multi-entity Bayesian network
MOWL Multimedia Web Ontology Language
MSE mean square error
MTL multitask learning
NLP natural language processing
NR no reference (visual quality assessment)
NSS natural scene statistics
NTM neural Turing machine
OWL web ontology language
PAM perceptual associative memory
PLCC Pearson linear correlation coefficient
PSNR peak signal-to-noise ratio
RAM recurrent attention model
RBS rule-based systems
RGB red–green–blue
RNN recurrent neural network
RR reduced reference (visual quality assessment)
RTM representational theory of mind
SALICON SALIency in CONtext
SGP symbol grounding problem
SLAM simultaneous localization and mapping
SMC sensori-motor contingencies
SSIM structural similarity index measure
ST selective tuning (attention model)
STAR selective tuning attentive reference
STM short-term memory
SURF speeded up robust features
SWRL semantic web rule language
TCS Tata Consultancy Services
VQA visual query answering
W3C World-Wide Web Consortium
WTA winner-take-all

1
Introduction

The human vision system (HVS) has a remarkable capability of building three-dimensional models of the environment from the visual signals received through the eyes. The goal of computer vision research is to emulate this capability on man-made apparatus, such as computers. The twentieth century saw tremendous growth in the field of computer vision. Starting with signal processing techniques for demarcating objects in the space–time continuum of visual signals, the field has embraced several other disciplines, such as artificial intelligence and machine learning, for interpreting visual content. As research in computer vision matured, it was pushed toward the turn of the century to address several real-life problems. Examples of such challenging applications include visual surveillance, medical image analysis, computational photography, digital heritage, robotic navigation, and so on.

Though computer vision has shown extremely promising results in many applications in restricted domains, its performance lags that of the HVS by a large margin. While the HVS can effortlessly interpret complex scenes, e.g. those shown in Figure 1.1, artificial vision fails to do so. It is “intuitive” for humans to comprehend the semantics of the scenes at multiple levels of abstraction, and to predict the next movements with some degree of certainty. Derivation of such semantics remains a formidable challenge for artificial vision systems. Further, many real-life applications demand analysis of imperfect imagery, for example with poor lighting, blur, occlusions, noise, background clutter, and so forth. While human vision is robust to such imperfections, computer vision systems often fail to perform in such cases. These observations have motivated deeper study of the HVS and the application of the principles involved to computer vision.


Figure 1.1 Hard challenges for computer vision. (a) “The offensive player … is about to shoot the ball at the goal …” (b) A facial expression in Bharatnatyam dance.

Source: File shared by Rick Dikeman through Wikimedia Commons, file name: Football_iu_1996.jpg.

Source: File shared by Suyash Dwivedi through Wikimedia Commons, file name: Bharatnatyam_different_facial_expressions_(9).jpg.

1.1 What Is Cognitive Vision

Though there is broad agreement in the scientific community that cognitive vision pertains to the application of the principles of biological (especially human) vision systems to computer vision applications, the space of cognitive vision studies is not well defined (Vernon 2006). The boundary between vision and cognition is thin, and cognitive vision operates in that gray area. Broadly speaking, cognitive vision involves the abilities to survey a visual scene, recognize and locate objects of interest, act on visual stimuli, learn and generate new knowledge, dynamically update a visual map that represents the reality, and so on. Perception and reasoning are the two important pillars on which cognitive vision stands. A crucial point is that the entire gamut of activities must run in real-time to enable an agent to engage with the real world. It is an emerging area of research integrating methodologies from various disciplines, such as artificial intelligence, computer vision, machine learning, cognitive science, and psychology. There is no single approach to cognitive vision, and the proposed solutions to the different problems appear like islands in an ocean. In this book, we have attempted to put together computational theories for a set of cognitive vision problems and organized them in an attempt to develop a coherent narrative for the subject. We shall get more insight into what cognitive vision is as we proceed through the book, and shall characterize it in clearer terms in Chapter 10.

1.2 Computational Approaches for Cognitive Vision

Two branches of science have significantly contributed to the understanding of the processes for cognition from visual as well as other sensory signals. One of them is psychophysics, which is defined as the “study of quantitative relations between psychological events and physical events or, more specifically, between sensations and the stimuli that produce them” (Encyclopedia Britannica). The subject was established by Gustav Fechner and is a marriage between the study of sensory processes and that of physical stimuli. The other branch of science that has facilitated our understanding of perception and cognition is neurophysiology, which combines physiology and the neural sciences for an understanding of the functions of the nervous system. The two approaches are complementary. While psychophysics answers what happens during cognition, neurophysiology explains how it is realized in the biological nervous system.

Researchers on cognitive vision have long recognized it as an information processing activity of the biological neural system. However, a formal computational approach to understanding cognition has been a fundamental contribution of David Marr (1976). Marr abstracted vision into three separable layers, namely (i) hardware, (ii) representation and algorithms, and (iii) computational theory. This abstraction enables computational theories of cognitive vision to be formulated independently of their implementation in the biological vision system. It also provides a basis for realizing cognitive functions in artificial systems made up of altogether different hardware, possibly using different representations and algorithms. Further, Marr's model of vision assumes modularity and a pipelined architecture, two important properties of information processing systems that allow independent formulation of the different cognitive processes with defined interfaces. Marr identifies three stages of processing for vision. The first involves finding the basic contours that mark the object boundaries. The second stage results in the discovery of the surfaces and their orientations, producing an observer-centric 2½D model. The third involves knowledge-based interpretation of the model into an observer-neutral set of objects that constitute the 3D environment. These three stages roughly correspond to the early vision, perception, and cognition stages of vision, as recognized in the modern literature, which we shall describe shortly.

As suggested by David Marr, it is possible to study computational theories of cognitive vision in isolation from the biological systems, and we propose to do exactly that in this book. However, such computational models need to explain the what part of cognition. For that purpose, we shall refer to the results of psychophysical experiments, wherever relevant, without going into the details of the experimental setups. Further, though the goal of computational modeling is to support alternative (artificial) implementations of cognition that need not be based on biological implementation models, analysis of the latter often provides clues to plausible implementation schemes. We shall discuss the results of some relevant neurophysiological studies in the book. We shall consciously keep such discussions at a superficial level, so that the text can be followed without a deep knowledge of either psychology or the neurosciences.

1.3 A Brief Review of Human Vision System

In this section, we briefly look into how human vision works, in order to put the rest of the text in this book in context. A broad overview of the HVS is presented in Figure 1.2. It comprises a pair of eyes connected to the brain via the optic nerves. When one looks at a scene, the light rays enter the eyes to form a pair of inverted images on screens at the back of the eyes, known as the retinas. This corresponds to a mapping of the external 3D world to a pair of 2D images with slightly different perspectives. Internal representations of the images are transmitted by the optic nerve fibers to the visual cortex at the rear end of the brain, where the images are correlated and interpreted to reconstruct a symbolic description of the 3D world.

In this simple model of biological vision, the eyes primarily act as the image capture devices in the system, and the brain as the interpreter. In reality, things are much more complex. The output from the eyes is not a faithful reproduction of the images received. Significant transformations take place on the retina, which enable efficient identification of object contours and their movements. These transformations are collectively referred to as early vision. Further processing in the neural circuits of the brain, which results in interpretation of the signals received from the eyes, is known as late vision. The goal of late vision is to establish the what and where of the objects located in the scene. It is believed that there are two distinct pathways in the human brain, ventral and dorsal, through which visual information is processed to answer these two aspects of vision (Milner and Goodale 1995). This has been emulated in several artificial vision systems, as we shall see in the following chapters of this book.

One of the initial tasks of the late vision system is to correlate the images received from the two eyes, which is facilitated by the criss-cross connection of the optic nerves connecting the eyes with the brain. Further, the late vision system achieves progressive abstraction of the correlated images, leading to perception and cognition, which we discuss in some detail in Section 1.4.
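Correlating the two slightly different retinal images is what makes depth recovery possible. As a toy illustration, not drawn from this book, the following sketch uses an idealized pinhole-camera model: a point at depth Z projects onto the two retinas with a horizontal disparity d = f·b/Z, where f is the focal length and b is the interocular baseline, so measuring d recovers Z = f·b/d. The numerical values are merely indicative.

```python
# Toy stereo geometry: two pinhole "eyes" separated by baseline b view
# a point at depth Z; the difference between its two image positions
# (the disparity) encodes the depth.

def project(X, Z, f, eye_x):
    """Pinhole projection of world point (X, Z) onto the retina of an
    eye located at horizontal position eye_x."""
    return f * (X - eye_x) / Z

def depth_from_disparity(f, b, d):
    """Recover depth from a measured disparity d (d = f * b / Z)."""
    return f * b / d

f, b = 0.017, 0.065   # ~17 mm eye focal length, ~65 mm interocular distance
X, Z = 0.5, 2.0       # a point half a metre to the right, 2 m away

x_left = project(X, Z, f, eye_x=-b / 2)
x_right = project(X, Z, f, eye_x=+b / 2)
d = x_left - x_right                   # disparity found by correlating images
print(depth_from_disparity(f, b, d))   # recovers Z ≈ 2.0
```

The correlation step that the brain performs, matching corresponding patterns in the two images to measure d, is the computationally hard part; the geometry above only shows why the measurement is worth making.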


Figure 1.2 An overview of human vision system.

Source: Derivative work from file shared by Wiley through Wikimedia Commons, file name: Wiley_Human_Visual_System.gif.

1.4 Perception and Cognition

The first step in interpreting retinal images involves organization of the visual data, the isolated patterns on the retina, to create a coherent interpretation of the environment. This stage is known as perception. Though we shall focus on visual perception in this book, biological perception generally results in a coordinated organization of inputs from all the sensory organs. For example, human beings create a coordinated interpretation of visual and haptic signals while grabbing an object. For an artificial agent, for example a driverless car, perception involves all the sensors that it is equipped with. In philosophical terms, perception is about asserting a truth about the environment by processing sensory data. However, the “assertion” by an agent can be different from the reality, e.g. a vehicle seen through the convex side-view mirror of a car may be perceived to be farther than it actually is. Such “erroneous” perceptions often lead to illusions, some of which we shall discuss in Chapters 2 and 4 of this book. Some authors prefer to include a capability to respond to the percepts in the connotation of perception.

Cognition refers to an experiential interpretation of the percepts. It involves reasoning about the properties of the percepts with the background knowledge and experience that an agent possesses. Depending on the knowledge level of the agent, there can be many different levels of interpretation of the percepts. For example, Figure 1.1b can be interpreted in many ways with progressive levels of abstraction, such as a human face, a classical dance form, or an emotion expressed. Cognition may also result in “correcting” erroneous perceptions using specific domain knowledge. For example, knowledge of the properties of a convex mirror results in a more realistic estimate of the distance of an object seen through the side-view mirror of a car. Cognition involves the intentional state of an agent as well. For example, while driving an automobile, a driver analyzes the visual (and audio) percepts with the objective of reaching the destination while ensuring safety and complying with the traffic rules. In the process, the driver may focus on the road in front and the traffic lights, ignoring other percepts, such as the signage on the shop-fronts bordering the street. Such selective filtering of sensory data is known as attention. It is necessary to prevent the cognitive agent from being swamped by a huge volume of information that it cannot process.

Thus, we find that cognition involves not only the interpretation of sensory signals but also many other factors, such as the intention, knowledge, attention, and memory of an agent. Moreover, the knowledge of a cognitive agent needs to be continuously updated for it to adapt to new environments and to respond to yet unforeseen situations. For example, while driving on a hilly road, a city driver needs to quickly learn the specific skills of hill driving to ensure a safe journey. The process through which the knowledge is updated is called learning, and it is a critical requirement for a real-life agent.


Figure 1.3 A simple process model in a cognitive system.

The fundamental difference between perception and cognition is that the former results in the acquisition of new information through the sensory organs, while the latter is the process of experiential analysis of the acquired information with some intention. There is, however, a strong interaction between the two processes. Percepts, filtered through the attention mechanism, enter the cognitive process. On the other hand, cognitive interpretation of the percepts results in signals to control further data acquisition and perception. This ensures need-based, just-in-time visual data collection based on the intention of a cognitive agent, which is also known as active vision. Moreover, discovery of new semantic patterns through the process of cognition leads to updates in the knowledge store of an agent. A simplified process model of a cognitive system is shown in Figure 1.3.
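The perception–attention–cognition loop described above can be caricatured in a few lines of Python. This is a deliberately naive sketch, not an implementation from the literature: all class names, thresholds, and labels are hypothetical, attention is reduced to a salience threshold biased by the agent's goal, and learning and active-vision feedback are reduced to single assignments.

```python
# A minimal, hypothetical sketch of the process model of Figure 1.3:
# percepts are filtered by attention, interpreted against stored
# knowledge (cognition), novel patterns update the knowledge store
# (learning), and the interpretation feeds back to steer further data
# acquisition (active vision).
from dataclasses import dataclass, field

@dataclass
class Percept:
    label: str        # what was sensed
    salience: float   # bottom-up conspicuity, in [0, 1]

@dataclass
class CognitiveAgent:
    goal: str                                      # intentional state
    knowledge: dict = field(default_factory=dict)  # label -> interpretation
    focus: float = 0.5                             # attention threshold

    def attend(self, percepts):
        # Attention: keep only salient or goal-relevant percepts
        return [p for p in percepts
                if p.salience >= self.focus or p.label == self.goal]

    def interpret(self, percept):
        # Cognition: experiential interpretation using background knowledge
        return self.knowledge.get(percept.label, "unknown")

    def step(self, percepts):
        results = [(p.label, self.interpret(p)) for p in self.attend(percepts)]
        for label, meaning in results:
            if meaning == "unknown":
                self.knowledge[label] = "learned-pattern"     # learning
                self.focus = max(0.1, self.focus - 0.1)       # widen attention
        return results

agent = CognitiveAgent(goal="traffic-light",
                       knowledge={"traffic-light": "stop-or-go signal"})
scene = [Percept("traffic-light", 0.3), Percept("shop-sign", 0.2),
         Percept("pedestrian", 0.8)]
print(agent.step(scene))
# [('traffic-light', 'stop-or-go signal'), ('pedestrian', 'unknown')]
```

Note how the low-salience shop sign is filtered out by attention, the goal-relevant traffic light passes despite its low salience, and the unrecognized pedestrian both enters the knowledge store and lowers the attention threshold for subsequent data acquisition.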

1.5 Organization of the Book

The characterization of cognitive vision and its various stages presented above, sets the background of the rest of this book. We begin with early vision system in Chapter 2, where we describe the transformations that an image goes through by the actions of the neural cells on the retina. In Chapter 3, we introduce Bayesian reasoning framework, which will be used to explain many of the perceptual and cognitive processes in the later chapters. We explain several perceptual and cognitive processes in Chapter 4. Chapter 5 deals with visual attention, the gateway between the world of perception and the world of cognition.

While the earlier chapters describe the individual processes of perception and cognition, they need to be integrated in an architectural framework for realization of cognitive systems. We characterize cognitive architectures, discuss their generic properties, and review a few popular and contemporary architectures as examples, in Chapter 6. While the architectures provide generic cognitive capabilities and interaction with the environment, we focus on the functions for cognitive vision in these architectures. Knowledge is a crucial ingredient of a cognitive system, and we introduce classical approaches to its representation in Chapter 7.

There is a huge corpus of recent research that attempts to emulate the biological vision system with artificial neural networks and aims to learn the cognitive processes with deep learning techniques. A discourse on cognitive vision cannot be complete without them. We present a cross-section of this research in Chapter 8. In this chapter, we elaborate on the various modes of learning capability that a real-life agent needs to possess, and that have been realized with deep learning techniques.

We discuss a few real-life applications for visual cognition in Chapter 9 and illustrate the use of the principles of cognitive vision. In Chapter 10, we take a look through a rear-view mirror to review what we studied, which enables us to characterize cognitive vision in more concrete terms. Further, we compare the two complementary paradigms of cognition, namely classicist and connectionist approaches, and discuss a possible synergy between the two that may be on the anvil.

Finally, a few words about the content of the book. The computational theory of cognitive vision is a vast subject, and it is not possible to cover all of it within the extent of one book. I have tried to condense as much information as possible into this book, without sacrificing understandability, and have provided ample references for interested readers to explore the subject further. While providing the citations, I have given preference to authentic reviews and tutorials that should enable a reader to get an overview of the subject, and which may lead an inquisitive reader to many other relevant research publications. Also, cognitive vision being a rapidly evolving subject, I have tried to cover as much recent material as possible, without ignoring the classic texts on the subject. Though I focus on cognitive vision, many of the principles of perception and cognition discussed in the book are not exclusive to the visual system alone, but hold good for other sensory mechanisms as well.