1. Introduction Deep Neural Networks (DNNs) have been the de facto infrastructure in visual representation learning, which plays a pivotal role in the context of computer vision. Inspired by mammalian visual systems, Convolutional Neural Networks (CNNs), are first designed to capture neighborhood correlations hidden in observed images with convolutional inductive bias and nonlinear activations.