Deep learning is a prominent topic in machine learning and plays an increasingly important role in intelligent systems that touch our daily lives, such as computer vision, autonomous driving and Earth observation. Deep learning is typically data driven: little domain knowledge is utilised in training neural network models and determining decision boundaries. The learning process mainly focuses on searching for and optimising network models by gradient descent, without accounting for the inherent discriminative and dependency characteristics of the data distribution. Such a learning paradigm often makes it harder for training to converge, and the resulting network models are therefore difficult to generalise, notably in machine translation. These observations have motivated the recent development of deep Transformer networks.
Transformer networks make extensive use of attention mechanisms to discriminate the representative parts of the data distribution based on contextual information and fade out the rest, thereby devoting more of the learning process to that small but representative part of the data for classification, recognition or prediction tasks. The two most common attention techniques currently in use are dot-product attention, which uses the scalar product between vectors to highlight important parts of the data, and multi-head attention, which combines several attention functions to direct the overall attention of a network or sub-network.
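To make the two techniques concrete, the following is a minimal NumPy sketch of scaled dot-product attention and a simplified multi-head variant. It is illustrative only: function names are ours, and the learned input/output projection matrices used in full Transformer implementations are omitted for brevity, with the heads instead formed by slicing the feature dimension.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V.

    Q: (n_queries, d_k), K: (n_keys, d_k), V: (n_keys, d_v).
    Returns the attended values and the attention weights.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)           # similarity of each query to each key
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability for softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # rows sum to 1
    return weights @ V, weights

def multi_head_attention(Q, K, V, num_heads):
    """Simplified multi-head attention: run dot-product attention on
    feature slices ("heads") and concatenate the results.
    (Real implementations apply learned projections per head.)
    """
    d = Q.shape[-1]
    assert d % num_heads == 0, "feature size must divide evenly into heads"
    head_dim = d // num_heads
    outputs = []
    for h in range(num_heads):
        s = slice(h * head_dim, (h + 1) * head_dim)
        out, _ = scaled_dot_product_attention(Q[:, s], K[:, s], V[:, s])
        outputs.append(out)
    return np.concatenate(outputs, axis=-1)
```

Each row of the returned weight matrix is a probability distribution over the keys, which is what lets the network "highlight" some parts of the data and fade out the rest.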
The proposed project will study existing attention techniques and develop new attention mechanisms to be used as heuristics in a Transformer network model. The project will incorporate one of two application scenarios into the development of attention: 1) detecting abnormal change in images, and 2) anomaly detection in electromagnetic satellite signals, both of which are part of ongoing work by the supervisory team. For the former, the development of an attention mechanism could be inspired by human visual attention, in which humans focus on an object of interest in an image at high resolution while perceiving its surroundings at low resolution over time. For the latter, electromagnetic satellite signals may contain a range of events, and not all parts of the signals are equally relevant to a specific event. This observation could motivate the development of an attention mechanism based on the contextual characteristics of an event to determine the relevant portions of the signal.
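As a sketch of the second scenario, one could score windows of a signal against an event-context query vector and normalise the scores into attention weights, so that high-weight windows mark the portions of the signal relevant to that event. This is a hypothetical illustration: in practice the window embeddings and event query would be learned representations inside a Transformer, not raw signal values.

```python
import numpy as np

def event_relevance_weights(signal_windows, event_query):
    """Attention-style relevance of signal windows to an event.

    signal_windows: (n_windows, d) embeddings of signal segments.
    event_query:    (d,) vector characterising the event context.
    Returns a softmax-normalised weight per window; large weights
    indicate the portions of the signal most relevant to the event.
    """
    d = event_query.size
    scores = signal_windows @ event_query / np.sqrt(d)  # scaled dot-product scores
    scores -= scores.max()                              # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()                      # weights sum to 1
```

The same mechanism could serve the image scenario by replacing signal windows with image patches, mirroring the high-resolution focus versus low-resolution periphery described above.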
The Transformer networks developed may be applicable to various real-world scenarios, such as monitoring abnormal signal changes in Earth observation, detecting faults on assembly lines in manufacturing processes, or detecting health deterioration in the context of healthcare.