The global voice recognition market is forecast to grow from 10.7 billion U.S. dollars in 2020 to 27.16 billion U.S. dollars by 2026, a compound annual growth rate of 16.8 percent over the period. Moreover, all the major technology manufacturers, including Apple, Google, Xiaomi, Amazon, IBM, and Samsung, are already significant players in producing voice communication devices. As voice-enabled devices become the norm, there is a growing need to decipher the emotional aspects of conversations.
While sentiment scoring has been used to determine users’ primary emotions from these conversations, secondary emotions have largely been ignored: they are far more complex to detect, and they are often masked by the primary emotions. However, including secondary emotions is crucial to building systems that exhibit “human-like” behaviour. In addition to discerning facial expressions, it is imperative to incorporate emotional conversational analysis into the reality technologies used by robots, voicebots, and chatbots.
Using Robert Plutchik’s wheel of emotions, primary and secondary emotions can be evaluated to learn more about the emotional aspects of a voice chat or conversation. Although speech recognition frameworks built on natural-language understanding (NLU) and natural-language processing (NLP) can elucidate user intent, they cannot decipher emotions. This research intends to use voice-over-chat conversations, or conversations captured from voice devices, to provide intelligent conversational analytics for a range of applications, leading to smart notification systems. In particular, such a system could be used to detect vulnerable users needing help who would otherwise not be identified by traditional digital behaviour analysis. Indeed, we identified this important limitation in preliminary research performed as part of an experimental study that can be adapted to reality technologies in everyday activities.
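To make the primary/secondary distinction concrete, Plutchik’s model places eight primary emotions around a wheel, with each secondary emotion (a “primary dyad”) arising as a blend of two adjacent primaries. A minimal sketch of that mapping, which a detection framework could build on, might look like this (the lookup function name is illustrative, not part of the project):

```python
from typing import Optional

# Plutchik's eight primary emotions, listed in wheel order so that
# each adjacent pair blends into a secondary emotion (a "primary dyad").
PRIMARY = ["joy", "trust", "fear", "surprise",
           "sadness", "disgust", "anger", "anticipation"]

# Primary dyads from Plutchik's wheel of emotions.
SECONDARY = {
    frozenset({"joy", "trust"}): "love",
    frozenset({"trust", "fear"}): "submission",
    frozenset({"fear", "surprise"}): "awe",
    frozenset({"surprise", "sadness"}): "disapproval",
    frozenset({"sadness", "disgust"}): "remorse",
    frozenset({"disgust", "anger"}): "contempt",
    frozenset({"anger", "anticipation"}): "aggressiveness",
    frozenset({"anticipation", "joy"}): "optimism",
}

def secondary_emotion(a: str, b: str) -> Optional[str]:
    """Return the secondary emotion blended from two primaries, if any."""
    return SECONDARY.get(frozenset({a, b}))
```

For example, `secondary_emotion("joy", "trust")` returns `"love"`, while non-adjacent pairs such as joy and fear return `None`, since they do not form a primary dyad on the wheel.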
The aim of this project is to design a framework that uses machine learning algorithms on real-time data to determine emotional behaviour, so that it can be used for purposes such as digital profiling, mental health awareness, and detecting threatening, violent, and abusive behaviour.
Successful completion of the project requires addressing the following scientific objectives:
1. Design and build an initial framework using traditional statistics and machine learning techniques (including bag-of-words representations) to decipher primary and secondary emotional behaviour from an individual’s voice
2. Refine the framework by integrating suitable deep learning architectures (such as recurrent neural networks) to increase success rate, especially in terms of secondary emotions
3. Build in a warning system able to detect inconsistencies between user intent and anomalies in emotional behaviour
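As an illustration of the bag-of-words baseline in objective 1, the sketch below classifies a transcribed utterance by comparing its word-count vector against per-emotion centroids. The utterances, labels, and function names are invented for illustration; the project would of course use real transcribed conversation data:

```python
from collections import Counter
import math

def bag_of_words(text):
    """Lower-case, tokenise on whitespace, and count word frequencies."""
    return Counter(text.lower().split())

# Tiny illustrative training set (invented utterances and labels).
TRAIN = [
    ("i am so happy and excited today", "joy"),
    ("this is wonderful great news", "joy"),
    ("i feel so sad and alone", "sadness"),
    ("this is terrible i want to cry", "sadness"),
]

def train_centroids(examples):
    """Sum bag-of-words vectors per emotion label to form centroids."""
    centroids = {}
    for text, label in examples:
        centroids.setdefault(label, Counter()).update(bag_of_words(text))
    return centroids

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def classify(text, centroids):
    """Return the emotion whose centroid is most similar to the text."""
    bow = bag_of_words(text)
    return max(centroids, key=lambda label: cosine(bow, centroids[label]))
```

Here `classify("i am so sad", train_centroids(TRAIN))` yields `"sadness"`. Objective 2 would replace this word-count representation with learned sequence models (such as recurrent neural networks), which can capture word order and context that a bag-of-words model discards.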
Applicants should have at least an Honours Degree at 2.1 or above (or equivalent) in Computer Science or a related discipline. In addition, they should have excellent programming skills in Python and an interest in machine learning.