Researchers at the Japan Advanced Institute of Science and Technology have integrated biological signals with machine learning methods to enable “emotionally intelligent” AI. Emotional intelligence could lead to more natural human-machine interactions, the researchers say.
The new study was published in the journal IEEE Transactions on Affective Computing.
Achieving Emotional Intelligence
Speech and language recognition technologies, such as those behind Alexa and Siri, are constantly evolving, and adding emotional intelligence could take them to the next level. These systems would not only understand language but also recognize the emotional state of the user and generate more empathetic responses.
“Multimodal sentiment analysis” refers to the group of methods that form the gold standard for AI dialog systems with sentiment detection. These methods automatically analyze a person’s psychological state from their speech, voice color, facial expressions, and posture, and they are fundamental to creating human-centered AI systems. They could lead to the development of an emotionally intelligent AI with “beyond-human capabilities” that understands the user’s sentiment before forming an appropriate response.
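To make the idea concrete, the sketch below shows one common way such methods are set up: features extracted from several observable channels are concatenated into a single representation before a sentiment classifier is trained. All feature names, dimensions, and the synthetic data are illustrative assumptions, not details from the study.

```python
# Minimal sketch of early-fusion multimodal sentiment analysis.
# Feature names, sizes, and data below are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_exchanges = 200  # hypothetical number of dialog exchanges

# Hypothetical per-exchange feature vectors for each observable modality
text_feats    = rng.normal(size=(n_exchanges, 32))  # e.g. language embeddings
face_feats    = rng.normal(size=(n_exchanges, 16))  # e.g. facial expression descriptors
voice_feats   = rng.normal(size=(n_exchanges, 8))   # e.g. prosody / "voice color"
posture_feats = rng.normal(size=(n_exchanges, 4))   # e.g. body pose descriptors

# Early fusion: concatenate all modalities into one feature vector per exchange
X = np.hstack([text_feats, face_feats, voice_feats, posture_feats])
y = rng.integers(0, 2, size=n_exchanges)  # hypothetical positive/negative labels

clf = LogisticRegression(max_iter=1000).fit(X, y)
print("training accuracy on synthetic data:", clf.score(X, y))
```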
Analyzing Unobservable Signals
Current estimation methods focus mostly on observable information and leave out the information contained in unobservable signals, such as physiological signals. These signals hold a lot of valuable data that could improve sentiment estimation.
In the study, physiological signals were added to multimodal sentiment analysis for the first time. The research team included Associate Professor Shogo Okada from the Japan Advanced Institute of Science and Technology (JAIST) and Professor Kazunori Komatani from the Institute of Scientific and Industrial Research at Osaka University.
“Humans are very good at concealing their feelings,” Dr. Okada says. “The internal emotional state of a user is not always accurately reflected by the content of the dialog, but since it is difficult for a person to consciously control their biological signals, such as heart rate, it may be useful to use these for estimating their emotional state. This could make for an AI with sentiment estimation capabilities that are beyond human.”
The team analyzed 2,468 exchanges between a dialog AI and 26 participants, using this data to estimate the level of enjoyment each user experienced during the conversation. Users were then asked to assess how enjoyable or boring they found the conversation. The team used the multimodal dialogue data set “Hazumi1911,” which combines speech recognition, voice color sensors, posture detection, and facial expression recognition with skin potential, a form of physiological response sensing.
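As a rough illustration of what one exchange in such a multimodal corpus might look like once features are extracted, consider the record below. The field names and shapes are assumptions for the sketch and do not reproduce the actual Hazumi1911 schema.

```python
# Hypothetical representation of a single dialog exchange in a multimodal corpus.
from dataclasses import dataclass
import numpy as np

@dataclass
class ExchangeRecord:
    transcript: str               # recognized user utterance
    voice_features: np.ndarray    # e.g. prosodic / "voice color" descriptors
    face_features: np.ndarray     # e.g. facial expression descriptors
    posture_features: np.ndarray  # e.g. body pose descriptors
    skin_potential: np.ndarray    # physiological response samples for the exchange
    self_reported_enjoyment: int  # user's own rating of the exchange

example = ExchangeRecord(
    transcript="That sounds like fun!",
    voice_features=np.zeros(8),
    face_features=np.zeros(16),
    posture_features=np.zeros(4),
    skin_potential=np.zeros(64),
    self_reported_enjoyment=5,
)
print(example.transcript, example.self_reported_enjoyment)
```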
“On comparing all the separate sources of information, the biological signal information proved to be more effective than voice and facial expression,” Dr. Okada continued. “When we combined the language information with biological signal information to estimate the self-assessed internal state while talking with the system, the AI’s performance became comparable to that of a human.”
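The kind of comparison Dr. Okada describes can be sketched as follows: train the same classifier on language features alone, on biological-signal features alone, and on their combination, then compare estimation accuracy. The synthetic data and the simple classifier below are assumptions for illustration, not the model or data used in the study.

```python
# Hedged sketch of comparing unimodal vs. combined feature sets.
# All data here is synthetic; accuracies are meaningless beyond showing the workflow.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 300
lang_feats = rng.normal(size=(n, 32))    # hypothetical language features
bio_feats  = rng.normal(size=(n, 8))     # hypothetical skin-potential features
labels     = rng.integers(0, 2, size=n)  # hypothetical "enjoying vs. not" labels

for name, X in [
    ("language only", lang_feats),
    ("biological only", bio_feats),
    ("language + biological", np.hstack([lang_feats, bio_feats])),
]:
    acc = cross_val_score(RandomForestClassifier(random_state=0), X, labels, cv=5).mean()
    print(f"{name}: mean CV accuracy = {acc:.2f}")
```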
The new findings suggest that detecting physiological signals in humans could lead to highly emotionally intelligent AI-based dialog systems. Such systems could be used to identify and monitor mental illness by sensing changes in daily emotional states. Another possible use case is education, where they could detect whether a learner is interested in a topic or bored, allowing teaching strategies to be adjusted accordingly.