A new machine-learning tool detects eye contact during recorded face-to-face interactions as accurately as expert observers can. The tool could help researchers and clinicians measure eye contact efficiently and objectively.
Eye contact is key to social interactions, and a tendency to avoid it can be one of the earliest signs of autism.
Researchers and clinicians often measure a child’s use of eye contact by manually noting instances of it during recorded interactions with an adult, a method that is time-consuming and subjective. Existing computer models can detect eye contact in videos more quickly, but until now they have been less accurate than human raters.
The new computer model, described in December in Nature Communications, is the first to achieve an accuracy comparable to that of expert observers — a feat made possible by using ‘deep learning,’ a subset of machine learning.
The team initially trained an algorithm to perform tasks related to recognizing eye contact, such as detecting the position of a person’s head and the direction of their gaze, using three public datasets. This initial step, a form of ‘transfer learning,’ improves the final algorithm’s accuracy, the researchers say.
The researchers then fine-tuned the algorithm to detect eye contact specifically, using more than 4 million frames from recorded interactions, about twice as many frames as were used to train previous algorithms. The interactions involved 121 children aged 18 months to 5 years, 66 of them with autism, and an adult who wore a pair of glasses with an embedded camera to record the child’s face.
During each interaction, the child participated in two play-based assessments designed to elicit eye contact. In one test, for example, the examiner gives the child a wind-up toy only when the child makes eye contact.
A group of 10 trained raters watched videos of the interactions frame by frame and identified the frames in which the child appeared to be looking at the examiner’s eyes. The researchers trained the algorithm using videos of 103 children, each of which was scored by one rater. They then tested it on videos of the remaining 18 children, which were analyzed by multiple raters. The team judged a rater’s or the algorithm’s assessment of a frame to be correct when it matched the judgment of the majority of raters.
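The scoring rule described above can be illustrated with a minimal sketch. This is not the researchers’ code: the function names, the toy data and the choice to skip tied frames are all assumptions made for illustration, and the article does not say whether a rater’s own vote counts toward the majority used to judge that rater (here it does).

```python
def majority_label(votes):
    """Return 1 or 0 if a strict majority of raters chose it, else None (tie)."""
    yes = sum(votes)
    no = len(votes) - yes
    if yes == no:
        return None
    return 1 if yes > no else 0


def accuracy(labels, consensus):
    """Fraction of frames with a consensus where the labels match it."""
    scored = [(l, c) for l, c in zip(labels, consensus) if c is not None]
    return sum(l == c for l, c in scored) / len(scored)


# Hypothetical toy data: 5 frames, each scored by 3 raters (1 = eye contact).
frame_votes = [(1, 1, 0), (0, 0, 0), (1, 0, 1), (0, 1, 0), (1, 1, 1)]
# Hypothetical algorithm output for the same 5 frames.
algorithm_labels = [1, 0, 0, 0, 1]

# Majority vote per frame serves as the reference label.
consensus = [majority_label(v) for v in frame_votes]
algorithm_accuracy = accuracy(algorithm_labels, consensus)

# Score each rater the same way, then average across raters.
rater_accuracies = [
    accuracy([votes[r] for votes in frame_votes], consensus)
    for r in range(3)
]
mean_rater_accuracy = sum(rater_accuracies) / len(rater_accuracies)

print(consensus, algorithm_accuracy, mean_rater_accuracy)
```

In this toy case the algorithm and the average rater happen to score the same, which is the kind of comparison the study reports, though on real data with far more frames.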
Comparing the raters to the algorithm revealed that the latter is as accurate as the average rater. The algorithm also catches the vast majority of frames in which eye contact occurs.
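How many true eye-contact frames a detector catches is usually called recall (or sensitivity), and it is often reported alongside precision, the share of flagged frames that are correct. The sketch below shows how the two are computed. The labels are invented toy values, not the study’s data, and the article does not say which exact metrics the team reported.

```python
def precision_recall(predicted, reference):
    """Compute precision and recall for binary labels (1 = eye contact)."""
    pairs = list(zip(predicted, reference))
    true_pos = sum(p == 1 and r == 1 for p, r in pairs)
    false_pos = sum(p == 1 and r == 0 for p, r in pairs)
    false_neg = sum(p == 0 and r == 1 for p, r in pairs)
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


# Hypothetical reference labels (e.g. rater consensus) for 8 frames.
reference_labels = [1, 1, 1, 0, 0, 0, 1, 0]
# Hypothetical algorithm output: misses one frame, flags one extra.
predicted_labels = [1, 1, 0, 0, 1, 0, 1, 0]

precision, recall = precision_recall(predicted_labels, reference_labels)
print(precision, recall)
```

Here the detector catches 3 of the 4 frames that contain eye contact, for a recall of 0.75, and 3 of its 4 positive calls are correct, for a precision of 0.75. This simple version assumes at least one positive prediction and one positive reference frame; otherwise the divisions would fail.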
The team also used the algorithm to replicate findings from two of their previous studies, in which trained raters had assessed eye contact in videos of autistic and non-autistic children and adolescents. The researchers were able to reproduce all of their original findings, statistical tests show, which suggests the computer model is suitable for research purposes.
“It actually provides the same quality of evidence for a scientific hypothesis as you got when you did it manually,” says James Rehg, professor of interactive computing at the Georgia Institute of Technology in Atlanta, who led the research.
The algorithm could be used in a variety of settings, such as measuring changes in eye contact in response to therapy, the researchers say. Code for the algorithm is available online.