Publication No.: US20240112330A1
Publication Date: 2024-04-04
Application No.: US18220743
Filing Date: 2023-07-11
Applicant: NORTHWESTERN POLYTECHNICAL UNIVERSITY
Inventor: CHEN XIA, HEXU CHEN, JUNWEI HAN, LEI GUO, KUAN LI, CHI ZHANG, ZHIHONG XU
IPC: G06T7/00, A61B5/16, G06T7/73, G06T7/80, G06V10/26, G06V10/77, G06V10/80, G06V10/82, G06V20/40, G06V40/16, G06V40/18, G06V40/20
CPC classification number: G06T7/0012, A61B5/163, A61B5/168, G06T7/74, G06T7/80, G06V10/26, G06V10/7715, G06V10/811, G06V10/82, G06V20/46, G06V40/171, G06V40/176, G06V40/193, G06V40/20, A61B2503/06, G06T2207/10016, G06T2207/20081, G06T2207/20084, G06T2207/30041, G06T2207/30201
Abstract: The present invention relates to a method for screening mobile terminal visual attention abnormalities in children based on multimodal data learning. A calibration video and a testing video are prepared, and head-face videos of children watching each video on a smartphone are recorded. An eye-tracking estimation model is constructed to predict, frame by frame, the fixation point location from the head-face video corresponding to the testing video and to extract eye-tracking features. Facial expression features and head posture features are also extracted. A Long Short-Term Memory (LSTM) network fuses the different modal features and realizes the mapping from multimodal features to category labels. In the testing stage, a head-face video of a child to be screened watching the videos on a smartphone is recorded, and the extracted features are input into the trained model to determine whether the child's visual attention is abnormal.
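To illustrate the fusion step the abstract describes, the following is a minimal sketch, not the patented implementation: per-frame eye-tracking, facial expression, and head posture features are concatenated and fed to an LSTM whose final hidden state is mapped to a normal/abnormal label. All feature dimensions, class names, and the concatenation-based fusion strategy are illustrative assumptions; the patent does not disclose these specifics here.

```python
import torch
import torch.nn as nn

class MultimodalLSTMClassifier(nn.Module):
    """Hypothetical LSTM fusion of per-frame multimodal features."""

    def __init__(self, eye_dim=2, face_dim=64, head_dim=3,
                 hidden_dim=128, num_classes=2):
        super().__init__()
        # Fuse the three per-frame feature vectors by simple concatenation
        # (one of several plausible fusion strategies).
        self.lstm = nn.LSTM(eye_dim + face_dim + head_dim,
                            hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, eye_feats, face_feats, head_feats):
        # Each input: (batch, num_frames, feature_dim), aligned frame by frame.
        fused = torch.cat([eye_feats, face_feats, head_feats], dim=-1)
        _, (h_n, _) = self.lstm(fused)
        # Map the last hidden state to the category label.
        return self.classifier(h_n[-1])

# Illustrative usage on a 300-frame clip (dimensions are assumptions).
model = MultimodalLSTMClassifier()
eye = torch.randn(1, 300, 2)     # fixation-point coordinates per frame
face = torch.randn(1, 300, 64)   # facial expression features per frame
head = torch.randn(1, 300, 3)    # head posture angles per frame
logits = model(eye, face, head)  # shape (1, 2): normal vs. abnormal
```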