Context based position estimation of target of interest in videos
Abstract:
Target tracking in a video is highly challenging because the target may be affected by changes in its appearance over the course of the video, partial occlusions, background clutter, illumination variations, the surrounding environment, and changes in its motion. Embodiments of the present disclosure address this problem by implementing a neural network that generates convolutional feature maps and their gradient maps. The proposed two-class neural network (TCNN) is guided by the target of interest, defined by a bounding box in the first frame of the video. With this target guidance, the TCNN generates a target activation map from the convolutional features and gradient maps. The target activation map gives a tentative location of the target, which is then refined into a precise location using correlation filter(s) and a peak location estimator based on the identified context. This process repeats for every frame of the video to track the target accurately.
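The coarse-to-fine localization step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the disclosed implementation: the TCNN is not modeled, the activation map is assumed to be given, a plain FFT-based cross-correlation with a fixed template stands in for the correlation filter(s), and a simple argmax stands in for the context-based peak location estimator. The function names and the `half` window parameter are hypothetical.

```python
import numpy as np

def tentative_location(activation_map):
    """Return (row, col) of the strongest response in a 2-D activation map."""
    return np.unravel_index(np.argmax(activation_map), activation_map.shape)

def correlation_response(window, template):
    """Circular cross-correlation of a window with a template, computed via FFT.

    The template is zero-padded to the window size. The value at (dr, dc) is
    the correlation score with the template's top-left corner at that offset.
    """
    window_f = np.fft.fft2(window)
    template_f = np.fft.fft2(template, s=window.shape)
    return np.real(np.fft.ifft2(window_f * np.conj(template_f)))

def locate_target(activation_map, frame, template, half=8):
    """Refine the tentative location into a precise one (assumed pipeline).

    Returns the top-left corner of the best template match in frame coordinates.
    """
    r0, c0 = tentative_location(activation_map)
    # Search window around the tentative location, clipped at the frame border.
    top, left = max(r0 - half, 0), max(c0 - half, 0)
    window = frame[top:r0 + half, left:c0 + half]
    response = correlation_response(window, template)
    # Stand-in peak location estimator: take the maximum of the response.
    dr, dc = np.unravel_index(np.argmax(response), response.shape)
    return int(top + dr), int(left + dc)

# Synthetic example: a 3x3 target placed with its top-left corner at (20, 30).
template = np.array([[1., 2., 1.], [2., 4., 2.], [1., 2., 1.]])
frame = np.zeros((64, 64))
frame[20:23, 30:33] = template
activation_map = np.zeros((64, 64))
activation_map[21, 31] = 1.0  # coarse guess near the target centre
result = locate_target(activation_map, frame, template)
print(result)  # (20, 30)
```

In a tracker, this step would run once per frame, with the activation map recomputed for each new frame.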