US 12,148,248 B2
Ensemble deep learning method for identifying unsafe behaviors of operators in maritime working environment
Xinqiang Chen, Shanghai (CN); Zichuang Wang, Shanghai (CN); Yongsheng Yang, Shanghai (CN); Bing Han, Shanghai (CN); Zhongdai Wu, Shanghai (CN); Chenxin Wei, Shanghai (CN); Huafeng Wu, Shanghai (CN); and Yang Sun, Shanghai (CN)
Filed by Shanghai Maritime University, Shanghai (CN); and Shanghai Ship and Shipping Research Institute, Shanghai (CN)
Filed on May 18, 2022, as Appl. No. 17/747,946.
Prior Publication US 2023/0222841 A1, Jul. 13, 2023
Int. Cl. G06V 40/20 (2022.01); G06V 10/62 (2022.01); G06V 10/77 (2022.01); G06V 10/80 (2022.01); G06V 10/82 (2022.01); G06V 20/40 (2022.01); G06V 20/52 (2022.01)
CPC G06V 40/20 (2022.01) [G06V 10/62 (2022.01); G06V 10/7715 (2022.01); G06V 10/806 (2022.01); G06V 10/82 (2022.01); G06V 20/46 (2022.01); G06V 20/49 (2022.01); G06V 20/52 (2022.01)] 5 Claims
1. An ensemble deep learning method for identifying unsafe behaviors of operators in a maritime working environment, comprising the following steps:
(1) extracting features of maritime images: inputting a surveillance video with one or a multitude of the operators operating a multitude of devices in a maritime working environment, and decomposing the surveillance video into the maritime images, extracting the features from the maritime images based on a YOLO V3 detection model, and then constructing a feature pyramid structure within the YOLO V3 detection model;
(2) retrieving spatial-temporal interaction information of the operators and the devices: extracting instance-level features of the operators and the devices in the maritime images with a JDE paradigm, and meanwhile storing time memory features of the operators in the maritime images;
(3) updating a feature memory pool: transferring spatial-temporal interaction information linked to unsafe behaviors of the operators into the feature memory pool, and then updating the time memory features of the operators in the maritime images with an asynchronous memory updating algorithm;
(4) identifying the unsafe behaviors from the maritime images: establishing a spatial-temporal interaction relationship among the operators, the devices, and the unsafe behaviors by building an asynchronous interaction aggregation network, and identifying the unsafe behaviors in the maritime images with the spatial-temporal interaction relationship;
wherein step (2) comprises the following steps:
(2.1) predicting prediction head information of the maritime images with the JDE paradigm: modeling a united learning function of the prediction head information as a multi-task learning problem with the JDE paradigm, and expressing the united learning function as a weighted linear sum of component losses, the components comprising classification information, regression information, and embedding information; and predicting the prediction head information with the united learning function shown in Eq. (3):
$L_{unite} = \sum_{k=1}^{M} \sum_{j=a,b,c} w_j^k L_j^k$  (3)
wherein $M$ denotes the number of types of the prediction head information, $L_j^k$ ($k = 1, \ldots, M$; $j = a, b, c$) is the loss function corresponding to each type of component, and $w_j^k$ ($k = 1, \ldots, M$; $j = a, b, c$) is the weight coefficient of that loss function;
(2.2) learning a loss weight automatically: determining the loss weight automatically based on task-dependence uncertainty, and then obtaining the classification information, the regression information, and the embedding information, the function for learning the loss weight automatically being expressed as Eq. (4):

[Eq. (4): equation image not reproduced in this text]  (4)
wherein $r_j^k$ ($k = 1, \ldots, M$; $j = a, b, c$) represents the task-dependence uncertainty of each loss function;
(2.3) retrieving the spatial-temporal interaction information in the maritime images: splitting a maritime surveillance video into video clips, extracting the instance-level features of the operators and the devices in the maritime images with the united learning function, and storing the time memory features of the operators; and retrieving the spatial-temporal interaction information from the maritime images, the spatial-temporal interaction information containing the instance-level features and the time memory features.
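
For illustration only, the weighted linear loss sum of Eq. (3) in step (2.1) can be sketched in a few lines of Python. The function names, the dictionary layout, and the reading of the component indices a, b, c as classification, regression, and embedding (in the order the claim lists them) are assumptions made for this sketch and are not taken from the patent. Because Eq. (4) is not reproduced in this text, the second function uses the uncertainty-based weighting form that is common in multi-task learning, 0.5 * (exp(-r) * L + r), as an assumed stand-in for step (2.2). It should not be read as the claimed formula.

```python
import math

# Assumed mapping of component indices (hypothetical for this sketch):
# a = classification, b = regression, c = embedding.
COMPONENTS = ("a", "b", "c")


def united_loss(losses, weights):
    """Weighted linear loss sum in the shape of Eq. (3).

    losses[k][j] and weights[k][j] hold L_j^k and w_j^k for prediction
    head k (0-based here) and component j in {a, b, c}.
    """
    return sum(
        weights[k][j] * losses[k][j]
        for k in range(len(losses))
        for j in COMPONENTS
    )


def uncertainty_weighted_loss(losses, r):
    """ASSUMED stand-in for Eq. (4), which is not reproduced in this text.

    Uses the common uncertainty-weighting form 0.5 * (exp(-r) * L + r),
    where r[k][j] plays the role of the task-dependence uncertainty r_j^k.
    """
    return sum(
        0.5 * (math.exp(-r[k][j]) * losses[k][j] + r[k][j])
        for k in range(len(losses))
        for j in COMPONENTS
    )


if __name__ == "__main__":
    # Two prediction heads (M = 2) with made-up loss values.
    losses = [
        {"a": 1.0, "b": 2.0, "c": 3.0},
        {"a": 0.5, "b": 0.5, "c": 1.0},
    ]
    fixed_weights = [{j: 1.0 for j in COMPONENTS} for _ in losses]
    zero_uncertainty = [{j: 0.0 for j in COMPONENTS} for _ in losses]
    print(united_loss(losses, fixed_weights))
    print(uncertainty_weighted_loss(losses, zero_uncertainty))
```

In a training setting the values in `r` would be learnable parameters optimized together with the network, so that the loss weights need not be tuned by hand. Under the assumed form, a larger `r` down-weights its loss term, while the additive `r` penalizes growing the uncertainty without bound.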