US 12,243,292 B2
Systems for multi-task joint training of neural networks using multi-label datasets
Shuo Cheng, Los Angeles, CA (US); Wanchun Ma, Los Angeles, CA (US); and Linjie Luo, Los Angeles, CA (US)
Assigned to LEMON INC., Grand Cayman (KY)
Filed by Lemon Inc., Grand Cayman (KY)
Filed on Sep. 2, 2022, as Appl. No. 17/929,449.
Prior Publication US 2024/0078792 A1, Mar. 7, 2024
Int. Cl. G06K 9/62 (2022.01); G06N 3/0455 (2023.01); G06N 3/09 (2023.01); G06V 10/44 (2022.01); G06V 10/764 (2022.01); G06V 10/766 (2022.01); G06V 10/774 (2022.01); G06V 10/776 (2022.01); G06V 10/778 (2022.01); G06V 10/82 (2022.01); G06V 10/96 (2022.01); G06V 40/16 (2022.01)
CPC G06V 10/774 (2022.01) [G06N 3/0455 (2023.01); G06N 3/09 (2023.01); G06V 10/454 (2022.01); G06V 10/764 (2022.01); G06V 10/766 (2022.01); G06V 10/776 (2022.01); G06V 10/778 (2022.01); G06V 10/82 (2022.01); G06V 10/96 (2022.01); G06V 40/171 (2022.01); G06V 40/174 (2022.01)]
20 Claims
 
1. A computer system for multi-task joint training of a neural network including an encoder module and a multi-headed attention mechanism, the computer system comprising:
a processor coupled to a storage medium that stores instructions, which, upon execution by the processor, cause the processor to:
receive input data including a first set of labels and a second set of labels;
using the encoder module, extract features from the input data;
using a first task head of the multi-headed attention mechanism, compute a first training loss metric using the extracted features and the first set of labels;
using a second task head of the multi-headed attention mechanism, compute a second training loss metric using the extracted features and the second set of labels;
apply a first mask to filter the first training loss metric, wherein the first mask is computed based on the first set of labels;
apply a second mask to filter the second training loss metric, wherein the second mask is computed based on the second set of labels; and
compute a final training loss metric based on the filtered first training loss metric and the filtered second training loss metric.
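
The loss-filtering steps recited in claim 1 can be illustrated with a minimal sketch. The claim does not say how the masks are derived from the labels, which loss function each task head uses, or how the filtered losses are combined, so everything below is an assumption made for illustration: a label of `None` marks a sample that has no annotation for a task, each mask is 1 where a label is present and 0 where it is missing, the per-sample loss is a squared error, and the final loss is the sum of the two masked means. The names `MISSING`, `label_mask`, `squared_error`, `masked_mean`, and `joint_loss` are hypothetical and do not come from the patent. The encoder and task heads are omitted; the predictions stand in for their outputs.

```python
from typing import List, Optional, Sequence

# Hypothetical sentinel: a sample with no annotation for a given task.
MISSING = None


def label_mask(labels: Sequence[Optional[float]]) -> List[float]:
    """Mask computed from the labels alone: 1.0 if labeled, 0.0 if missing (assumption)."""
    return [0.0 if y is MISSING else 1.0 for y in labels]


def squared_error(preds: Sequence[float],
                  labels: Sequence[Optional[float]]) -> List[float]:
    """Per-sample loss (assumed squared error); unlabeled samples get a 0.0 placeholder."""
    return [0.0 if y is MISSING else (p - y) ** 2 for p, y in zip(preds, labels)]


def masked_mean(losses: Sequence[float], mask: Sequence[float]) -> float:
    """Filter per-sample losses with the mask and average over the kept samples."""
    kept = sum(mask)
    if kept == 0:
        return 0.0  # no labeled samples for this task in the batch
    return sum(l * m for l, m in zip(losses, mask)) / kept


def joint_loss(preds_a: Sequence[float], labels_a: Sequence[Optional[float]],
               preds_b: Sequence[float], labels_b: Sequence[Optional[float]]) -> float:
    """Final loss as the sum of the two filtered task losses (assumed combination)."""
    loss_a = masked_mean(squared_error(preds_a, labels_a), label_mask(labels_a))
    loss_b = masked_mean(squared_error(preds_b, labels_b), label_mask(labels_b))
    return loss_a + loss_b


# A batch of three samples, each labeled for only one of the two tasks.
total = joint_loss(
    preds_a=[1.0, 2.0, 3.0], labels_a=[1.0, MISSING, 5.0],
    preds_b=[0.5, 0.5, 0.5], labels_b=[MISSING, 1.5, MISSING],
)
print(total)
```

Under these assumptions, a sample that lacks a label for one task contributes nothing to that task's loss, so datasets annotated for different tasks can be mixed in one batch while sharing the same extracted features.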