| CPC G06V 10/82 (2022.01) [G06N 3/0455 (2023.01)] | 16 Claims |

|
1. A method for determining an architecture of an encoder of a convolutional neural network, the neural network being configured to process multiple different image processing tasks, the method comprising:
for each image processing task, calculating a characteristic scale distribution based on training data, the characteristic scale distribution indicating a size distribution of objects to be detected by the respective image processing task;
generating multiple encoder architecture candidates, each encoder architecture of the encoder architecture candidates comprising at least one shared encoder layer which provides computational operations for multiple image processing tasks and multiple branches which span over one or more encoder layers which provide at least partly different computational operations for the image processing tasks, wherein each branch is associated with a certain image processing task;
calculating receptive field sizes of the encoder layers of the multiple encoder architectures;
calculating multiple assessment measures, each assessment measure referring to a combination of a certain encoder architecture of the multiple encoder architectures and a certain image processing task, each assessment measure including information regarding the quality of matching of characteristic scale distribution of the image processing task associated with the assessment measure to the receptive field sizes of the encoder layers of the encoder architecture associated with the assessment measure;
comparing the calculated assessment measures and establishing a comparison result; and
selecting an encoder architecture based on the comparison result.
|
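For illustration only, and not part of the claim language: the step of calculating receptive field sizes of the encoder layers can be sketched with the standard recurrence for stacked convolutions, in which the receptive field grows by the effective kernel extent multiplied by the cumulative stride of the preceding layers. The `ConvLayer` record and the `receptive_field_sizes` function are hypothetical names introduced here; the claim does not prescribe any particular formula or data structure.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ConvLayer:
    """Hypothetical description of one encoder layer (an assumption for this sketch)."""
    kernel: int
    stride: int = 1
    dilation: int = 1


def receptive_field_sizes(layers: List[ConvLayer]) -> List[int]:
    """Return the receptive field size (in input pixels) after each layer.

    Uses the common recurrence
        rf_l   = rf_{l-1} + (k_eff_l - 1) * jump_{l-1}
        jump_l = jump_{l-1} * stride_l
    where k_eff = dilation * (kernel - 1) + 1.
    """
    sizes = []
    rf, jump = 1, 1  # a single input pixel sees itself; step between pixels is 1
    for layer in layers:
        k_eff = layer.dilation * (layer.kernel - 1) + 1
        rf += (k_eff - 1) * jump
        jump *= layer.stride
        sizes.append(rf)
    return sizes


# Example: a shared 3x3 layer followed by a two-layer task branch.
shared = [ConvLayer(kernel=3)]
branch = [ConvLayer(kernel=3, stride=2), ConvLayer(kernel=3)]
branch_rf = receptive_field_sizes(shared + branch)
```

Because a branch sits on top of the shared layers, the sketch evaluates the shared layers and the branch layers as one sequence, so that each task sees the receptive fields of the full path from the input to the end of its branch.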
|
15. A computer program product for determining the architecture of an encoder of a convolutional neural network, the computer program product comprising a non-transitory computer readable storage medium having program instructions embodied therewith, the program instructions being executable by a processor to cause the processor to execute a method comprising:
for each image processing task, calculating a characteristic scale distribution based on training data, the characteristic scale distribution indicating a size distribution of objects to be detected by the respective image processing task;
generating multiple encoder architecture candidates, each encoder architecture of the encoder architecture candidates comprising at least one shared encoder layer which provides computational operations for multiple image processing tasks and multiple branches which span over one or more encoder layers which provide at least partly different computational operations for the image processing tasks, wherein each branch is associated with a certain image processing task;
calculating receptive field sizes of the encoder layers of the multiple encoder architectures;
calculating multiple assessment measures, each assessment measure referring to a combination of a certain encoder architecture of the multiple encoder architectures and a certain image processing task, each assessment measure including information regarding the quality of matching of characteristic scale distribution of the image processing task associated with the assessment measure to the receptive field sizes of the encoder layers of the encoder architecture associated with the assessment measure;
comparing the calculated assessment measures and establishing a comparison result; and
selecting an encoder architecture based on the comparison result.
|
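For illustration only, and not part of the claim language: the remaining steps (characteristic scale distribution, assessment measure per architecture and task, comparison, selection) could be sketched as follows. The claims do not define how the assessment measure is computed, so the measure used here, a probability-weighted log2 distance between each object size and the nearest receptive field size, is purely an assumption, as are the function names `scale_distribution`, `mismatch`, and `select_architecture` and the example numbers.

```python
import math
from collections import Counter
from typing import Dict, List


def scale_distribution(object_sizes: List[float], bin_centers: List[float]) -> Dict[float, float]:
    """Normalized histogram of object sizes, each size assigned to its nearest bin center."""
    if not object_sizes:
        raise ValueError("need at least one object size from the training data")
    counts = Counter(min(bin_centers, key=lambda c: abs(c - s)) for s in object_sizes)
    return {c: counts[c] / len(object_sizes) for c in bin_centers}


def mismatch(distribution: Dict[float, float], rf_sizes: List[int]) -> float:
    """Assumed assessment measure: expected log2 gap to the closest receptive field.

    Lower values mean the receptive fields match the object scales better.
    """
    return sum(
        p * min(abs(math.log2(rf / size)) for rf in rf_sizes)
        for size, p in distribution.items()
    )


def select_architecture(
    candidates: Dict[str, Dict[str, List[int]]],
    distributions: Dict[str, Dict[float, float]],
) -> str:
    """Compare summed per-task mismatches and return the best candidate's name."""
    totals = {
        name: sum(mismatch(distributions[task], rfs) for task, rfs in per_task.items())
        for name, per_task in candidates.items()
    }
    return min(totals, key=totals.get)


# Hypothetical example: two tasks, two candidate encoder architectures.
distributions = {
    "t1": scale_distribution([8, 9, 30, 33], bin_centers=[8, 32]),
    "t2": {16: 1.0},
}
candidates = {
    "A": {"t1": [8, 32], "t2": [16]},  # receptive field sizes along each task's path
    "B": {"t1": [16], "t2": [16]},
}
best = select_architecture(candidates, distributions)
```

Summing the per-task mismatches is one simple way to establish a comparison result; a weighted sum or a worst-case (maximum) criterion would fit the same claimed steps.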