US 11,954,521 B2
Deep learning job scheduling method and system and related device
Jian Lin, Shenzhen (CN); Jie Yang, Hangzhou (CN); and Sibao Hong, Hangzhou (CN)
Assigned to HUAWEI CLOUD COMPUTING TECHNOLOGIES CO., LTD., Guizhou (CN)
Filed by Huawei Cloud Computing Technologies Co., Ltd., Guizhou (CN)
Filed on Sep. 30, 2020, as Appl. No. 17/038,720.
Application 17/038,720 is a continuation of application No. PCT/CN2019/078533, filed on Mar. 18, 2019.
Claims priority of application No. 201810276336.9 (CN), filed on Mar. 30, 2018; and application No. 201810414039.6 (CN), filed on May 2, 2018.
Prior Publication US 2021/0011762 A1, Jan. 14, 2021
Int. Cl. G06F 9/48 (2006.01); G06F 9/50 (2006.01); G06N 3/08 (2023.01); G06N 20/00 (2019.01)
CPC G06F 9/4881 (2013.01) [G06F 9/5083 (2013.01); G06N 3/08 (2013.01); G06N 20/00 (2019.01)]
20 Claims
 
1. A deep learning job scheduling method, comprising:
obtaining a job request of a deep learning job comprising a deep learning library type and a job type;
determining a target job description file template from a plurality of pre-stored job description file templates based on the deep learning library type and the job type;
determining an identifier of a target job basic image from identifiers of a plurality of pre-stored job basic images based on the deep learning library type and the job type, wherein the pre-stored job basic images comprise an image of a deep learning library, an image of a dependency library, and an image of a deep learning program;
generating a target job description file based on the target job description file template and the identifier;
sending the target job description file to a container scheduler;
selecting, by the container scheduler, the target job basic image from the pre-stored job basic images based on the target job description file; and
creating a container for executing the job request.
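
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the template layout, the image identifiers, the in-memory image store, and every name below (`TEMPLATES`, `IMAGE_IDS`, `IMAGE_STORE`, `JobRequest`, `generate_description`, `ContainerScheduler`, `schedule`) are hypothetical and chosen only for illustration. A real system would presumably hand the description file to an actual container scheduler rather than the stand-in class used here.

```python
from dataclasses import dataclass
from string import Template

# Hypothetical pre-stored job description file templates, keyed by
# (deep learning library type, job type). The layout is an assumption.
TEMPLATES = {
    ("tensorflow", "training"): Template(
        "kind: Job\nname: $name\nimage: $image\nrole: training\n"
    ),
    ("mxnet", "inference"): Template(
        "kind: Service\nname: $name\nimage: $image\nrole: inference\n"
    ),
}

# Hypothetical identifiers of pre-stored job basic images, same keys.
IMAGE_IDS = {
    ("tensorflow", "training"): "tf-train-base:1",
    ("mxnet", "inference"): "mxnet-infer-base:1",
}

# Hypothetical image store. Each basic image bundles a deep learning
# library, a dependency library, and a deep learning program.
IMAGE_STORE = {
    "tf-train-base:1": {
        "library": "tensorflow", "dependencies": ["cuda"], "program": "train.py",
    },
    "mxnet-infer-base:1": {
        "library": "mxnet", "dependencies": ["cuda"], "program": "serve.py",
    },
}


@dataclass
class JobRequest:
    """Job request carrying the library type and the job type."""
    name: str
    library_type: str
    job_type: str


def generate_description(request: JobRequest) -> str:
    """Pick the target template and image identifier, then fill the template."""
    key = (request.library_type, request.job_type)
    template = TEMPLATES[key]   # target job description file template
    image_id = IMAGE_IDS[key]   # identifier of the target job basic image
    return template.substitute(name=request.name, image=image_id)


class ContainerScheduler:
    """Stand-in for a container scheduler (assumption, not a real API)."""

    def __init__(self, image_store: dict):
        self.image_store = image_store
        self.containers = []

    def submit(self, description: str) -> dict:
        # Parse the "key: value" lines of the description file.
        fields = dict(
            line.split(": ", 1) for line in description.splitlines() if line
        )
        # Select the target basic image named in the description file.
        image = self.image_store[fields["image"]]
        # "Create" a container record for executing the job request.
        container = {
            "name": fields["name"],
            "image_id": fields["image"],
            "image": image,
        }
        self.containers.append(container)
        return container


def schedule(request: JobRequest, scheduler: ContainerScheduler) -> dict:
    """Generate the description file and send it to the scheduler."""
    return scheduler.submit(generate_description(request))
```

For example, `schedule(JobRequest("job-1", "tensorflow", "training"), ContainerScheduler(IMAGE_STORE))` returns a container record whose `image_id` is the identifier looked up from the library type and job type. Keying both lookup tables on the same pair mirrors the claim language, in which the template and the image identifier are each determined from the deep learning library type and the job type.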