US 11,809,429 B2
Method for processing model parameters, and apparatus
Cheng Chen, Beijing (CN); Peng Zhao, Beijing (CN); Di Wu, Beijing (CN); Junyuan Xie, Beijing (CN); Chenliaohui Fang, Beijing (CN); Longyijia Li, Beijing (CN); Long Huang, Beijing (CN); Liangchao Wu, Beijing (CN); Long Chang, Beijing (CN); Lizhe Zhang, Beijing (CN); Yixiang Chen, Beijing (CN); and Xiaobing Liu, Beijing (CN)
Assigned to BEIJING BYTEDANCE NETWORK TECHNOLOGY CO., LTD., Beijing (CN)
Filed by BEIJING BYTEDANCE NETWORK TECHNOLOGY CO., LTD., Beijing (CN)
Filed on Aug. 12, 2022, as Appl. No. 17/886,746.
Application 17/886,746 is a continuation of application No. PCT/CN2021/080876, filed on Mar. 15, 2021.
Claims priority of application No. 202010269954.8 (CN), filed on Apr. 8, 2020.
Prior Publication US 2023/0023253 A1, Jan. 26, 2023
Int. Cl. G06F 7/00 (2006.01); G06F 16/00 (2019.01); G06F 16/2455 (2019.01); G06F 9/54 (2006.01)
CPC G06F 16/24552 (2019.01) [G06F 9/547 (2013.01)]
15 Claims
 
1. A method of optimizing parameter storage of large-scale feature embedding for machine learning models, comprising:
obtaining a model parameter set, wherein the model parameter set comprises a multi-dimensional array corresponding to a feature embedding, wherein the obtaining a model parameter set further comprises:
obtaining a model file,
performing analysis on the model file to obtain an analysis graph corresponding to the model file, wherein the analysis graph comprises variables and operations on the variables,
extracting a variable from the analysis graph as a target variable based on a preset field, and
obtaining the model parameter set based on the feature embedding corresponding to the target variable, wherein the obtaining the model parameter set based on the feature embedding corresponding to the target variable further comprises:
obtaining a data amount of the multi-dimensional array corresponding to the feature embedding corresponding to the target variable, and
in response to determining that the obtained data amount is greater than a preset data amount threshold, obtaining the multi-dimensional array corresponding to the feature embedding corresponding to the target variable as the model parameter set;
obtaining attribute information of a storage system storing the model parameter set, wherein the storage system storing the model parameter set is different from a system on which a model corresponding to the model parameter set operates; and
storing the model parameter set in the storage system based on the attribute information.
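
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything concrete in it is an assumption made for illustration: the JSON model-file layout, the `AnalysisGraph` and `ShardedStore` classes, the use of a substring match on the variable name as the "preset field", the 4-byte element size, and the choice of a shard count as the storage system's "attribute information". The claim itself does not specify any of these.

```python
import json
from dataclasses import dataclass, field

# Hypothetical model file: a JSON list of graph nodes (variables and operations).
MODEL_FILE_TEXT = json.dumps({
    "nodes": [
        {"name": "user_embedding", "kind": "variable",
         "value": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]},
        {"name": "tiny_embedding", "kind": "variable", "value": [[0.9, 1.0]]},
        {"name": "dense_bias", "kind": "variable", "value": [[0.0, 0.0]]},
        {"name": "lookup", "kind": "operation", "inputs": ["user_embedding"]},
    ]
})

BYTES_PER_ELEMENT = 4  # assumption: float32 elements


@dataclass
class AnalysisGraph:
    """Variables and the operations on them, as obtained by analyzing the file."""
    variables: dict = field(default_factory=dict)   # name -> 2-D list
    operations: list = field(default_factory=list)  # (op name, input names)


def parse_model_file(text):
    """Analyze the model file into an analysis graph."""
    graph = AnalysisGraph()
    for node in json.loads(text)["nodes"]:
        if node["kind"] == "variable":
            graph.variables[node["name"]] = node["value"]
        else:
            graph.operations.append((node["name"], node["inputs"]))
    return graph


def data_amount(array):
    """Data amount of a 2-D array in bytes."""
    rows = len(array)
    cols = len(array[0]) if rows else 0
    return rows * cols * BYTES_PER_ELEMENT


def select_parameter_sets(graph, preset_field, threshold_bytes):
    """Keep target variables whose array exceeds the data amount threshold."""
    return {
        name: array
        for name, array in graph.variables.items()
        if preset_field in name and data_amount(array) > threshold_bytes
    }


@dataclass
class ShardedStore:
    """Stand-in for a storage system separate from the system running the model."""
    num_shards: int
    shards: list = field(init=False)

    def __post_init__(self):
        self.shards = [{} for _ in range(self.num_shards)]

    def attributes(self):
        # Assumed attribute information: only the shard count.
        return {"num_shards": self.num_shards}

    def put(self, shard, key, value):
        self.shards[shard][key] = value


def store_parameter_set(store, name, array):
    """Place each embedding row on a shard chosen from the store's attributes."""
    num_shards = store.attributes()["num_shards"]
    for row_id, row in enumerate(array):
        store.put(row_id % num_shards, (name, row_id), row)


graph = parse_model_file(MODEL_FILE_TEXT)
selected = select_parameter_sets(graph, "embedding", threshold_bytes=16)
store = ShardedStore(num_shards=2)
for name, array in selected.items():
    store_parameter_set(store, name, array)
```

In this sketch only `user_embedding` (32 bytes) passes both the field match and the 16-byte threshold; `tiny_embedding` matches the field but is too small, and `dense_bias` does not match the field. Row-modulo sharding is just one simple way to make storage depend on an attribute of the store.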