US 11,989,630 B2
Secure multi-user machine learning on a cloud data platform
Monica J. Holboke, Toronto (CA); Justin Langseth, Kailua, HI (US); Stuart Ozer, Oakland, CA (US); and William L. Stratton, Jr., Atlanta, GA (US)
Assigned to Snowflake Inc., Bozeman, MT (US)
Filed by Snowflake Inc., Bozeman, MT (US)
Filed on Jan. 31, 2023, as Appl. No. 18/162,697.
Application 18/162,697 is a continuation of application No. 18/055,248, filed on Nov. 14, 2022, granted, now 11,893,462.
Application 18/055,248 is a continuation of application No. 17/644,732, filed on Dec. 16, 2021, granted, now 11,501,015.
Application 17/644,732 is a continuation of application No. 17/232,859, filed on Apr. 16, 2021, granted, now 11,216,580.
Claims priority of provisional application 63/160,306, filed on Mar. 12, 2021.
Prior Publication US 2023/0169407 A1, Jun. 1, 2023
Int. Cl. G06N 3/00 (2023.01); G06F 16/25 (2019.01); G06F 16/28 (2019.01); G06F 18/214 (2023.01); G06F 21/62 (2013.01); G06N 20/00 (2019.01)
CPC G06N 20/00 (2019.01) [G06F 16/256 (2019.01); G06F 16/283 (2019.01); G06F 18/214 (2023.01); G06F 21/6227 (2013.01)]
30 Claims
 
1. A method comprising:
providing access to a user defined function to be executed in a database system to a first user of a cloud data platform, the user defined function generated by a second user of the cloud data platform, the user defined function comprising a machine learning model for training on a training dataset;
identifying, by at least one hardware processor, a request from the second user to train the machine learning model on a first training dataset and a second training dataset, the first training dataset being encrypted in the user defined function and the second training dataset including non-overlapping dataset features with the first training dataset;
enabling the user defined function to use training data from both the first user and the second user to train the machine learning model;
preventing training data from the first user from being revealed to the second user;
preventing training data from the second user from being revealed to the first user; and
generating one or more outputs from the trained machine learning model by applying the trained machine learning model on new data.
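
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation and it uses no Snowflake API: `make_training_udf`, `train`, `predict`, and the sample tables are hypothetical names introduced only for illustration. A closure stands in for the user defined function that carries one party's training data, the calling party supplies a dataset whose feature columns do not overlap with it, and only a prediction function is handed back. Real encryption and platform-level isolation are assumed and not modeled; a Python closure is not a security boundary.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]
Table = Dict[str, Row]  # entity id -> named columns


def make_training_udf(embedded: Table, label_col: str) -> Callable[..., Callable[[Row], float]]:
    """Build a hypothetical 'UDF' that closes over one party's training data.

    The closure only stands in for data that the claim describes as encrypted
    inside the user defined function; nothing is actually encrypted here.
    """

    def train(caller_rows: Table, epochs: int = 5000, lr: float = 0.1) -> Callable[[Row], float]:
        # Join the two datasets on the entity ids they share.
        ids = sorted(set(embedded) & set(caller_rows))
        if not ids:
            raise ValueError("no shared entity ids")

        own_cols = sorted(c for c in embedded[ids[0]] if c != label_col)
        caller_cols = sorted(caller_rows[ids[0]])
        # The claim requires non-overlapping features across the two datasets.
        if set(own_cols) & set(caller_cols) or label_col in caller_cols:
            raise ValueError("feature columns must not overlap")

        cols = own_cols + caller_cols
        X: List[List[float]] = [
            [{**embedded[i], **caller_rows[i]}[c] for c in cols] for i in ids
        ]
        y = [embedded[i][label_col] for i in ids]

        # Plain batch gradient descent on a linear model (an assumed model choice).
        w = [0.0] * len(cols)
        b = 0.0
        n = len(ids)
        for _ in range(epochs):
            errs = [sum(wj * xj for wj, xj in zip(w, row)) + b - t for row, t in zip(X, y)]
            for j in range(len(cols)):
                w[j] -= lr * sum(e * row[j] for e, row in zip(errs, X)) / n
            b -= lr * sum(errs) / n

        def predict(row: Row) -> float:
            # Only model outputs are returned; neither party's rows are exposed.
            return sum(wj * row[c] for wj, c in zip(w, cols)) + b

        return predict

    return train


# Hypothetical data: one party holds feature "a" and the label, the other holds "b".
party_one = {
    "u1": {"a": 0.0, "y": 1.0},
    "u2": {"a": 1.0, "y": 3.0},
    "u3": {"a": 0.0, "y": 4.0},
    "u4": {"a": 1.0, "y": 6.0},
}
party_two = {"u1": {"b": 0.0}, "u2": {"b": 0.0}, "u3": {"b": 1.0}, "u4": {"b": 1.0}}

train = make_training_udf(party_one, label_col="y")
predict = train(party_two)
new_output = predict({"a": 2.0, "b": 2.0})  # applying the trained model to new data
```

The sample labels follow y = 2a + 3b + 1, so `new_output` should land close to 11. In this sketch the joined rows exist only inside `train`, which mirrors the two "preventing" steps of the claim in spirit only; an actual deployment would rely on the data platform's access controls.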