US 12,436,973 B2
Data tagging and prompt generation system
Jing-tao Li, Shaanxi (CN); Jian Song, Xi'an (CN); Ming Yan, Shaanxi (CN); Jingyuan Li, Xi'an (CN); and Bo Dang, Xi'an (CN)
Assigned to SAP SE, Walldorf (DE)
Filed by SAP SE, Walldorf (DE)
Filed on Oct. 25, 2023, as Appl. No. 18/383,557.
Prior Publication US 2025/0139128 A1, May 1, 2025
Int. Cl. G06F 16/00 (2019.01); G06F 16/2458 (2019.01); G06F 16/28 (2019.01)
CPC G06F 16/285 (2019.01) [G06F 16/2462 (2019.01)]
20 Claims
 
1. A method comprising:
receiving, by one or more processors, input data comprising data to be tagged by a language model;
identifying metadata associated with the input data, wherein the metadata comprises a name by which to refer to the input data;
generating one or more statistics based on the input data, the one or more statistics comprising a total number of data items in the input data;
calculating a sample size for the input data based on the one or more statistics, wherein the sample size is less than the total number of data items in the input data;
extracting a sample of the input data in accordance with the sample size, wherein the sample of the input data comprises a subset of the input data;
generating a prompt based on a prompt template, the prompt template comprising an input segment comprising the metadata and the sample of the input data, and an output segment identifying a format for an output;
providing the prompt to the language model configured to generate one or more tags based on the sample of the input data, and tag the input data with the one or more tags in accordance with the prompt;
receiving the output comprising tagged input data which was tagged with one or more tags generated based on the sample of the input data and in accordance with the format, wherein the tagged input data includes a semantic meaning or semantic context of the input data;
storing the tagged input data in a database;
executing a query against the tagged input data stored in the database; and
returning a result of the query.
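
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the sample-size rule (integer square root of the item count, capped at 20), the prompt wording, the JSON output format, the table layout, and every function name are assumptions made for illustration, and stub_language_model is a stand-in that returns a fixed answer rather than calling a real language model. For brevity the sketch stores the generated tags against the metadata name instead of against each data item.

```python
import json
import math
import random
import sqlite3


def compute_statistics(items):
    """Statistics for the input data; claim 1 requires at least the total count."""
    return {"total": len(items)}


def calculate_sample_size(stats, cap=20):
    """Assumed rule: integer square root of the total, capped, always below the total."""
    total = stats["total"]
    if total < 2:
        raise ValueError("need at least two items so the sample is smaller than the input")
    return max(1, min(cap, math.isqrt(total), total - 1))


def extract_sample(items, size, seed=0):
    """Draw a random subset of the requested size (seeded so runs are repeatable)."""
    return random.Random(seed).sample(items, size)


# Assumed template: an input segment (metadata name + sample) and an output
# segment that states the expected output format.
PROMPT_TEMPLATE = (
    "Input:\n"
    "Name: {name}\n"
    "Sample values: {sample}\n"
    "Output:\n"
    'Return JSON of the form {{"name": "<name>", "tags": ["<tag>", ...]}}.'
)


def generate_prompt(name, sample):
    """Fill the template with the metadata name and the extracted sample."""
    return PROMPT_TEMPLATE.format(name=name, sample=json.dumps(sample))


def stub_language_model(prompt):
    """Hypothetical stand-in for a language model; echoes the name with fixed tags."""
    name = next(
        line[len("Name: "):]
        for line in prompt.splitlines()
        if line.startswith("Name: ")
    )
    return json.dumps({"name": name, "tags": ["location", "city"]})


def tag_and_store(conn, name, items, model=stub_language_model):
    """Run the pipeline: statistics, sample size, sample, prompt, model call, storage."""
    stats = compute_statistics(items)
    size = calculate_sample_size(stats)
    sample = extract_sample(items, size)
    output = json.loads(model(generate_prompt(name, sample)))
    conn.execute("CREATE TABLE IF NOT EXISTS tagged (name TEXT, tag TEXT)")
    conn.executemany(
        "INSERT INTO tagged (name, tag) VALUES (?, ?)",
        [(output["name"], tag) for tag in output["tags"]],
    )
    return size


def query_by_tag(conn, tag):
    """Execute a query against the stored tags and return the matching names."""
    rows = conn.execute(
        "SELECT name FROM tagged WHERE tag = ? ORDER BY name", (tag,)
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
cities = [f"city_{i}" for i in range(100)]
print(tag_and_store(conn, "ship_to_city", cities))  # 10 of the 100 items are sampled
print(query_by_tag(conn, "city"))  # ['ship_to_city']
```

Because only the sample reaches the prompt, the prompt length stays bounded as the input grows, which is the practical effect of the claim's requirement that the sample size be less than the total number of data items.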