US 12,299,007 B2
Automatically drawing infographics for statistical data based on a data model
Ye Fan, Xi'an (CN); Qi Mao, Xi'an (CN); Juan Wu, Xi'an (CN); Jia Zhong Wu, Xi'an (CN); Long Fan, Xi'an (CN); Chong Liu, Xi'an (CN); Wen Pei Yu, Xi'an (CN); and Yang Yang, Xi'an (CN)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Jan. 21, 2022, as Appl. No. 17/581,646.
Prior Publication US 2023/0237076 A1, Jul. 27, 2023
Int. Cl. G06F 16/28 (2019.01); G06T 11/20 (2006.01)

CPC G06F 16/287 (2019.01) [G06T 11/206 (2013.01)]

20 Claims

1. A computer-implemented method for automatically drawing infographics, the method comprising:

populating a database with variables and associated data models based on previous assignments of levels of measurements and roles of variables;

receiving selected variables from a dataset to be utilized for generating infographics from a computing device by an infographics generator;

utilizing said data models stored in said database connected to said infographics generator for classifications of variables used in datasets, wherein said classifications of variables used in datasets comprise a type of variables, a role of variables, and a level of measurement of variables;

identifying an appropriate data model stored in said database based on matching a name of a variable selected by a user with a name of a variable listed in said data model using natural language processing;

parsing, by said infographics generator, said dataset to obtain metadata;

automatically implementing a procedure to draw infographics, by said infographics generator, for variables not assigned a role of a target using said metadata and said data model associated with each of said variables not assigned said role of said target in response to said variables not being assigned said role of said target;

automatically implementing a procedure to draw infographics, by said infographics generator, for variables assigned said role of said target using said metadata and said data model associated with each of said variables assigned said role of said target in response to said variables being assigned said role of said target;

determining if there is a value of a continuous variable that exceeds a first threshold value; and

performing a pair-group search strategy to reduce a complexity of correlation analysis and in selecting appropriate continuous variables to be drawn thereby improving clarity and understandability of infographics in response to said value of said continuous variable exceeding said first threshold value, wherein said pair-group search strategy comprises:

grouping continuous variables with a value that exceeds said first threshold value;

computing a correlation rate for each pair of continuous variables by computing a Euclidean distance or a cosine distance between said pair of continuous variables, wherein said correlation rate measures how strong a relationship is between two variables;

identifying a pair of continuous variables as belonging to a same cluster in response to said computed correlation rate exceeding said first threshold value; and

identifying a pair of continuous variables as not belonging to said same cluster in response to said computed correlation rate not exceeding said first threshold value.
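
For illustration only, the pair-group search steps recited above might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `cosine_similarity` and `pair_group_clusters` are hypothetical, the correlation rate is taken to be cosine similarity (one minus cosine distance) so that a larger value indicates a stronger relationship, and pairs that exceed the threshold are merged transitively with a union-find structure, which the claim does not specify.

```python
from itertools import combinations
from math import sqrt


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length numeric sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero column is treated as unrelated
    return dot / (norm_a * norm_b)


def pair_group_clusters(columns, threshold):
    """Cluster column names whose pairwise correlation rate exceeds `threshold`.

    `columns` maps a variable name to its list of values (hypothetical layout).
    """
    # Union-find parent table: every variable starts in its own cluster.
    parent = {name: name for name in columns}

    def find(name):
        # Follow parent links to the cluster root, halving the path as we go.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Compute a correlation rate for each pair of continuous variables.
    for left, right in combinations(columns, 2):
        rate = cosine_similarity(columns[left], columns[right])
        if rate > threshold:
            # Rate exceeds the threshold: the pair belongs to the same cluster.
            parent[find(left)] = find(right)
        # Otherwise the pair is left in separate clusters.

    clusters = {}
    for name in columns:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())
```

Under these assumptions, an infographics generator could draw one representative variable per cluster rather than every variable, which is one way the number of plotted continuous variables might be reduced.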