US 12,001,840 B1
Likelihood ratio test-based approach for detecting data entry errors
Arkadeep Banerjee, Bangalore (IN); and Vignesh T. Subrahmaniam, Bangalore (IN)
Assigned to Intuit, Inc., Mountain View, CA (US)
Filed by Intuit, Inc., Mountain View, CA (US)
Filed on Mar. 16, 2023, as Appl. No. 18/122,627.
Int. Cl. G06F 9/30 (2018.01); G06F 9/54 (2006.01)

CPC G06F 9/3001 (2013.01) [G06F 9/545 (2013.01)]

18 Claims

1. A computer-implemented method of detecting data errors, comprising:

receiving a new value as user input for a data field;

generating a first histogram-based approximation of a first kernel density estimate generated based on valid data associated with the data field and a second histogram-based approximation of a second kernel density estimate generated based on invalid data associated with the data field;

determining a first likelihood that the new value is valid, wherein the first likelihood is equal to a first probability density of a first bin of the first histogram-based approximation that includes a log ratio of the new value to a mean value associated with the data field;

determining a second likelihood that the new value is invalid, wherein the second likelihood is equal to a second probability density of a second bin of the second histogram-based approximation that includes the log ratio of the new value to the mean value associated with the data field;

computing a likelihood ratio test statistic based on a ratio of the first likelihood that the new value is valid to the second likelihood that the new value is invalid;

classifying the new value as valid or invalid based on comparing the likelihood ratio test statistic to a likelihood ratio test threshold; and

when the new value is classified as invalid, taking one or more actions to correct the new value.
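
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The Gaussian kernel, the bandwidth of 0.3, the bin range and count, the density floor, the threshold of 1.0, the sample data, and all function names are assumptions chosen for illustration. Values are assumed to be positive so that the log ratio is defined.

```python
import math
from bisect import bisect_right


def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

    return density


def histogram_approximation(density, lo, hi, n_bins):
    """Approximate a density on [lo, hi] by equal-width, piecewise-constant bins."""
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    # Each bin's height is the density evaluated at the bin midpoint.
    heights = [density(lo + (i + 0.5) * width) for i in range(n_bins)]
    return edges, heights


def bin_density(edges, heights, x, floor=1e-12):
    """Density of the bin containing x, floored to avoid division by zero."""
    if x < edges[0] or x >= edges[-1]:
        return floor
    i = bisect_right(edges, x) - 1
    return max(heights[i], floor)


def classify(new_value, mean_value, valid_hist, invalid_hist, threshold=1.0):
    """Classify new_value with a likelihood ratio test on log(new_value / mean_value)."""
    x = math.log(new_value / mean_value)
    likelihood_valid = bin_density(*valid_hist, x)
    likelihood_invalid = bin_density(*invalid_hist, x)
    statistic = likelihood_valid / likelihood_invalid
    label = "valid" if statistic >= threshold else "invalid"
    return label, statistic


# Hypothetical history for one data field whose mean value is taken to be 100.
mean_value = 100.0
valid_values = [90, 95, 100, 105, 110, 98, 102]
invalid_values = [1000, 10, 1050, 9.5]  # e.g. an extra or a missing digit

valid_logs = [math.log(v / mean_value) for v in valid_values]
invalid_logs = [math.log(v / mean_value) for v in invalid_values]

valid_hist = histogram_approximation(gaussian_kde(valid_logs, 0.3), -4.0, 4.0, 80)
invalid_hist = histogram_approximation(gaussian_kde(invalid_logs, 0.3), -4.0, 4.0, 80)

print(classify(100, mean_value, valid_hist, invalid_hist)[0])
print(classify(1000, mean_value, valid_hist, invalid_hist)[0])
```

In this sketch the histograms are built once, so classifying each new value is a single bin lookup per histogram rather than a sum over all stored samples. A value classified as invalid would then be passed to whatever corrective action the application provides, for example prompting the user to re-enter it.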