US 12,141,531 B2
Semantic classification of numerical data in natural language context based on machine learning
Bin Shen, Princeton, NJ (US)
Assigned to Siuvo Inc., Princeton, NJ (US)
Filed by Siuvo Inc., Princeton, NJ (US)
Filed on Oct. 3, 2022, as Appl. No. 17/937,705.
Application 17/937,705 is a continuation of application No. 16/633,863, granted, now Pat. No. 11,461,554, previously published as PCT/US2018/043804, filed on Jul. 26, 2018.
Claims priority of provisional application 62/537,369, filed on Jul. 26, 2017.
Prior Publication US 2023/0101445 A1, Mar. 30, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 40/30 (2020.01); G06F 40/289 (2020.01); G06N 3/045 (2023.01); G06N 3/048 (2023.01); G06N 3/08 (2023.01); G16H 10/60 (2018.01); G16H 20/00 (2018.01); G16H 50/20 (2018.01)
CPC G06F 40/30 (2020.01) [G06F 40/289 (2020.01); G06N 3/045 (2023.01); G06N 3/048 (2023.01); G06N 3/08 (2013.01); G16H 10/60 (2018.01); G16H 20/00 (2018.01); G16H 50/20 (2018.01)]
15 Claims
 
1. A method for processing numerical data within a natural language context, the method comprising:
detecting in a natural language text segment the presence of numerical data comprising one or more numbers;
extracting the numbers detected and words surrounding the numbers, the words being within a window of a predetermined length;
creating a word vector for each of the extracted words;
determining the most correlated feature of the extracted words by inputting the word vector for each of the extracted words into a first machine learning module, wherein the first machine learning module comprises a convolutional neural network;
associating the most correlated feature of the extracted words with the numbers; and
classifying the natural language text segment by inputting the numbers and the associated most correlated feature into a second machine learning module, wherein the second machine learning module comprises a feedforward neural network, and classifying the natural language text segment comprises creating a feature vector for the most correlated feature of the extracted words and inputting the feature vector into the second machine learning module.
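
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything below is an assumption made for illustration: the function names, the regular expression used to detect numbers, the toy word vectors (seeded random values standing in for trained embeddings), the reading of "most correlated feature" as max-over-time pooling of convolution outputs, and the untrained random weights. A real system would learn the embeddings and both networks from labeled text and would normally scale the numbers before feeding them to the classifier.

```python
import math
import random
import re

NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")  # assumed pattern for "numerical data"

def extract_numbers_with_context(text, window=2):
    """Detect numbers and return (number, surrounding words) pairs.
    `window` is the number of tokens taken on each side of the number."""
    tokens = text.split()
    pairs = []
    for i, tok in enumerate(tokens):
        match = NUMBER_RE.search(tok)
        if not match:
            continue
        nearby = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        words = [w.strip(".,;:()").lower() for w in nearby]
        # Keep only non-empty, non-numeric words as context.
        words = [w for w in words if w and not NUMBER_RE.fullmatch(w)]
        pairs.append((float(match.group()), words))
    return pairs

def word_vector(word, dim=4):
    """Hypothetical stand-in for a trained embedding: a fixed random vector per word."""
    rng = random.Random(word)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

def conv_max_pool(vectors, filters, dim=4, width=2):
    """First module: 1-D convolution over the word sequence, ReLU,
    then max-over-time pooling to keep the strongest response per filter."""
    pad = [[0.0] * dim] * max(0, width - len(vectors))
    padded = vectors + pad
    features = []
    for weights in filters:  # each filter is a flat list of width * dim weights
        activations = []
        for start in range(len(padded) - width + 1):
            patch = [x for vec in padded[start:start + width] for x in vec]
            activations.append(max(0.0, sum(w * x for w, x in zip(weights, patch))))
        features.append(max(activations))
    return features

def feedforward(x, w_hidden, w_out):
    """Second module: one ReLU hidden layer followed by a softmax output layer."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(text, labels, seed=0, dim=4, n_filters=3, width=2, n_hidden=5, window=2):
    """Run the whole pipeline with untrained (random) weights, for illustration only."""
    rng = random.Random(seed)
    rand = lambda n: [rng.uniform(-1.0, 1.0) for _ in range(n)]
    filters = [rand(width * dim) for _ in range(n_filters)]
    w_hidden = [rand(n_filters + 1) for _ in range(n_hidden)]
    w_out = [rand(n_hidden) for _ in labels]
    results = []
    for number, words in extract_numbers_with_context(text, window):
        vectors = [word_vector(w, dim) for w in words]
        feature_vector = conv_max_pool(vectors, filters, dim, width)
        # Associate the number with its context feature vector and classify.
        probs = feedforward([number] + feature_vector, w_hidden, w_out)
        results.append((number, labels[probs.index(max(probs))], probs))
    return results

if __name__ == "__main__":
    sentence = "Blood pressure was 120 over 80 today."
    for number, label, probs in classify(sentence, ["blood_pressure", "heart_rate"]):
        print(number, label, [round(p, 3) for p in probs])
```

Because the weights are random, the printed labels carry no meaning; the sketch only shows how the number and the pooled context features flow from detection through the convolutional module into the feedforward classifier.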