US 11,907,678 B2
Context-aware machine language identification
Fan Wang, Suzhou (CN); Li Cao, Beijing (CN); Rui Wang, Xian (CN); and Lei Gao, Xian (CN)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Nov. 10, 2020, as Appl. No. 17/093,879.
Prior Publication US 2022/0147720 A1, May 12, 2022
Int. Cl. G06F 40/45 (2020.01); G06F 40/51 (2020.01); G10L 15/26 (2006.01); G06F 40/30 (2020.01); G06F 40/284 (2020.01); G06F 40/58 (2020.01)
CPC G06F 40/58 (2020.01) [G06F 40/284 (2020.01); G06F 40/30 (2020.01); G06F 40/45 (2020.01); G06F 40/51 (2020.01); G10L 15/26 (2013.01)]
10 Claims
 
1. A machine translation system, comprising:
a ChatOps system, the ChatOps system adapted to receive a natural language input text from an end user and to generate one or more executable commands from the natural language input text, the ChatOps system comprising:
a density calculator, the density calculator adapted to:
tokenize the natural language input text into a plurality of word tokens;
calculate a part of speech (POS) density for the plurality of word tokens in the natural language input text;
calculate a knowledge density for the plurality of word tokens in the natural language input text;
level the calculated knowledge densities by POS; and
calculate an information density for the plurality of word tokens in the natural language input text using the POS density and the leveled knowledge density;
a sememe attacher adapted to generate one or more corresponding sememes for one or more of the plurality of word tokens using their respective information densities; and
a context translator adapted to translate the natural language input text into the one or more executable commands using the corresponding sememes, wherein the context translator is adapted to:
divide the natural language input text into a plurality of smaller chunks by stop words, the plurality of smaller chunks including one or more word tokens having one or more corresponding sememes attached thereto;
generate a semantic context for one or more of the smaller chunks using the one or more corresponding sememes, including:
cluster the word tokens using the corresponding sememes;
determine a sense for the clustered word tokens;
merge the sense with the corresponding sememes to generate a semantic context for one of the chunks; and
using the semantic context to translate the chunk; and
translate the natural language input text into the one or more executable commands using the semantic context.
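
The density calculator, sememe attacher, and stop-word chunking steps recited in claim 1 can be illustrated with a minimal sketch. The claim does not give formulas, so every definition below is an assumption made for illustration only: POS density is taken as the share of tagged tokens with the same POS tag, leveling divides each knowledge score by the highest score within its POS group, and information density is the product of the two. The lexicons (POS_TAGS, KNOWLEDGE, SEMEMES, STOP_WORDS), the threshold, and all function names are hypothetical and do not come from the patent.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy lexicons; none of these values come from the patent.
POS_TAGS = {"restart": "VERB", "check": "VERB", "server": "NOUN",
            "database": "NOUN", "status": "NOUN", "production": "ADJ"}
KNOWLEDGE = {"restart": 4.0, "check": 2.0, "server": 6.0,
             "database": 3.0, "status": 1.5, "production": 1.0}
SEMEMES = {"server": ["machine", "computer"], "restart": ["start", "again"]}
STOP_WORDS = {"the", "and", "then", "of", "on", "please"}


def tokenize(text):
    """Split the input text into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def pos_density(tokens):
    """Assumed definition: share of tagged tokens that carry the same POS tag."""
    tags = [POS_TAGS.get(t) for t in tokens]
    counts = Counter(tag for tag in tags if tag)
    total = sum(counts.values())
    return {t: counts[tag] / total for t, tag in zip(tokens, tags) if tag}


def leveled_knowledge(tokens):
    """Assumed leveling: divide each knowledge score by the peak in its POS group."""
    peak = defaultdict(float)
    for t in tokens:
        if t in POS_TAGS:
            tag = POS_TAGS[t]
            peak[tag] = max(peak[tag], KNOWLEDGE.get(t, 0.0))
    return {t: KNOWLEDGE.get(t, 0.0) / peak[POS_TAGS[t]]
            for t in tokens if t in POS_TAGS and peak[POS_TAGS[t]] > 0}


def information_density(tokens):
    """Assumed combination: POS density times leveled knowledge density."""
    pos = pos_density(tokens)
    know = leveled_knowledge(tokens)
    return {t: pos[t] * know[t] for t in pos if t in know}


def attach_sememes(info, threshold=0.3):
    """Attach sememes only to tokens whose information density meets the threshold."""
    return {t: SEMEMES[t] for t, d in info.items() if d >= threshold and t in SEMEMES}


def chunk_by_stop_words(tokens):
    """Divide the token sequence into smaller chunks at each stop word."""
    chunks, current = [], []
    for t in tokens:
        if t in STOP_WORDS:
            if current:
                chunks.append(current)
            current = []
        else:
            current.append(t)
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    tokens = tokenize("Please restart the production server and then check the database status")
    info = information_density(tokens)
    print(attach_sememes(info))
    print(chunk_by_stop_words(tokens))
```

The sketch stops before the clustering, sense determination, and command generation steps, since those depend on sememe resources and a command vocabulary that the claim does not specify.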