US 12,008,026 B1
Determining repair instructions in response to natural language queries
Julio Bonis Sanz, Algete (ES); David Talby, Mercer Island, WA (US); and Veysel Kocaman, Echt (NL)
Assigned to John Snow Labs, Inc., Lewes, DE (US)
Filed by John Snow Labs, Inc., Lewes, DE (US)
Filed on Jan. 24, 2022, as Appl. No. 17/583,048.
Int. Cl. G06F 16/33 (2019.01); G06F 16/35 (2019.01); G06N 3/045 (2023.01); G06N 3/08 (2023.01)
CPC G06F 16/3344 (2019.01) [G06F 16/3347 (2019.01); G06F 16/355 (2019.01); G06N 3/045 (2023.01); G06N 3/08 (2013.01)]
25 Claims
 
1. A computer-implemented method comprising:
encoding, by one or more computing systems, information from multiple documents about how to perform a plurality of types of repairs for one or more indicated device types, wherein the multiple documents include repair manuals for the one or more indicated device types, and wherein the encoding includes:
separating, by the one or more computing systems and for each of the multiple documents, content from that document into multiple content groups that are each a subset of that content;
generating, by the one or more computing systems, a plurality of content embedding vectors to represent the information from the multiple documents, including generating, for each of the multiple content groups of each of the multiple documents, one of the plurality of content embedding vectors to represent semantic information of that content group;
generating, by the one or more computing systems, hashing information for the plurality of content embedding vectors, including generating a separate hash number for each of the plurality of content embedding vectors, and grouping the plurality of content embedding vectors into multiple buckets, wherein each bucket is associated with a group of multiple hash numbers and has a subset of multiple of the plurality of content embedding vectors whose hash numbers are in the associated group for the bucket; and
generating, by the one or more computing systems and for each of the multiple content groups, expanded content for that content group that includes that content group and additional information from the content of the document from which that content group was separated;
encoding, by the one or more computing systems, a received query about performing an indicated type of repair for an indicated device of one of the one or more indicated device types, wherein the received query is provided in a natural language format, and wherein the encoding of the received query includes generating a query embedding vector that represents semantic information of the received query;
determining, by the one or more computing systems, a response to the received query that provides instructions for performing the indicated type of repair for the indicated device, including:
matching, by the one or more computing systems, the query embedding vector to a group of multiple candidate content embedding vectors, including identifying one of the multiple buckets whose associated group of multiple hash numbers includes an additional hash number generated for the query embedding vector, and wherein the multiple candidate content embedding vectors are a subset of the plurality of content embedding vectors and include at least some of the multiple content embedding vectors of the identified one bucket that each has a matching distance to the query embedding vector below a defined threshold;
validating, by the one or more computing systems, and for each of one or more identified candidate content embedding vectors from the group of multiple candidate content embedding vectors, that the expanded content for the content group represented by that identified candidate content embedding vector provides the instructions for performing the indicated type of repair for the indicated device; and
generating, by the one or more computing systems, the instructions for performing the indicated type of repair for the indicated device from the expanded content for the content group represented by a selected one of the one or more identified candidate content embedding vectors; and
providing, by the one or more computing systems, the determined response to the received query, to initiate performing the indicated type of repair for the indicated device.
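For illustration only, the encoding and matching steps recited in claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the hashed bag-of-words `embed` function stands in for whatever semantic embedding model is actually used, random-hyperplane hashing is one possible way to produce a hash number per vector, and every name and constant (`DIM`, `N_PLANES`, `BUCKET_SPAN`, `build_index`, `candidates`, the 0.8 threshold) is hypothetical. The validating step and the final generation of instructions are omitted.

```python
import hashlib
import math
import random
import re
from collections import defaultdict

DIM = 64         # embedding size (assumption)
N_PLANES = 4     # random hyperplanes, giving 2**4 possible hash numbers
BUCKET_SPAN = 4  # each bucket covers a group of 4 consecutive hash numbers

def embed(text):
    """Toy stand-in for a semantic embedding: hashed bag of words, L2-normalised."""
    vec = [0.0] * DIM
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

_rng = random.Random(0)  # fixed seed so hash numbers are reproducible
PLANES = [[_rng.gauss(0, 1) for _ in range(DIM)] for _ in range(N_PLANES)]

def hash_number(vec):
    """Random-hyperplane hash: one sign bit per plane, packed into an integer."""
    bits = 0
    for plane in PLANES:
        dot = sum(p * v for p, v in zip(plane, vec))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits

def bucket_id(h):
    """Map a hash number to the bucket whose group of hash numbers contains it."""
    return h // BUCKET_SPAN

def split_groups(doc):
    """Separate a document into content groups (here, naively, sentences)."""
    return [s.strip() for s in doc.split(".") if s.strip()]

def build_index(docs):
    """Embed each content group, hash it, and store it with its expanded content."""
    buckets = defaultdict(list)
    for doc in docs:
        groups = split_groups(doc)
        for i, group in enumerate(groups):
            # Expanded content: the group plus its neighbouring sentences.
            expanded = " ".join(groups[max(0, i - 1): i + 2])
            vec = embed(group)
            buckets[bucket_id(hash_number(vec))].append((vec, expanded))
    return buckets

def cosine_distance(a, b):
    """Cosine distance between two unit-length vectors."""
    return 1.0 - sum(x * y for x, y in zip(a, b))

def candidates(query, buckets, threshold=0.8):
    """Return (distance, expanded content) pairs from the query's bucket
    whose distance to the query embedding is below the threshold."""
    qvec = embed(query)
    bucket = buckets.get(bucket_id(hash_number(qvec)), [])
    hits = [(cosine_distance(qvec, vec), expanded) for vec, expanded in bucket]
    return sorted(h for h in hits if h[0] < threshold)

if __name__ == "__main__":
    manuals = ["Power off the phone. Replace the cracked screen. Reattach the back cover."]
    index = build_index(manuals)
    for distance, expanded in candidates("Replace the cracked screen", index):
        print(round(distance, 3), expanded)
```

Only the vectors in the query's bucket are compared against the query, which is what lets the hashing step avoid a scan over all content embedding vectors. The trade-off in this sketch is that a relevant content group whose hash number falls into a different bucket is never considered.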