US 11,960,515 B1
Edge computing units for operating conversational tools at local sites
Venkata Bhanu Teja Pallakonda, Bryan, TX (US); and Pragyana K. Mishra, Seattle, WA (US)
Assigned to Armada Systems, Inc., San Francisco, CA (US)
Filed by Armada Systems, Inc., San Francisco, CA (US)
Filed on Oct. 6, 2023, as Appl. No. 18/482,839.
Int. Cl. G06F 16/332 (2019.01); G06F 16/33 (2019.01); G06F 40/30 (2020.01)

CPC G06F 16/3329 (2019.01) [G06F 16/3344 (2019.01); G06F 40/30 (2020.01)]

20 Claims

1. An edge computing unit comprising a containerized system having:

at least one processor unit;

at least one server rack;

at least one power unit;

at least one environmental control system; and

at least one isolation system,

wherein the at least one server rack comprises at least one data store programmed with:

one or more sets of instructions;

code for executing a first model, wherein the first model is a model configured to generate an embedding representing a semantic descriptor of a document in a vector space having a predetermined number of dimensions;

code for executing a second model, wherein the second model is a conversational model configured to generate a response to a query in reply to a prompt comprising information regarding the query; and

a knowledge base comprising a first plurality of knowledge documents and a first plurality of knowledge embeddings, wherein each of the first plurality of knowledge documents comprises a question pertaining to a domain and an answer to the question, and wherein each of the first plurality of knowledge embeddings was generated by the first model based at least in part on one of the first plurality of knowledge documents,

wherein the one or more sets of instructions, when executed by the at least one processor unit, cause the edge computing unit to at least:

receive a first set of text representing a first query from a computer system in communication with the edge computing unit;

provide at least a portion of the first set of text as a first input to the first model;

generate a first embedding based at least in part on a first output received in response to the first input;

identify a second plurality of embeddings, wherein each of the second plurality of embeddings is one of a predetermined number of the first plurality of embeddings most similar to the first embedding;

identify a second plurality of documents, wherein each of the second plurality of documents is one of the first plurality of documents from which one of the second plurality of embeddings was generated;

generate a first prompt based at least in part on the portion of the first set of text and the second plurality of documents;

provide the first prompt as a second input to the second model;

identify a first response to the first query based at least in part on a second output received in response to the second input; and

transmit a second set of text representing the first response to the computer system.
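
The query-handling steps recited in claim 1 (embed the query, select a predetermined number of the most similar knowledge embeddings, recover their source documents, build a prompt, and pass it to the conversational model) can be illustrated with a minimal sketch. It is not the patented implementation. The claim does not name a similarity measure, so cosine similarity is assumed here. The names `KnowledgeDocument`, `toy_embed`, `toy_converse`, `top_k`, `build_prompt`, and `answer_query` are hypothetical, as are the vocabulary and sample documents. The two toy functions are stand-ins for the first and second models.

```python
import math
import re
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical fixed vocabulary: gives the vector space a predetermined size.
VOCAB = ["pump", "pressure", "valve", "generator", "fuel"]

@dataclass
class KnowledgeDocument:
    question: str
    answer: str

def toy_embed(text: str) -> List[float]:
    """Stand-in for the first model: word counts over VOCAB."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [float(tokens.count(word)) for word in VOCAB]

def toy_converse(prompt: str) -> str:
    """Stand-in for the second model: echoes the first answer in the prompt."""
    for line in prompt.splitlines():
        if line.startswith("A: "):
            return line[3:]
    return ""

def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Assumed similarity measure; the claim only says "most similar".
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query: Sequence[float], embeddings: List[List[float]], k: int) -> List[int]:
    """Indices of the k knowledge embeddings most similar to the query."""
    order = sorted(
        range(len(embeddings)),
        key=lambda i: cosine_similarity(query, embeddings[i]),
        reverse=True,
    )
    return order[:k]

def build_prompt(query_text: str, documents: List[KnowledgeDocument]) -> str:
    """Combine the query text with the retrieved question/answer pairs."""
    context = "\n".join(f"Q: {d.question}\nA: {d.answer}" for d in documents)
    return f"Context:\n{context}\n\nQuery: {query_text}\nResponse:"

def answer_query(
    query_text: str,
    documents: List[KnowledgeDocument],
    embeddings: List[List[float]],
    embed: Callable[[str], List[float]] = toy_embed,
    converse: Callable[[str], str] = toy_converse,
    k: int = 2,
) -> str:
    query_embedding = embed(query_text)               # first model
    indices = top_k(query_embedding, embeddings, k)   # nearest embeddings
    selected = [documents[i] for i in indices]        # their source documents
    prompt = build_prompt(query_text, selected)       # prompt generation
    return converse(prompt)                           # second model

def demo() -> str:
    # Illustrative knowledge base; the contents are invented for this sketch.
    docs = [
        KnowledgeDocument("What is the normal pump pressure?", "Between 40 and 60 psi."),
        KnowledgeDocument("How do I reset the valve?", "Hold the reset button for five seconds."),
        KnowledgeDocument("What fuel does the generator use?", "Diesel."),
    ]
    # Each knowledge embedding is generated from one knowledge document.
    embs = [toy_embed(d.question + " " + d.answer) for d in docs]
    return answer_query("pump pressure too low", docs, embs)

if __name__ == "__main__":
    print(demo())
```

In this sketch the knowledge embeddings are computed once, ahead of any query, which mirrors the claim's data store holding both the documents and their embeddings. A deployed unit would presumably replace the linear scan in `top_k` with an indexed vector search, though the claim does not require one.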