CPC G06F 16/3331 (2019.01) [G06F 16/353 (2019.01); G16B 45/00 (2019.02)] | 16 Claims |
1. A method of generating a structured format document comprising the steps of:
generating at least one new coded form of a chemical or biological entity based on combinations of the identified, common and non-common features of one or more biological or chemical identifiers mapped to a virtual n-dimensional array;
wherein generating the at least one new coded form of a chemical or biological entity, further comprises, using at least one processor,
submitting, in electronic form, a search to at least one document database for documents describing the subject matter;
extrapolating, to a first array within the memory of the computer, at least one biologic or chemical identifier described in at least one document returned from the search;
transforming each biologic identifier in the first array into a respective coded form having a range of values;
populating the respective coded forms into a second array within a memory accessible to the at least one processor;
generating, using the at least one processor, a virtual n-dimensional array of nodes configured to encompass the range of values in the second array, each node of the virtual n-dimensional array having an associated weight vector value based on the range of values in the second array;
placing each coded form in the second array into a node of the virtual n-dimensional array according to an unsupervised learning algorithm to effect a placement;
outputting at least one chemical or biological identifier corresponding to the new coded form; and
generating a structured text document that includes the chemical or biological identifier.
|
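The coding, grid-generation, and placement steps recited in claim 1 can be illustrated with a minimal sketch. The claim does not name a particular unsupervised learning algorithm; a small self-organizing map is assumed here because it matches the recited grid of nodes with weight vectors. The `encode` function (letter-frequency coding), the toy identifiers, the 3x3 grid size, and the JSON output layout are all hypothetical choices made for illustration and are not taken from the patent. Decoding an unoccupied node's weight vector back into a chemical or biological identifier is outside the scope of this sketch.

```python
import json
import math
import random

def encode(identifier, alphabet="ACGT"):
    """Hypothetical coding: fraction of each alphabet letter in the identifier."""
    n = max(len(identifier), 1)
    return [identifier.count(ch) / n for ch in alphabet]

def best_node(weights, vec):
    """Return the grid coordinate whose weight vector is closest to vec."""
    return min(weights, key=lambda node: math.dist(weights[node], vec))

def train_som(coded, rows=3, cols=3, epochs=50, seed=0):
    """Fit a small 2-D self-organizing map; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(coded[0])
    lo = [min(v[i] for v in coded) for i in range(dim)]
    hi = [max(v[i] for v in coded) for i in range(dim)]
    # Initial weights are drawn from the observed range of coded values.
    weights = {(r, c): [rng.uniform(lo[i], hi[i]) for i in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # decays from 1 toward 0
        rate = 0.5 * frac                    # learning rate
        radius = 1.0 + max(rows, cols) / 2 * frac
        for vec in coded:
            br, bc = best_node(weights, vec)
            for (r, c), w in weights.items():
                grid_dist = math.hypot(r - br, c - bc)
                if grid_dist <= radius:
                    # Nodes near the best-matching node move toward the input.
                    influence = math.exp(-grid_dist ** 2 / (2 * radius ** 2))
                    for i in range(dim):
                        w[i] += rate * influence * (vec[i] - w[i])
    return weights

# First array: identifiers taken from search results (toy data, assumed).
identifiers = ["ACGTAC", "AACCAA", "GGTTGG", "TTTTGA"]
# Second array: the respective coded forms.
coded = [encode(s) for s in identifiers]

weights = train_som(coded)
placement = {s: best_node(weights, v) for s, v in zip(identifiers, coded)}

# Unoccupied nodes carry weight vectors that combine features of their
# neighbours; they are treated here as candidate new coded forms.
empty = sorted(set(weights) - set(placement.values()))

document = json.dumps({
    "placement": {s: list(node) for s, node in placement.items()},
    "candidate_coded_forms": [
        {"node": list(node), "weights": [round(x, 3) for x in weights[node]]}
        for node in empty
    ],
}, indent=2)
print(document)
```

In this sketch the structured text document is a JSON string. Any other structured format could be substituted, since the claim only requires that the output document include the resulting identifier.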