US 11,990,207 B2
Method of identification, non-transitory computer readable recording medium, and identification apparatus
Masahiro Kataoka, Kamakura (JP); Kota Natsume, Ota (JP); and Satoshi Kitadate, Chiba (JP)
Assigned to FUJITSU LIMITED, Kawasaki (JP)
Filed by FUJITSU LIMITED, Kawasaki (JP)
Filed on Jan. 28, 2020, as Appl. No. 16/774,071.
Claims priority of application No. 2019-036298 (JP), filed on Feb. 28, 2019.
Prior Publication US 2020/0279615 A1, Sep. 3, 2020
Int. Cl. G16B 30/00 (2019.01); G06F 16/13 (2019.01); G06F 16/174 (2019.01); G16B 15/00 (2019.01); G16B 50/50 (2019.01)
CPC G16B 30/00 (2019.02) [G06F 16/13 (2019.01); G06F 16/1744 (2019.01); G16B 15/00 (2019.02); G16B 50/50 (2019.02)]
8 Claims
1. A method of identification comprising:
acquiring a protein file in which a plurality of proteins including a plurality of amino acids are arranged, using a processor;
first identifying a plurality of primary structure candidates with any position included in the protein file as a starting position, and identifying an end of each of the primary structure candidates based on a primary structure dictionary index that indicates a position of a primary structure included in the protein file, using the processor;
second identifying one primary structure among the primary structure candidates based on a combination of a primary structure and each amino acid and a primary structure table, where each amino acid is positioned at the identified end of each of the primary structure and the primary structure table associates a primary structure and a co-occurrence rate of a certain amino acid combination positioned at an end of the primary structure, wherein the second identified primary structure has the highest co-occurrence rate among the primary structure candidates, using the processor;
generating a primary structure compression file by compressing the protein file in units of primary structures based on the primary structure identified by the second identifying that is repeatedly performed and a primary structure dictionary associating a primary structure and a code with each other, the generated primary structure compression file including information in which a plurality of primary structure codes are arranged, using the processor; and
generating a primary structure transposition index associating a primary structure type and a corresponding offset position in a sequence in the primary structure compression file with each other, using the processor.
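
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, only one possible reading under stated assumptions: the protein file is a plain string of one-letter amino acid codes; the primary structure dictionary index is modeled as a map from each start offset to the end offsets of dictionary entries found there; the "amino acid positioned at the identified end" is taken to be the residue immediately following a candidate; and the transposition index is a map from each primary structure to the offsets of its codes in the compressed sequence. All names (DICTIONARY, TABLE, build_dictionary_index, pick_structure, compress) and all numeric values are hypothetical.

```python
from collections import defaultdict

# Toy inputs; every name and value here is an illustrative assumption.
DICTIONARY = {"MKV": 1, "MKVL": 2, "LAG": 3, "AGT": 4}  # primary structure -> code
# (primary structure, residue right after its end) -> co-occurrence rate
TABLE = {
    ("MKV", "L"): 0.2,
    ("MKVL", "A"): 0.7,
    ("LAG", "T"): 0.4,
    ("AGT", "M"): 0.8,
    ("AGT", ""): 0.9,  # "" stands for end of file
}


def build_dictionary_index(protein_file, dictionary):
    """Map each start offset to the end offsets of dictionary entries found there."""
    index = defaultdict(list)
    for structure in dictionary:
        start = protein_file.find(structure)
        while start != -1:
            index[start].append(start + len(structure))
            start = protein_file.find(structure, start + 1)
    return index


def pick_structure(protein_file, start, index, table):
    """Among candidates beginning at `start`, return the one with the highest rate."""
    best, best_rate = None, -1.0
    for end in index.get(start, []):
        candidate = protein_file[start:end]
        following = protein_file[end:end + 1]  # "" when the candidate ends the file
        rate = table.get((candidate, following), 0.0)
        if rate > best_rate:
            best, best_rate = candidate, rate
    return best


def compress(protein_file, dictionary, table):
    """Return (code sequence, transposition index) for the protein file."""
    index = build_dictionary_index(protein_file, dictionary)
    codes = []
    transposition = defaultdict(list)  # primary structure -> offsets in `codes`
    pos = 0
    while pos < len(protein_file):
        structure = pick_structure(protein_file, pos, index, table)
        if structure is None:
            raise ValueError(f"no dictionary entry starts at offset {pos}")
        transposition[structure].append(len(codes))
        codes.append(dictionary[structure])
        pos += len(structure)
    return codes, dict(transposition)


codes, transposition_index = compress("MKVLAGTMKVLAGT", DICTIONARY, TABLE)
print(codes)                # [2, 4, 2, 4]
print(transposition_index)  # {'MKVL': [0, 2], 'AGT': [1, 3]}
```

In this toy run, both "MKV" and "MKVL" are candidates at offset 0; "MKVL" is selected because its assumed co-occurrence rate (0.7) is higher than that of "MKV" (0.2). The offset lists in the transposition index then allow a lookup of where a given primary structure occurs without decompressing the code sequence.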