US 12,333,441 B2
Reinforcement learning method and apparatus using task decomposition
Min Jong Yoo, Suwon-si (KR); Gwang Pyo Yoo, Suwon-si (KR); and Hong Uk Woo, Suwon-si (KR)
Assigned to RESEARCH & BUSINESS FOUNDATION SUNGKYUNKWAN UNIVERSITY, Suwon-si (KR)
Filed by RESEARCH & BUSINESS FOUNDATION SUNGKYUNKWAN UNIVERSITY, Suwon-si (KR)
Filed on Dec. 6, 2022, as Appl. No. 18/075,669.
Claims priority of application No. 10-2021-0173678 (KR), filed on Dec. 7, 2021.
Prior Publication US 2023/0177348 A1, Jun. 8, 2023
Int. Cl. G06K 9/62 (2022.01); G06N 3/092 (2023.01); G06V 10/774 (2022.01)
CPC G06N 3/092 (2023.01) [G06V 10/774 (2022.01)]
12 Claims
 
7. A reinforcement learning apparatus using a task decomposition inference model in a time-variant environment, comprising:
a transition model unit which selects a plurality of paired transitions having a time-invariant common characteristic and a time-variant different environmental characteristic from the dataset including a plurality of transition data based on the cycle GAN;
an embedding unit which trains an auto encoder to embed each of the time-variant part and the time-invariant part with respect to the plurality of paired transitions into a latent space; and
a reinforcement learning unit which performs reinforcement learning on a transition corresponding to data collected in the time-variant environment, using the trained auto encoder.
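
The three units recited in claim 7 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the names `make_paired_transitions`, `SplitAutoencoder`, `pair_loss`, and `policy_input` are hypothetical, the cycle GAN pairing step is replaced by a synthetic stand-in, the autoencoder is linear, and the loss (reconstruction error plus a term pulling the time-invariant codes of a pair together) is only one plausible objective. The training loop and the reinforcement learning algorithm itself are omitted.

```python
import numpy as np


def make_paired_transitions(n, dim, seed=0):
    """Synthetic stand-in for the pairing step (NOT a cycle GAN).

    Each pair shares a common part and differs only by an
    environment-specific offset, mimicking a time-invariant common
    characteristic and a time-variant environmental characteristic.
    """
    rng = np.random.default_rng(seed)
    common = rng.normal(size=(n, dim))
    return common + 0.5, common - 0.5


class SplitAutoencoder:
    """Linear autoencoder whose latent space is split into two parts (assumed design)."""

    def __init__(self, in_dim, inv_dim, var_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W_inv = rng.normal(scale=0.1, size=(in_dim, inv_dim))
        self.W_var = rng.normal(scale=0.1, size=(in_dim, var_dim))
        self.W_dec = rng.normal(scale=0.1, size=(inv_dim + var_dim, in_dim))

    def encode(self, x):
        # Returns (time-invariant code, time-variant code).
        return x @ self.W_inv, x @ self.W_var

    def decode(self, z_inv, z_var):
        # Reconstructs the transition from both latent parts.
        return np.concatenate([z_inv, z_var], axis=-1) @ self.W_dec


def pair_loss(model, xa, xb):
    """Assumed objective: reconstruct both members of each pair and
    keep their time-invariant codes close to each other."""
    za_inv, za_var = model.encode(xa)
    zb_inv, zb_var = model.encode(xb)
    recon = np.mean((model.decode(za_inv, za_var) - xa) ** 2)
    recon += np.mean((model.decode(zb_inv, zb_var) - xb) ** 2)
    invariance = np.mean((za_inv - zb_inv) ** 2)
    return float(recon + invariance)


def policy_input(model, x):
    """Embeds transitions collected in the time-variant environment;
    a reinforcement learning agent would consume this embedding."""
    z_inv, z_var = model.encode(x)
    return np.concatenate([z_inv, z_var], axis=-1)


xa, xb = make_paired_transitions(n=8, dim=6)
model = SplitAutoencoder(in_dim=6, inv_dim=3, var_dim=2)
loss = pair_loss(model, xa, xb)
features = policy_input(model, xa)
```

In this sketch the loss would be minimized over the three weight matrices before the encoder is used by the agent; splitting the latent vector lets the agent treat the environment-dependent part separately from the part shared across environments.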