US 12,437,185 B2
Apparatus and method for artificial intelligence neural network based on co-evolving neural ordinary differential equations
No Seong Park, Seoul (KR); Sheo Yon Jhin, Goyang-si (KR); Min Ju Jo, Seoul (KR); Tae Yong Kong, Seoul (KR); and Jin Sung Jeon, Seoul (KR)
Assigned to UIF (UNIVERSITY INDUSTRY FOUNDATION), YONSEI UNIVERSITY, Seoul (KR)
Filed by UIF (University Industry Foundation), Yonsei University, Seoul (KR)
Filed on Dec. 29, 2021, as Appl. No. 17/564,912.
Claims priority of application No. 10-2021-0181699 (KR), filed on Dec. 17, 2021.
Prior Publication US 2023/0196071 A1, Jun. 22, 2023
Int. Cl. G06N 3/045 (2023.01); G06F 17/13 (2006.01); G06N 3/048 (2023.01)

CPC G06N 3/045 (2023.01) [G06F 17/13 (2013.01); G06N 3/048 (2023.01)]

12 Claims

1. An apparatus for an artificial intelligence neural network based on co-evolving neural ordinary differential equations (NODEs), the apparatus comprising:

at least one processor and memory storing instructions performed by the at least one processor;

a main NODE module configured to provide a downstream machine learning task at an initial time; and

an attention NODE module configured to receive the downstream machine learning task from the main NODE module and provide attention to the main NODE module based on the downstream machine learning task,

wherein the main NODE module receives the attention provided from the attention NODE module and provides the downstream machine learning task at a next time after the initial time such that the main NODE module and the attention NODE module influence each other over time so that the main NODE module outputs a multivariate time-series value at a given time for an input sample x,

wherein the attention is used in a feature extraction layer before a NODE layer and does not introduce a new NODE model that is internally combined with the attention,

wherein Explainable Tensorized Neural (ETN)-ODE uses the attention to derive a correlation matrix in the feature extraction layer and then evolve the correlation matrix using the NODE layer, and

wherein the main NODE module and the attention NODE module are each implemented via the at least one processor.
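
The mutual influence recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that what the main NODE module passes to the attention NODE module is a hidden state h(t), that the attention is a vector a(t) applied to h(t) through a softmax gate, and that both states are advanced together with a fixed-step explicit Euler solver. The function names (co_evolve, main_dynamics, attention_dynamics) and the weight matrices W_f and W_g are hypothetical and do not appear in the patent.

```python
import math


def softmax(v):
    # Numerically stable softmax over a list of floats.
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, v):
    # Plain matrix-vector product for a list-of-rows matrix.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def main_dynamics(h, a, W_f):
    # Assumed form of dh/dt: the main state is gated elementwise by
    # softmax(a) before a linear map and tanh.
    weights = softmax(a)
    gated = [hi * wi for hi, wi in zip(h, weights)]
    return [math.tanh(z) for z in matvec(W_f, gated)]


def attention_dynamics(h, a, W_g):
    # Assumed form of da/dt: the attention state is driven by the
    # concatenation of the main state and the current attention.
    joint = list(h) + list(a)
    return [math.tanh(z) for z in matvec(W_g, joint)]


def co_evolve(h0, a0, W_f, W_g, t0=0.0, t1=1.0, steps=100):
    # Advance both states together with explicit Euler steps, so each
    # module sees the other's state from the same time point.
    h, a = list(h0), list(a0)
    dt = (t1 - t0) / steps
    for _ in range(steps):
        dh = main_dynamics(h, a, W_f)
        da = attention_dynamics(h, a, W_g)
        h = [hi + dt * d for hi, d in zip(h, dh)]
        a = [ai + dt * d for ai, d in zip(a, da)]
    return h, a


if __name__ == "__main__":
    W_f = [[0.5, -0.2], [0.1, 0.3]]                      # hypothetical weights
    W_g = [[0.1, 0.0, -0.1, 0.2], [0.0, 0.2, 0.1, -0.1]]  # hypothetical weights
    h1, a1 = co_evolve([1.0, 2.0], [0.0, 0.0], W_f, W_g)
    print("h(t1) =", h1)
    print("a(t1) =", a1)
```

Both derivatives are evaluated from the same (h, a) pair before either state is updated, which is one simple way to model the two modules influencing each other over time. An adaptive ODE solver and learned networks would replace the Euler loop and the fixed matrices in any practical setting.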