US 12,147,915 B2
Systems and methods for modelling prediction errors in path-learning of an autonomous learning agent
Sounak Dey, Kolkata (IN); Sakyajit Bhattacharya, Kolkata (IN); Kaustab Pal, Kolkata (IN); and Arijit Mukherjee, Kolkata (IN)
Assigned to TATA CONSULTANCY SERVICES LIMITED, Mumbai (IN)
Filed by Tata Consultancy Services Limited, Mumbai (IN)
Filed on Aug. 21, 2019, as Appl. No. 16/547,380.
Claims priority of application No. 201821031249 (IN), filed on Aug. 21, 2018.
Prior Publication US 2020/0151599 A1, May 14, 2020
Int. Cl. G06N 7/01 (2023.01); G06N 3/049 (2023.01); G06N 3/08 (2023.01)

CPC G06N 7/01 (2023.01) [G06N 3/049 (2013.01); G06N 3/08 (2013.01)]

3 Claims

3. A computer program product comprising a non-transitory computer readable medium having a computer readable program embodied therein, wherein the computer readable program, when executed on a computing device, causes the computing device to:

capture, by one or more hardware processors, a plurality of sequential actions depicting a pattern of time series via a two-stage modelling technique, wherein each of the plurality of sequential actions corresponds to the autonomous learning agent, wherein the autonomous learning agent is a robot that operates on an owner's behalf without interference of the owner, wherein the robot learns a square path, a crisscross path, and a right-angle path, and wherein two-stage time series models of the two-stage modelling technique comprise:

(iii) time series model within each path-iteration of the path-learning of the autonomous learning agent represented as T₁; and

(iv) time series model across all path-iterations of the path-learning of the autonomous learning agent represented as T₂;

wherein the two-stage time series models of the two-stage modelling technique are represented as: $e_{ij} = C + \beta_1\left[\sum_{k=1}^{p_1}\phi^{1}_{k}\,e_{i-k,j} + \sum_{k=1}^{q_1}\Psi^{1}_{i}\,\varepsilon^{1}_{i-k,\cdot}\right] + \beta_2\left[\sum_{k=1}^{p_2}\phi^{2}_{k}\,e_{i,j-k} + \sum_{k=1}^{q_2}\Psi^{2}_{i}\,\varepsilon^{2}_{\cdot,j-k}\right] + \varepsilon_{ij}$, wherein ϕ¹ and ϕ² denote estimated autoregressive parameters of the time series models T₁ and T₂ respectively, Ψ¹ and Ψ² denote estimated moving-average parameters of T₁ and T₂ respectively, ε¹ and ε² are errors attached to T₁ and T₂ respectively, and ε is a Gaussian noise with mean 0 and variance 1;

and wherein, after a predefined number of path-iterations in the square path, the crisscross path, and the right-angle path, a value of a learning saturation point of the autonomous learning agent oscillates around a fixed value for a particular path, wherein the value of the learning saturation point corresponds to a value of predicted iteration for convergence so that even if the number of path-iterations is increased, the deviation converges around the fixed value for the particular path;

derive, based upon the plurality of sequential actions captured, one or more datasets comprising a plurality of predicted and actual actions of the autonomous learning agent by a Hierarchical Temporal Memory (HTM) modelling technique, wherein the plurality of sequential actions comprise a plurality of parallel or sequential elementary actions and each elementary action comprises a single or a plurality of parallel elementary operations characterized by a primitive senso-motoric operation for a degree of freedom (DOF) of the robot, wherein the plurality of sequential actions comprises turning left, turning right, and moving forward by the autonomous learning agent, and wherein the plurality of sequential actions is represented by a sequence for execution of tasks or sub-tasks and depicts the pattern of time series as learning and prediction of the autonomous learning agent evolve and become accurate;

extract, using each of the plurality of predicted and actual actions, a set of prediction error values by a Euclidean Distance technique, wherein each of the set of prediction error values comprises a deviation from one or more actual actions amongst the plurality of predicted and actual actions, wherein an actual action refers to an actual step taken by the autonomous learning agent, wherein each actual action amongst the plurality of predicted and actual actions results in a movement of the autonomous learning agent to a position on the path which is denoted by (x_act, y_act), and wherein a predicted action refers to an action of the autonomous learning agent predicted by the HTM modelling technique; and

model, using the two-stage modelling technique implemented to capture a plurality of learning modalities across the path-learning of the autonomous learning agent, and based upon the set of prediction error values, a plurality of prediction errors in the path-learning of the autonomous learning agent, wherein the two-stage modelling is performed across and within each path-iteration of the path-learning by implementing an Autoregressive moving average (ARMA) technique, wherein as the path-iterations increase, the predicted actions modelled by the two-stage modelling technique converge with the predicted actions derived by the HTM, thereby facilitating a reduction in errors and an increase in accuracy in prediction of errors in the path-learning of the autonomous learning agent, and wherein the two-stage modelling comprises:

(iii) extracting, from the set of prediction error values, a plurality of fitted error values corresponding to each of the plurality of predicted actions and actual actions by implementing an Autoregressive moving average (ARMA) technique on the set of prediction error values; and

(iv) estimating, by implementing a linear regression technique on the plurality of fitted error values, a probable deviation of the autonomous learning agent from each actual action amongst the plurality of predicted and actual actions; and

wherein the two-stage time series models are integrated via the linear regression technique for estimating the probable deviation of the autonomous learning agent and wherein the plurality of learning modalities comprise a learning from one or more preceding steps within a path and a learning from one or more preceding iterations across each path iteration by the autonomous learning agent,

wherein the learnt path is removed from the autonomous learning agent by the one or more hardware processors once the learnt path is tested as accurately learnt and each step of the learnt path is accurately predicted, so that the autonomous learning agent completely forgets navigations corresponding to that learnt path.
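
The error-extraction and two-stage modelling steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names (prediction_errors, ar_fitted, two_stage_fit) are hypothetical, and the predicted positions are taken as given rather than produced by an HTM model. For brevity, the sketch fits a pure autoregressive model by least squares and omits the moving-average terms, so it is a simplified stand-in for the ARMA technique named in the claim. It also assumes that in e_ij the index i runs over steps within a path-iteration and j over path-iterations.

```python
import numpy as np

def prediction_errors(predicted, actual):
    """Euclidean distance between predicted and actual (x, y) positions."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return np.linalg.norm(predicted - actual, axis=-1)

def ar_fitted(series, p):
    """Fitted values of a least-squares AR(p) model with intercept.

    Simplified stand-in for ARMA: moving-average terms are omitted.
    The first p entries have no full lag history and are left as NaN.
    """
    series = np.asarray(series, dtype=float)
    n = len(series)
    fitted = np.full(n, np.nan)
    if n <= p + 1:
        return fitted  # too few points to fit
    # Column k holds the lag-k values aligned with series[p:].
    lags = [series[p - k:n - k] for k in range(1, p + 1)]
    X = np.column_stack([np.ones(n - p)] + lags)
    coef, *_ = np.linalg.lstsq(X, series[p:], rcond=None)
    fitted[p:] = X @ coef
    return fitted

def two_stage_fit(E, p1=1, p2=1):
    """Combine within-iteration (T1) and across-iteration (T2) fits.

    E[i, j] is the prediction error at step i of iteration j (assumed layout).
    Returns the regression constant C, the weights b1 and b2, and the
    estimated deviation (NaN where a lag history is unavailable).
    """
    E = np.asarray(E, dtype=float)
    # Stage T1: one AR fit per iteration, along the step axis.
    F1 = np.column_stack([ar_fitted(E[:, j], p1) for j in range(E.shape[1])])
    # Stage T2: one AR fit per step, along the iteration axis.
    F2 = np.vstack([ar_fitted(E[i, :], p2) for i in range(E.shape[0])])
    mask = np.isfinite(F1) & np.isfinite(F2)
    # Linear regression of observed errors on the two sets of fitted values.
    X = np.column_stack([np.ones(mask.sum()), F1[mask], F2[mask]])
    (C, b1, b2), *_ = np.linalg.lstsq(X, E[mask], rcond=None)
    estimate = np.full(E.shape, np.nan)
    estimate[mask] = X @ np.array([C, b1, b2])
    return C, b1, b2, estimate
```

In this sketch the constant and the two regression weights play the roles of C, β₁ and β₂ in the claimed expression. A full ARMA fit would replace ar_fitted with a routine that also estimates the moving-average parameters.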