US 12,346,115 B2
Method and system for modelling and control of partially measurable systems
Diego Romeres, Boston, MA (US); Fabio Amadio, Padua (IT); Alberto Dalla Libera, Padua (IT); Riccardo Antonello, Padua (IT); Ruggero Carli, Padua (IT); and Daniel Nikovski, Brookline, MA (US)
Assigned to Mitsubishi Electric Research Laboratories, Inc., Cambridge, MA (US)
Filed by Mitsubishi Electric Research Laboratories, Inc., Cambridge, MA (US)
Filed on Dec. 4, 2020, as Appl. No. 17/112,531.
Prior Publication US 2022/0179419 A1, Jun. 9, 2022
Int. Cl. G05D 1/00 (2024.01); B60W 10/08 (2006.01); B60W 10/22 (2006.01); B60W 30/00 (2006.01); G05B 13/02 (2006.01)
CPC G05D 1/021 (2013.01) [B60W 10/22 (2013.01); B60W 30/00 (2013.01); G05B 13/0265 (2013.01); B60W 10/08 (2013.01)] 13 Claims
OG exemplary drawing
 
1. A controller for controlling motion of a vehicle system, comprising:
an interface operatively coupled to the vehicle system, wherein the interface is configured to acquire a motion state of the vehicle system and a measurement state via one or more motion sensors measuring the vehicle system;
a memory configured to store computer-executable program modules including a first model learning module and a policy learning module;
a processor configured to perform steps of the computer-executable program modules, the steps including:
offline-modeling to generate offline states based on the motion state of the vehicle system and the measurement state using a model learning program, wherein the model learning program is configured to consider at least one of a squared exponential (SE) kernel, a multiplicative polynomial (MP) kernel, or a semi-parametric (SP) kernel to model the evolution of the vehicle system,
wherein the SE kernel and the MP kernel use Gaussian Process Regression (GPR) with a Gaussian Process (GP) input for model learning, wherein the GP input includes the motion state and an input of the vehicle system measured by the one or more motion sensors,
wherein the first model learning module includes an offline state estimator and a second model learning module, wherein the offline state estimator estimates and provides the offline states to the second model learning module,
wherein the policy learning module generates Monte Carlo (MC) based particles, the MC based particles being obtained by sampling from a probability distribution using a Monte Carlo approach to estimate an expected cumulative cost from particle trajectories propagated through a learned model, and wherein the policy learning module includes a model of an online estimator configured to generate particle online estimates based on particle measurements and prior particle online estimates, the particle online estimates corresponding to an approximation of the particle measurements;
providing the offline states to the policy learning module to generate policy parameters used in the MC based particles; and
updating a policy of the vehicle system based on the generated policy parameters to operate the vehicle system.
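
As an illustration of the claim's offline model-learning step, the sketch below shows one plausible plain-NumPy realization of GPR over a GP input formed by the motion state and the control input, with the three kernel families named in the claim. The exact MP and SP forms (a product of identical inhomogeneous linear factors and a parametric feature term added to the SE term, respectively), the hyperparameter values, the placeholder feature map, and the toy data are assumptions for illustration, not details taken from the patent.

```python
# Minimal sketch, not the patented implementation: a plain-NumPy GPR model of the
# one-step evolution of the system, with the SE, MP, and SP kernels named in the
# claim. The exact MP/SP forms, hyperparameters, and the toy data are assumptions.
import numpy as np

def se_kernel(X1, X2, lengthscale=1.0, sf2=1.0):
    """Squared exponential kernel: sf2 * exp(-||x - x'||^2 / (2 * lengthscale^2))."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return sf2 * np.exp(-0.5 * d2 / lengthscale**2)

def mp_kernel(X1, X2, degree=3, bias=1.0):
    """Multiplicative polynomial kernel, taken here as a product of `degree`
    identical inhomogeneous linear factors (an assumption for illustration)."""
    return (X1 @ X2.T + bias) ** degree

def sp_kernel(X1, X2, phi=lambda X: X):
    """Semi-parametric kernel: a parametric feature term (placeholder feature map
    phi, e.g. physics-inspired basis functions) plus the nonparametric SE term."""
    return phi(X1) @ phi(X2).T + se_kernel(X1, X2)

class GPDynamicsModel:
    """GPR on the GP input z = (motion state, control input), predicting the
    next-state increment with a mean and a variance."""
    def __init__(self, kernel, noise=1e-2):
        self.kernel, self.noise = kernel, noise

    def fit(self, Z, y):
        self.Z = Z
        K = self.kernel(Z, Z) + self.noise * np.eye(len(Z))
        self.L = np.linalg.cholesky(K)
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, y))

    def predict(self, Zs):
        Ks = self.kernel(Zs, self.Z)
        mean = Ks @ self.alpha
        V = np.linalg.solve(self.L, Ks.T)
        var = np.diag(self.kernel(Zs, Zs)) - (V ** 2).sum(axis=0)
        return mean, np.maximum(var, 0.0)

# Toy usage: learn the state increment of a scalar system from (state, input) pairs.
rng = np.random.default_rng(0)
Z = rng.uniform(-1.0, 1.0, size=(50, 2))            # GP input: [motion state, control]
y = 0.1 * np.sin(3.0 * Z[:, 0]) + 0.05 * Z[:, 1]    # state increment to be learned
model = GPDynamicsModel(se_kernel)                  # or mp_kernel / sp_kernel
model.fit(Z, y)
mu, var = model.predict(Z[:5])
```

In practice the kernel hyperparameters and the noise variance would be fitted (for example by maximizing the GP marginal likelihood) rather than fixed as above.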
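The policy-learning step of the claim can likewise be sketched as Monte Carlo particle propagation through a learned one-step model: each particle's measurement is filtered by a model of an online estimator before being fed to a parametric policy, and the expected cumulative cost over the particle trajectories drives the update of the policy parameters. The stand-in dynamics, the exponential-filter estimator, the linear policy, and the finite-difference update below are illustrative assumptions, not the patent's method.

```python
# Minimal sketch, assuming details the claim does not state: Monte Carlo particles
# are sampled from a learned one-step model, an online-estimator model filters each
# particle's measurement, a parametric policy acts on the estimate, and the expected
# cumulative cost over particle trajectories drives the policy-parameter update.
import numpy as np

def learned_model_step(state, u):
    """Stand-in for the learned model's predictive distribution (mean, std) of the
    next state; a damped scalar system is used here purely for illustration."""
    return 0.9 * state + 0.1 * u, 0.02

def online_estimate(prev_estimate, measurement, gain=0.5):
    """Toy model of the online estimator: blends the prior particle online estimate
    with the new particle measurement (an exponential filter, assumed)."""
    return (1.0 - gain) * prev_estimate + gain * measurement

def policy(theta, estimate):
    """Linear feedback policy acting on the particle online estimate."""
    return theta[0] * estimate + theta[1]

def expected_cost(theta, seed=0, n_particles=100, horizon=30):
    """Monte Carlo estimate of the expected cumulative cost over particle
    trajectories propagated through the learned model."""
    rng = np.random.default_rng(seed)
    state = rng.normal(1.0, 0.1, size=n_particles)      # sampled initial particles
    estimate = state.copy()                             # initial particle online estimates
    total = 0.0
    for _ in range(horizon):
        u = policy(theta, estimate)
        mean, std = learned_model_step(state, u)
        state = rng.normal(mean, std)                   # propagate particles
        measurement = state + rng.normal(0.0, 0.05, size=n_particles)
        estimate = online_estimate(estimate, measurement)
        total += np.mean(state**2 + 0.01 * u**2)        # stage cost, averaged over particles
    return total

# Policy update: finite-difference gradient descent on the MC cost estimate, with a
# common random seed per iteration so the sampling noise cancels in the differences.
theta = np.zeros(2)
for it in range(50):
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        step = np.zeros_like(theta)
        step[i] = 1e-2
        grad[i] = (expected_cost(theta + step, seed=it)
                   - expected_cost(theta - step, seed=it)) / 2e-2
    theta -= 0.05 * grad
print("updated policy parameters:", theta)
```

A gradient-based implementation would more commonly differentiate the particle rollout directly (a reparameterization-style Monte Carlo gradient); the finite-difference form is used here only to keep the sketch dependency-free.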