| CPC G10L 15/18 (2013.01) [A63B 24/0006 (2013.01); A63B 24/0062 (2013.01); A63B 24/0075 (2013.01); A63B 24/0087 (2013.01); A63B 71/0622 (2013.01); G06N 20/00 (2019.01); G06V 40/23 (2022.01); G10L 15/22 (2013.01); A63B 2024/0009 (2013.01); A63B 2024/0015 (2013.01); A63B 2024/0071 (2013.01); A63B 2024/0096 (2013.01); A63B 2071/0627 (2013.01); A63B 2220/806 (2013.01); A63B 2225/12 (2013.01)] | 11 Claims |

1. A method for Artificial Intelligence (AI) assisted activity training, the method comprising:
presenting, by a rendering device, a plurality of activity categories to a user, wherein each of the plurality of activity categories comprises a plurality of activities, and wherein the plurality of activity categories are presented as multimedia content;
receiving a voice-based input and a secondary input from the user, wherein the voice-based input comprises an activity training plan comprising a selection of at least one activity from at least one of the plurality of activity categories and at least one activity attribute associated with each of the at least one activity, and wherein the voice-based input is in a source language and the secondary input comprises at least one of an air gesture, a biometric input, or an input via a game controller, a keyboard, a mouse, or any other input device;
processing, by a Natural Language Processing (NLP) model, the received voice-based input to extract the selection of at least one activity and the at least one activity attribute, wherein the NLP model is configured using a single language, and wherein the single language is an intermediate language;
initiating, contemporaneous to receiving the voice-based input and the secondary input, presentation of a multimedia content in conformance with the at least one activity and the at least one activity attribute, wherein the multimedia content comprises a plurality of guidance steps performed by a virtual assistant corresponding to the at least one activity;
detecting, via at least one camera, initiation of a user activity performance of the user in response to initiation of the multimedia content, wherein the user activity performance of the user at a given time comprises imitation of one of the at least one activity;
capturing, via the at least one camera, a video of the user activity performance of the user, wherein the at least one camera is placed at distributed locations;
overlaying, by a smart mirror configured with the at least one camera, a pose skeletal model corresponding to the user activity performance over a reflection of the user on the smart mirror, based on the user activity performance of the user, wherein the pose skeletal model comprises skeletal points overlaid on corresponding joints of the user;
processing, in near real time, by an AI model, the video to extract a set of user performance parameters of the user based on the user activity performance,
wherein the AI model further comprises a set of portions of AI models configured to perform distinct functionalities for enhanced data security; and
wherein the set of user performance parameters comprises speed of a current activity performance, number of repetitions completed, overall completion of an activity circuit, third-party smart device information, pulse rate of the user, blood pressure of the user and motion of the user;
generating, by the AI model, a near real time feedback during the user activity performance, based on the overlaid pose skeletal model and a differential between the set of user performance parameters and a target set of performance parameters, wherein the target set of performance parameters comprises speed of the target activity performance, target blood pressure of the user, target number of repetitions, target pulse rate of the user and target motion of the user; and
rendering, contemporaneous to the user activity performance, the near real time feedback to the user in at least one of an aural form, a visual form, or as haptic feedback, wherein the near real time feedback in the visual form is provided via the pose skeletal model.
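
The NLP step of claim 1 translates the source-language voice input into a single intermediate language before extracting the activity selection and its attributes. The claim does not specify an implementation; the following is a minimal Python sketch under assumed conventions, where `translate_to_intermediate` is a hypothetical stand-in for a machine-translation model and the attribute grammar is a toy regular expression.

```python
import re
from dataclasses import dataclass

@dataclass
class ActivitySelection:
    activity: str
    attributes: dict

def translate_to_intermediate(text: str, source_lang: str) -> str:
    # Hypothetical stand-in: a real system would invoke a translation model
    # mapping the source language to the single intermediate language.
    return text

def extract_selection(transcript: str, source_lang: str) -> ActivitySelection:
    """Extract the selected activity and its attributes from a transcript."""
    text = translate_to_intermediate(transcript, source_lang).lower()
    # Toy attribute grammar (an assumption): "<count> <unit> of <activity>".
    match = re.search(r"(\d+)\s*(reps?|minutes?|sets?)\s+of\s+([a-z ]+)", text)
    if match is None:
        raise ValueError("no activity selection recognized")
    count, unit, activity = match.groups()
    return ActivitySelection(activity=activity.strip(),
                             attributes={unit.rstrip("s"): int(count)})

print(extract_selection("20 reps of squats", "en"))
# ActivitySelection(activity='squats', attributes={'rep': 20})
```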
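The smart-mirror step overlays a pose skeletal model on the user's reflection by placing skeletal points on the corresponding joints. Below is a minimal sketch of such an overlay, assuming keypoints already produced by an off-the-shelf pose estimator in a hypothetical 17-point COCO-style ordering; the drawing calls use OpenCV.

```python
import cv2
import numpy as np

# Limb connections as joint-index pairs in an assumed 17-keypoint, COCO-style
# ordering (shoulders 5/6, elbows 7/8, wrists 9/10, hips 11/12, knees 13/14,
# ankles 15/16).
EDGES = [(5, 7), (7, 9), (6, 8), (8, 10), (5, 6),
         (5, 11), (6, 12), (11, 12), (11, 13), (13, 15), (12, 14), (14, 16)]

def overlay_skeleton(frame, keypoints, conf, thresh=0.5):
    """Draw skeletal points over joints and lines along limbs, in place.

    frame:     HxWx3 BGR display buffer aligned with the user's reflection.
    keypoints: (17, 2) array of pixel coordinates from a pose estimator.
    conf:      (17,) per-joint detection confidences.
    """
    pts = [(int(x), int(y)) for x, y in keypoints]
    for a, b in EDGES:
        if conf[a] > thresh and conf[b] > thresh:
            cv2.line(frame, pts[a], pts[b], (0, 255, 0), 2)
    for p, c in zip(pts, conf):
        if c > thresh:
            cv2.circle(frame, p, 4, (0, 0, 255), -1)
    return frame
```

On a smart mirror the frame would be composited onto the display behind the half-silvered surface; the same routine applies to a conventional screen.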
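Among the user performance parameters the AI model extracts are the number of repetitions completed and the speed of the current activity performance. One common way to derive both from pose data (not mandated by the claim) is hysteresis counting on a joint-angle time series; the sketch below assumes a per-frame knee or elbow angle in degrees and hypothetical `low`/`high` thresholds.

```python
import numpy as np

def count_repetitions(angle_series, fps, low=90.0, high=160.0):
    """Count reps from a joint-angle time series (degrees) via hysteresis.

    A repetition is one full excursion below `low` and back above `high`,
    e.g. the knee angle during a squat. Returns (reps, reps_per_minute).
    """
    reps, descended = 0, False
    for angle in angle_series:
        if angle < low:
            descended = True
        elif angle > high and descended:
            reps += 1
            descended = False
    duration_min = len(angle_series) / fps / 60.0
    speed = reps / duration_min if duration_min > 0 else 0.0
    return reps, speed

t = np.linspace(0, 30, 900)                     # 30 s of video at 30 fps
angles = 125 + 45 * np.cos(2 * np.pi * t / 3)   # one simulated squat per 3 s
print(count_repetitions(angles, fps=30))        # -> (10, 20.0)
```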
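Finally, the near real time feedback is generated from the differential between the measured and target parameter sets. A minimal, illustrative comparison loop is shown below; the parameter keys and the `tolerance` threshold are assumptions, and the resulting textual cues stand in for the aural, visual, or haptic feedback the claim recites.

```python
def generate_feedback(user: dict, target: dict, tolerance: float = 0.1) -> list:
    """Compare measured parameters against targets and emit coaching cues.

    Keys present in both dicts (e.g. 'speed', 'pulse_rate', 'repetitions')
    are compared; a cue is produced when the relative differential exceeds
    `tolerance`.
    """
    cues = []
    for key in user.keys() & target.keys():
        if target[key] == 0:
            continue  # no meaningful relative differential
        diff = (user[key] - target[key]) / target[key]
        if diff < -tolerance:
            cues.append(f"increase {key}: {user[key]} vs target {target[key]}")
        elif diff > tolerance:
            cues.append(f"reduce {key}: {user[key]} vs target {target[key]}")
    return cues

print(generate_feedback({"speed": 8.0, "pulse_rate": 150},
                        {"speed": 10.0, "pulse_rate": 140}))
# ['increase speed: 8.0 vs target 10.0']
```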