CPC H04N 23/66 (2023.01) [G06F 3/167 (2013.01); G06Q 20/123 (2013.01); G06Q 20/322 (2013.01); G06Q 20/3224 (2013.01); G06Q 20/40145 (2013.01); G06T 1/20 (2013.01); G06T 11/00 (2013.01); G06V 20/20 (2022.01); G06V 20/35 (2022.01); G11B 27/031 (2013.01); H04L 65/61 (2022.05); H04L 65/764 (2022.05); H04L 67/53 (2022.05); H04M 1/6041 (2013.01); H04M 1/72454 (2021.01); H04N 5/28 (2013.01); H04N 5/77 (2013.01); H04N 9/8205 (2013.01); H04N 23/661 (2023.01); G06N 5/022 (2013.01); G06N 20/00 (2019.01); H04M 2250/52 (2013.01); H04M 2250/74 (2013.01)]

5 Claims

1. A method comprising:
receiving, by one or more processors of a cloud computing platform, context data from a wearable multimedia device, the wearable multimedia device including at least one data capture device for capturing the context data, the context data including at least one of image data or depth data and speech;
converting the speech into speech text;
determining, by the one or more processors, two or more applications based on machine learning and the context data;
creating, by the one or more processors, a data processing pipeline with the two or more applications, wherein a first application in the data processing pipeline is configured to process at least one of the image data or depth data, and a second application in the data processing pipeline is configured to process the speech text separate from the image data or depth data;
processing, by the one or more processors, the context data through the data processing pipeline, the processing including using at least one of the image data or depth data and the speech text to label one or more objects captured in the image data or depth data; and
sending, by the one or more processors, output of the data processing pipeline to the wearable multimedia device or other device.
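For orientation only, the following is a minimal Python sketch of the pipeline recited in claim 1. Every name in it (ContextData, speech_to_text, select_applications, label_objects, parse_speech_text, run_pipeline) is a hypothetical stand-in chosen for illustration; the patent does not disclose this code, and a real system would replace the stubs with actual speech recognition, application-selection, and vision models.

# Hypothetical sketch of the claimed data processing pipeline. All names
# and stubs here are illustrative assumptions, not code from the patent
# or from any real library.
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class ContextData:
    image: Optional[bytes] = None      # image data from the capture device
    depth: Optional[bytes] = None      # depth data from the capture device
    speech: Optional[bytes] = None     # raw speech audio
    speech_text: str = ""              # filled in by speech-to-text
    labels: List[str] = field(default_factory=list)

def speech_to_text(audio: bytes) -> str:
    """Stub: convert speech into speech text (a real system would call an ASR model)."""
    return "label the red mug"

def label_objects(ctx: ContextData) -> ContextData:
    """First application: label objects captured in the image/depth data,
    using the speech text as a hint (stubbed here)."""
    if "mug" in ctx.speech_text:
        ctx.labels.append("mug")
    return ctx

def parse_speech_text(ctx: ContextData) -> ContextData:
    """Second application: process the speech text separately from the image/depth data."""
    ctx.speech_text = ctx.speech_text.strip().lower()
    return ctx

def select_applications(ctx: ContextData) -> List[Callable[[ContextData], ContextData]]:
    """Stub for the machine-learning step that determines two or more applications.
    A real implementation would score candidate applications against the context data."""
    apps: List[Callable[[ContextData], ContextData]] = []
    if ctx.image is not None or ctx.depth is not None:
        apps.append(label_objects)      # first app: image/depth processing
    if ctx.speech_text:
        apps.append(parse_speech_text)  # second app: speech-text processing
    return apps

def run_pipeline(ctx: ContextData) -> ContextData:
    """Create and run the data processing pipeline; the result would then be
    sent back to the wearable multimedia device or another device."""
    ctx.speech_text = speech_to_text(ctx.speech) if ctx.speech else ""
    for app in select_applications(ctx):   # pipeline built from the selected apps
        ctx = app(ctx)
    return ctx

if __name__ == "__main__":
    result = run_pipeline(ContextData(image=b"...", speech=b"..."))
    print(result.labels, result.speech_text)

The sketch models the pipeline as an ordered list of callables so that the application-selection step and the per-application processing steps stay independent, mirroring how the claim separates determining the applications from creating and running the pipeline.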