US 12,443,272 B2
Proactive actions based on audio and body movement
Brian W. Temple, Santa Clara, CA (US); Devin W. Chalmers, Oakland, CA (US); and Thomas G. Salter, Foster City, CA (US)
Assigned to Apple Inc., Cupertino, CA (US)
Filed by Apple Inc., Cupertino, CA (US)
Filed on Mar. 8, 2022, as Appl. No. 17/689,460.
Claims priority of provisional application 63/159,503, filed on Mar. 11, 2021.
Prior Publication US 2022/0291743 A1, Sep. 15, 2022
Int. Cl. G06F 3/01 (2006.01); G06F 3/0481 (2022.01); G06F 3/16 (2006.01); G06V 40/16 (2022.01); G06V 40/20 (2022.01); G10L 25/51 (2013.01); G10L 25/78 (2013.01)

CPC G06F 3/012 (2013.01) [G06F 3/013 (2013.01); G06F 3/0481 (2013.01); G06F 3/16 (2013.01); G06V 40/174 (2022.01); G06V 40/20 (2022.01); G10L 25/51 (2013.01); G10L 25/78 (2013.01); G10H 2210/076 (2013.01)]

18 Claims

1. A method comprising:

at an electronic device having a processor:

obtaining first sensor data and second sensor data corresponding to a physical environment, the first sensor data corresponding to audio in the physical environment and the second sensor data corresponding to a body movement in the physical environment;

selectively switching to different power states on the electronic device based on multiple triggers identified at the electronic device, where selectively switching through the different power states comprises:

detecting, based on the second sensor data, that the body movement corresponds to a type of body movement indicative of user interest in music;

based on detecting that the body movement corresponds to the type of body movement indicative of user interest in music, selectively triggering performance of audio analysis to determine if music is playing based on the first sensor data;

determining that music is playing based on the audio analysis;

based on determining that music is playing, selectively triggering performance of a comparison of audio elements with one or more aspects of the body movement, based on the first sensor data and the second sensor data;

identifying a time-based relationship between one or more elements of the audio and one or more aspects of the body movement via at least the comparison;

based on identifying the time-based relationship, selectively triggering performance of a source attribute identification process; and

identifying an interest in content of the audio based at least on the source attribute identification process;

determining to wait to provide one or more features based on a user state corresponding to the user being busy; and

providing one or more features based on identifying the interest in the content, wherein a timing of providing the one or more features comprises waiting based on the user state corresponding to the user being busy.
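
The staged triggering recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It only shows the ordering in which each more expensive stage runs if and only if the preceding, cheaper stage succeeds. Every name here (`Frame`, `synchronized`, `run_cascade`), the 0.1 s tolerance, the 0.8 match fraction, and the rule that an identified source attribute implies interest are assumptions made for illustration. The sensor outputs are taken as pre-extracted features rather than computed from raw audio or motion data.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Frame:
    """One window of hypothetical, pre-extracted sensor features."""
    movement_is_rhythmic: bool        # derived from second sensor data (body movement)
    music_detected: bool              # what the audio analysis would report
    beat_times: List[float]           # audio beat onsets, in seconds
    movement_times: List[float]       # movement peaks (e.g. foot taps), in seconds
    source_attribute: Optional[str]   # e.g. "live band"; None if not identified
    user_busy: bool                   # user state used to time the feature


def synchronized(beats: List[float], moves: List[float],
                 tolerance: float = 0.1, min_fraction: float = 0.8) -> bool:
    """Assumed test for a time-based relationship: most movement peaks
    fall within `tolerance` seconds of some audio beat."""
    if not beats or not moves:
        return False
    hits = sum(1 for m in moves if min(abs(m - b) for b in beats) <= tolerance)
    return hits / len(moves) >= min_fraction


def run_cascade(frame: Frame) -> Tuple[List[str], bool, str]:
    """Run the stages in order, stopping at the first one that fails.

    Returns (stages_run, interest_identified, action), where action is
    "none", "defer" (user busy, so wait), or "provide".
    """
    stages = ["movement_detection"]
    if not frame.movement_is_rhythmic:
        return stages, False, "none"

    stages.append("audio_analysis")            # triggered by the movement type
    if not frame.music_detected:
        return stages, False, "none"

    stages.append("comparison")                # triggered by music playing
    if not synchronized(frame.beat_times, frame.movement_times):
        return stages, False, "none"

    stages.append("source_attribute_identification")  # triggered by the relationship
    if frame.source_attribute is None:         # assumption: no attribute, no interest
        return stages, False, "none"

    # Interest identified; timing depends on whether the user is busy.
    return stages, True, ("defer" if frame.user_busy else "provide")


if __name__ == "__main__":
    frame = Frame(True, True, [0.0, 0.5, 1.0, 1.5],
                  [0.02, 0.52, 0.98, 1.55], "live band", user_busy=True)
    print(run_cascade(frame))
```

In this sketch a frame with no rhythmic movement never reaches the audio analysis, which mirrors the claim's selective switching between power states: later stages are assumed to cost more, so they are entered only on demand.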