| CPC G06F 11/184 (2013.01) [G06F 11/2033 (2013.01)] | 20 Claims |

|
1. A processing system, comprising:
three or more central processing unit (CPU)-graphical processing unit (GPU) pairs, the CPU of each of the CPU-GPU pairs configured to run kernels for programs executing on a corresponding GPU of the CPU-GPU pair;
a backup CPU; and
a lockstep controller connected to the three or more CPU-GPU pairs and to the backup CPU, the lockstep controller configured to:
operate the three or more CPU-GPU pairs in parallel to execute programs in a lockstep manner, the CPU of each of the CPU-GPU pairs running kernels for the programs in parallel;
compare an output from each CPU of the three or more CPU-GPU pairs for each of one or more kernels running on the CPUs of the three or more CPU-GPU pairs;
based upon comparing the outputs, determine whether any of the CPU-GPU pairs are defective; and
in response to determining that a first of the CPU-GPU pairs is defective:
discontinue the operation of the first CPU-GPU pair in parallel to execute programs in a lockstep manner with others of the three or more CPU-GPU pairs; and
operate the others of the three or more CPU-GPU pairs and the backup CPU and the GPU of a second of the CPU-GPU pairs in parallel to execute programs in a lockstep manner, the CPU of the second CPU-GPU pair operating as a CPU-GPU pair with the GPU of the second CPU-GPU pair and the backup CPU operating as a CPU-GPU pair with the GPU of the second CPU-GPU pair, the backup CPU and the CPUs of the others of the CPU-GPU pairs running kernels for the programs in parallel.
|
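The controller behavior recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation and makes several assumptions that the claim does not state: each CPU is modeled as a plain function from a kernel input to an output, a defective pair is identified by majority vote over the outputs, and the "second" pair whose GPU the backup CPU shares is simply the first surviving pair. The names `Pair`, `LockstepController`, `step`, and `_fail_over` are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Assumption: a CPU is modeled as a function from kernel input to kernel output.
KernelFn = Callable[[int], int]


@dataclass(eq=False)  # compare pairs by identity, not by field values
class Pair:
    cpu_name: str
    gpu_name: str
    run_kernel: KernelFn


class LockstepController:
    def __init__(self, pairs: List[Pair], backup_name: str, backup_run: KernelFn):
        if len(pairs) < 3:
            raise ValueError("claim 1 recites three or more CPU-GPU pairs")
        self.active: List[Pair] = list(pairs)
        self.retired: List[Pair] = []
        # The backup CPU has no GPU of its own until it is paired on failover.
        self.backup: Optional[Tuple[str, KernelFn]] = (backup_name, backup_run)

    def step(self, kernel_input: int) -> int:
        """Run one kernel on every active pair and compare the outputs."""
        outputs = [p.run_kernel(kernel_input) for p in self.active]
        # Assumption: a strict majority of matching outputs is taken as correct.
        majority, votes = Counter(outputs).most_common(1)[0]
        if votes <= len(outputs) // 2:
            raise RuntimeError("no majority output; cannot identify a defective pair")
        defective = [p for p, out in zip(self.active, outputs) if out != majority]
        for pair in defective:
            self._fail_over(pair)
        return majority

    def _fail_over(self, bad: Pair) -> None:
        """Drop the defective pair and pair the backup CPU with a surviving GPU."""
        self.active.remove(bad)
        self.retired.append(bad)
        if self.backup is None:
            return  # backup already in use; continue with the remaining pairs
        name, run = self.backup
        second = self.active[0]  # assumption: any surviving pair may share its GPU
        # The second pair's own CPU keeps its GPU; the backup CPU shares that GPU.
        self.active.append(Pair(name, second.gpu_name, run))
        self.backup = None
```

In this sketch, constructing the controller with three pairs and calling `step` once per kernel keeps the number of lockstep participants constant after a single failure, because the retired pair is replaced by the backup CPU sharing a surviving GPU.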