CPC G06N 3/08 (2013.01) [G06F 3/04847 (2013.01)] | 20 Claims |
1. A method of compressing a neural network model that is performed by a computing device, comprising:
receiving, at a processor of the computing device, a trained model and compression method instructions for compressing the trained model;
identifying, via the processor, a compressible block and a non-compressible block among a plurality of blocks included in the trained model based on the compression method instructions;
transmitting, via a computer network, a command to a user device that causes the user device to:
display a structure of the trained model representing a connection relationship between the plurality of blocks on a first screen such that the compressible block and the non-compressible block are visually distinguished, and
display, on a second screen, an input field operable to receive a parameter value entered by a user for compression of the compressible block; and
compressing the trained model based on the parameter value entered by the user in the input field.
|
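The steps recited in claim 1 can be illustrated with a minimal Python sketch. Nothing in it comes from the patent itself: the `Block` structure, the `COMPRESSIBLE_TYPES` table, the layout of the display command, and the choice of magnitude pruning with a per-block `ratio` parameter are all assumptions made for illustration. The claim leaves open how compressibility is decided and which compression method is applied.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from a compression method to the block kinds it can
# act on. The claim does not say how compressibility is determined.
COMPRESSIBLE_TYPES = {"pruning": {"conv", "linear"}}


@dataclass
class Block:
    name: str
    kind: str
    weights: list = field(default_factory=list)
    inputs: list = field(default_factory=list)  # names of upstream blocks


def identify_blocks(blocks, method):
    """Split block names into (compressible, non_compressible) lists."""
    allowed = COMPRESSIBLE_TYPES[method]
    compressible = [b.name for b in blocks if b.kind in allowed]
    fixed = [b.name for b in blocks if b.kind not in allowed]
    return compressible, fixed


def build_display_command(blocks, method):
    """Build the payload that would be sent to the user device (assumed layout)."""
    compressible, _ = identify_blocks(blocks, method)
    return {
        # First screen: model structure, with each node flagged so the UI
        # can visually distinguish compressible from non-compressible blocks.
        "first_screen": {
            "nodes": [
                {"name": b.name, "compressible": b.name in compressible}
                for b in blocks
            ],
            "edges": [(src, b.name) for b in blocks for src in b.inputs],
        },
        # Second screen: one input field per compressible block.
        "second_screen": {
            "input_fields": [
                {"block": name, "parameter": "ratio"} for name in compressible
            ],
        },
    }


def compress(blocks, method, user_values):
    """Magnitude pruning (an assumed method): zero the smallest-|w| fraction
    of weights in each compressible block for which the user gave a ratio."""
    compressible, _ = identify_blocks(blocks, method)
    for b in blocks:
        if b.name in compressible and b.name in user_values:
            n_prune = int(len(b.weights) * user_values[b.name])
            order = sorted(range(len(b.weights)), key=lambda i: abs(b.weights[i]))
            for i in order[:n_prune]:
                b.weights[i] = 0.0
    return blocks
```

In this sketch, `user_values` stands in for the parameter values entered in the input fields on the second screen, keyed by block name. Blocks the user leaves blank, and all non-compressible blocks, pass through unchanged.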