Is Multiplying Loss by 0.0 the Same as Not Having the Loss?
Multiplying the loss by 0.0 produces zero gradients, but the model can still "change". First, BatchNorm layers update their running statistics during every forward pass while the model is in model.train() mode, regardless of the loss value. Second, an optimizer that keeps internal running statistics (e.g. Adam's moving averages of past gradients) can still update parameters even when the current gradient is zero, provided those statistics were populated by earlier nonzero gradients.
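A minimal sketch illustrating both effects, assuming PyTorch (the model, layer sizes, and learning rate below are arbitrary placeholders):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(nn.Linear(4, 4), nn.BatchNorm1d(4))  # train() mode by default
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)

x = torch.randn(8, 4)

# Step 1: a normal update so Adam accumulates running gradient statistics.
loss = model(x).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()

running_mean_before = model[1].running_mean.clone()
weight_before = model[0].weight.clone()

# Step 2: the loss is multiplied by 0.0, so all gradients are exactly zero ...
loss = model(x).pow(2).mean() * 0.0
loss.backward()
optimizer.step()

# ... yet the BatchNorm running stats changed during the forward pass,
print(torch.allclose(running_mean_before, model[1].running_mean))  # False
# ... and Adam's momentum from step 1 still moved the weights.
print(torch.allclose(weight_before, model[0].weight))  # False
```

If the optimizer had no prior state (e.g. plain SGD without momentum on its very first step) and the model were in eval() mode, the zero-loss step would leave the parameters and buffers unchanged.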