Friday, July 6, 2018

A tricky error in multi-GPU training

A required step when training a model on multiple GPUs is averaging the gradients computed by the different GPUs. A typical error occurs when gradients are only partially available, that is, when some variables receive no gradient (their gradient is None), or when stop_gradient is used in the graph. The error message looks like this:

ValueError: Tried to convert 'input' to a tensor and failed. Error: None values not supported.

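If this happens, try explicitly setting trainable=False on the variables that receive no gradient, so that they are left out of the gradient computation and never reach the averaging step.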
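
To make the failure mode concrete, here is a minimal, framework-free sketch of the averaging step. It is not the code from any particular training script: plain Python lists stand in for tensors, and the function name average_gradients, the variable names "w" and "frozen", and the (gradient, variable name) pair layout are all assumptions made for illustration. It shows a different workaround from the one above: instead of changing the variables, it skips None gradients during averaging.

```python
def average_gradients(tower_grads):
    """Average per-variable gradients across towers (one tower per GPU).

    tower_grads: list over towers; each entry is a list of (grad, var_name)
    pairs in the same variable order. grad is a list of floats, or None when
    the variable received no gradient on that tower. (Hypothetical layout.)
    """
    averaged = []
    # zip(*tower_grads) groups the pairs that belong to the same variable.
    for pairs in zip(*tower_grads):
        var_name = pairs[0][1]
        grads = [g for g, _ in pairs if g is not None]
        if not grads:
            # No tower produced a gradient for this variable (for example it
            # sits behind stop_gradient), so there is nothing to average.
            continue
        # Element-wise mean over the towers that did produce a gradient.
        mean = [sum(vals) / len(grads) for vals in zip(*grads)]
        averaged.append((mean, var_name))
    return averaged


tower_grads = [
    [([1.0, 2.0], "w"), (None, "frozen")],  # GPU 0
    [([3.0, 4.0], "w"), (None, "frozen")],  # GPU 1
]
result = average_gradients(tower_grads)
```

Without the None check, the averaging code would try to turn None into a tensor, which is what the error message above complains about. Note that when only some towers have a gradient for a variable, this sketch averages over those towers only; whether that is the right behavior depends on the model.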
