1 Error description
1.1 System environment
Hardware Environment(Ascend/GPU/CPU): Ascend
Software Environment:
– MindSpore version (source or binary): 1.8.0
– Python version (e.g., Python 3.7.5): 3.7.6
– OS platform and distribution (e.g., Linux Ubuntu 16.04): Ubuntu 4.15.0-74-generic
– GCC/Compiler version (if compiled from source):
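1.2 Basic information
1.2.1 Script
The training script builds a simple single-operator network that applies Add to its two input tensors and then calls TensorSummary. The script is shown below (import statements omitted):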
01 class SummaryNet(nn.Cell):
02     def __init__(self):
03         super(SummaryNet, self).__init__()
04         self.summary = ops.TensorSummary()
05         self.add = ops.Add()
06
07     def construct(self, x, y):
08         x = self.add(x, y)
09         name = "x"
10         self.summary(name, x.sum())  # x.sum() is a rank-0 scalar
11         return x
12
13 x = Tensor(np.array([1, 2, 3]).astype(np.float32))
14 y = Tensor(np.array([4, 5, 6]).astype(np.float32))
15 summary_net = SummaryNet()(x, y)
16 print("out: ", summary_net)
1.2.2 Error message
The error message is as follows:
Traceback (most recent call last):
  File "C:/Users/l30026544/PycharmProjects/q2_map/new/173735.py", line 22, in <module>
    summary_net = SummaryNet()(x, y)
  File "C:\Users\l30026544\PycharmProjects\q2_map\lib\site-packages\mindspore\nn\cell.py", line 586, in call
    out = self.compile_and_run(*args)
  File "C:\Users\l30026544\PycharmProjects\q2_map\lib\site-packages\mindspore\nn\cell.py", line 964, in compile_and_run
    self.compile(*inputs)
  File "C:\Users\l30026544\PycharmProjects\q2_map\lib\site-packages\mindspore\nn\cell.py", line 937, in compile
    _cell_graph_executor.compile(self, *inputs, phase=self.phase, auto_parallel_mode=self._auto_parallel_mode)
  File "C:\Users\l30026544\PycharmProjects\q2_map\lib\site-packages\mindspore\common\api.py", line 1006, in compile
    result = self._graph_executor.compile(obj, args_list, phase, self._use_vm_mode())
ValueError: mindspore\core\utils\check_convert_utils.cc:397 CheckInteger] For primitive[TensorSummary], the v rank must be greater than or equal to 1, but got 0.
WARNING: Logging before InitGoogleLogging() is written to STDERR
[CRITICAL] CORE(6472,1,?):2022-6-17 15:47:53 [mindspore\core\utils\check_convert_utils.cc:397] CheckInteger] For primitive[TensorSummary], the v rank must be greater than or equal to 1, but got 0.
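Cause analysis
Looking at the error message, the ValueError reads: For primitive[TensorSummary], the v rank must be greater than or equal to 1, but got 0.
This means that the rank of the argument v passed to TensorSummary must be at least 1, but a rank of 0 was received. We therefore need to check whether the v passed to TensorSummary has a valid rank. The Add on line 8 of the script still returns a rank-1 tensor, but line 10 passes x.sum() to the summary operator. Summing over all elements yields a scalar (rank 0), which triggers the error. The official documentation restricts the input of TensorSummary: the input Tensor must have a rank greater than or equal to 1. To collect scalar data, use the ScalarSummary operator instead.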
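As a rough illustration of the rank difference, the sketch below uses NumPy arrays in place of MindSpore tensors (an assumption made so that it runs without MindSpore installed). The helper pick_summary_op is hypothetical and not part of any library; it only names the operator that matches a given rank.

```python
import numpy as np

# Same values as in the script; element-wise addition keeps the shape (3,).
x = np.array([1, 2, 3], dtype=np.float32)
y = np.array([4, 5, 6], dtype=np.float32)
added = x + y

# Summing over all elements collapses every axis and leaves a rank-0 value.
total = added.sum()


def pick_summary_op(rank: int) -> str:
    """Hypothetical helper: name the summary operator that fits a given rank."""
    return "ScalarSummary" if rank == 0 else "TensorSummary"


print(np.ndim(added), pick_summary_op(np.ndim(added)))  # 1 TensorSummary
print(np.ndim(total), pick_summary_op(np.ndim(total)))  # 0 ScalarSummary
```

The second line of output corresponds to the failing call: the value handed to the summary operator has rank 0, so the scalar variant is the one that fits.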
2 Solution
Based on the cause identified above, the fix is straightforward: replace TensorSummary with ScalarSummary.
01 class SummaryNet(nn.Cell):
02     def __init__(self):
03         super(SummaryNet, self).__init__()
04         self.summary = ops.ScalarSummary()
05         self.add = ops.Add()
06
07     def construct(self, x, y):
08         x = self.add(x, y)
09         name = "x"
10         self.summary(name, x.sum())  # a rank-0 scalar is valid for ScalarSummary
11         return x
12
13 x = Tensor(np.array([1, 2, 3]).astype(np.float32))
14 y = Tensor(np.array([4, 5, 6]).astype(np.float32))
15 summary_net = SummaryNet()(x, y)
16 print("out: ", summary_net)
The script now runs successfully and prints the following output:
out: [5. 7. 9.]
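3 Summary
Steps for locating the problem:
1. Find the line of user code that raised the error: summary_net = SummaryNet()(x, y);
2. Use the keywords in the error log to narrow the scope of the analysis: For primitive[TensorSummary], the v rank must be greater than or equal to 1, but got 0.;
3. Pay particular attention to whether variables are defined and initialized correctly.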
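Step 2 can be partly automated. The sketch below is a hypothetical helper, not part of MindSpore, that pulls the operator name and the violated constraint out of a "For primitive[...]" log line. The regular expression is an assumption based only on the single message shown in this post and may not match other MindSpore errors.

```python
import re

# Assumed pattern, derived from the one error line quoted in this post.
PRIMITIVE_ERROR = re.compile(r"For primitive\[(?P<op>\w+)\],\s*(?P<detail>.+)")


def parse_primitive_error(line: str):
    """Return (operator name, constraint text), or None if the line does not match."""
    match = PRIMITIVE_ERROR.search(line)
    if match is None:
        return None
    return match.group("op"), match.group("detail").strip()


log_line = (
    "ValueError: mindspore\\core\\utils\\check_convert_utils.cc:397 CheckInteger] "
    "For primitive[TensorSummary], the v rank must be greater than or equal to 1, "
    "but got 0."
)
result = parse_primitive_error(log_line)
print(result)
```

Here result holds the operator name TensorSummary together with the constraint text, which points directly at the operator whose inputs need checking.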
4 References
4.1 TensorSummary operator API