On Machine Learning: MindSpore "x.shape and y.shape are supposed to broadcast"


1 Error Description
1.1 System Environment
Environment (Ascend/GPU/CPU): GPU - GTX 3090 (24 GB)
Software Environment:
– MindSpore version (source or binary): 1.7.0
– Python version (e.g., Python 3.7.5): 3.8.13
– OS platform and distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
– CUDA version: 11.0

1.2 Basic Information
1.2.1 Script
This code is part of a ConvLSTM migration from PyTorch to MindSpore; the failing part is shown below.

split = ops.Split(1, 2)
output = split(x)
1.2.2 Error Message
Some personal information in the log has been masked.

Traceback (most recent call last):
  File "main.py", line 195, in <module>
    train()
  File "main.py", line 142, in train
    loss = train_network(data, label)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 612, in __call__
    raise err
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 609, in __call__
    output = self._run_construct(cast_inputs, kwargs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 429, in _run_construct
    output = self.construct(*cast_inputs, **kwargs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/wrap/cell_wrapper.py", line 373, in construct
    loss = self.network(*inputs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 612, in __call__
    raise err
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 609, in __call__
    output = self._run_construct(cast_inputs, kwargs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 429, in _run_construct
    output = self.construct(*cast_inputs, **kwargs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/wrap/cell_wrapper.py", line 111, in construct
    out = self._backbone(data)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 612, in __call__
    raise err
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 609, in __call__
    output = self._run_construct(cast_inputs, kwargs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 429, in _run_construct
    output = self.construct(*cast_inputs, **kwargs)
  File "/home/xxxlab/zrj/mindspore/ConvLSTM-PyTorch/conv/model.py", line 31, in construct
    state = self.encoder(input)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 612, in __call__
    raise err
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 609, in __call__
    output = self._run_construct(cast_inputs, kwargs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 429, in _run_construct
    output = self.construct(*cast_inputs, **kwargs)
  File "/home/xxxlab/zrj/mindspore/ConvLSTM-PyTorch/conv/encoder.py", line 42, in construct
    inputs, state_stage = self.forward_by_stage(
  File "/home/xxxlab/zrj/mindspore/ConvLSTM-PyTorch/conv/encoder.py", line 35, in forward_by_stage
    outputs_stage, state_stage = rnn(inputs, None)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 612, in __call__
    raise err
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 609, in __call__
    output = self._run_construct(cast_inputs, kwargs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 429, in _run_construct
    output = self.construct(*cast_inputs, **kwargs)
  File "/home/xxxlab/zrj/mindspore/ConvLSTM-PyTorch/conv/ConvRNN.py", line 61, in construct
    combined_2 = P.Concat(1)((x, r * htprev))  # h' = tanh(W*(x+r*H_t-1))
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/common/tensor.py", line 278, in __mul__
    return tensor_operator_registry.get('__mul__')(self, other)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/ops/composite/multitype_ops/_compile_utils.py", line 101, in _tensor_mul
    return F.mul(self, other)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/ops/primitive.py", line 294, in __call__
    return _run_op(self, self.name, args)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/common/api.py", line 90, in wrapper
    results = fn(*arg, **kwargs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/ops/primitive.py", line 754, in _run_op
    output = real_run_op(obj, op_name, args)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/ops/primitive.py", line 575, in __infer__
    out[track] = fn(*(x[track] for x in args))
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/ops/operations/math_ops.py", line 78, in infer_shape
    return get_broadcast_shape(x_shape, y_shape, self.name)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/ops/_utils/utils.py", line 70, in get_broadcast_shape
    raise ValueError(f"For'{prim_name}', {arg_name1}.shape and {arg_name2}.shape are supposed"

ValueError: For 'Mul', x.shape and y.shape are supposed to broadcast, where broadcast means that x.shape[i] = 1 or -1 or y.shape[i] = 1 or -1 or x.shape[i] = y.shape[i], but now x.shape and y.shape can not broadcast, got i: -3, x.shape: [16, 2, 64, 64], y.shape: [16, 64, 64, 64].
2 Cause Analysis and Solution
At the time I was genuinely puzzled: why weren't the dimensions coming out of split what I expected? I even suspected the input dimensions were wrong and started debugging from the input, only to find that everything upstream was fine — the problem was the split itself.
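The shapes in the error message are indeed incompatible under standard broadcasting rules: comparing trailing dimensions, [16, 2, 64, 64] and [16, 64, 64, 64] disagree at axis 1 (i = -3 in the message), and neither side is 1. A quick NumPy check (standing in for MindSpore's Mul, which follows the same rules) reproduces the failure:

```python
import numpy as np

x = np.ones((16, 2, 64, 64))   # the wrongly split gate tensor
y = np.ones((16, 64, 64, 64))  # htprev, as reported in the traceback

# Broadcasting compares dims from the right: 64==64, 64==64,
# then 2 vs 64 at axis 1 -- neither is 1, so multiply must fail.
try:
    _ = x * y
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

print(broadcast_failed)  # True
```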

At first I did the operator mapping via the PyTorch-to-MindSpore operator comparison table. There, torch.split maps to mindspore.ops.Split with no extra notes, so I naturally assumed their parameters were identical. They are not: consulting the PyTorch and MindSpore documentation shows that, besides the tensor and dim, torch.split's parameter is

split_size_or_sections (int) or (list(int)) – size of a single chunk or list of sizes for each chunk
whereas in MindSpore, besides the tensor and axis, the parameter is

output_num (int) – the number of output tensors. Must be a positive integer. Default: 1.

Comparatively, MindSpore's parameter is easier to use and reason about, while with PyTorch you have to compute the chunk size yourself. So when migrating, you cannot simply copy parameters across; you must check whether they actually correspond.
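The difference is easy to demonstrate with NumPy, whose np.split takes a section count like MindSpore's output_num (the shape (16, 64, 64, 64) here is just the one from the traceback, used for illustration):

```python
import numpy as np

x = np.zeros((16, 64, 64, 64))

# torch.split(x, 2, dim=1): the '2' is the SIZE of each chunk,
# so a 64-channel axis yields 32 chunks of shape (16, 2, 64, 64).
torch_style = np.split(x, x.shape[1] // 2, axis=1)
print(len(torch_style), torch_style[0].shape)  # 32 (16, 2, 64, 64)

# mindspore.ops.Split(1, 2): the '2' is the NUMBER of chunks,
# yielding 2 chunks of shape (16, 32, 64, 64).
ms_style = np.split(x, 2, axis=1)
print(len(ms_style), ms_style[0].shape)  # 2 (16, 32, 64, 64)
```

Copying the PyTorch chunk size straight into output_num therefore produces a different (and here, non-broadcastable) shape.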

3 Summary
When migrating, consult the PyTorch and MindSpore API documentation frequently. Besides relying on MindConverter for automatic mapping, also watch out for operators whose mapping is not supported.
