1 Error description
1.1 System environment
Environment(Ascend/GPU/CPU): GPU-GTX3090(24G)
Software Environment:
– MindSpore version (source or binary): 1.7.0
– Python version (e.g., Python 3.7.5): 3.8.13
– OS platform and distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
– CUDA version : 11.0
1.2 Basic information
1.2.1 Script
This code is part of a port of ConvLSTM from PyTorch to MindSpore. The line below is where the error is raised:
loss = train_network(data, label)
1.2.2 Error message
Some personal information has been masked.
[WARNING] ME(124028:139969934345984,MainProcess):2022-07-23-20:21:12.940.089 [mindspore/run_check/_check_version.py:140] MindSpore version 1.7.0 and cuda version 11.0.221 does not match, please refer to the installation guide for version matching information: https://www.mindspore.cn/install
[CRITICAL] ANALYZER(124028,7f4d4a374700,python):2022-07-23-20:21:21.559.937 [mindspore/ccsrc/frontend/operator/composite/multitype_funcgraph.cc:160] GenerateFromTypes] The 'sub' operation does not support the type [kMetaTypeNone, Tensor[Float32]].
The supported types of overload function sub is: [Tensor, List], [Tensor, Tuple], [List, Tensor], [Tuple, Tensor], [Tensor, Number], [Number, Tensor], [Tensor, Tensor], [Number, Number].
Traceback (most recent call last):
  File "main.py", line 194, in <module>
    train()
  File "main.py", line 142, in train
    loss = train_network(data, label)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 586, in call
    out = self.compile_and_run(*args)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 964, in compile_and_run
    self.compile(*inputs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 937, in compile
    _cell_graph_executor.compile(self, *inputs, phase=self.phase, auto_parallel_mode=self._auto_parallel_mode)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/common/api.py", line 1006, in compile
    result = self._graph_executor.compile(obj, args_list, phase, self._use_vm_mode())
RuntimeError: mindspore/ccsrc/frontend/operator/composite/multitype_funcgraph.cc:160 GenerateFromTypes] The 'sub' operation does not support the type [kMetaTypeNone, Tensor[Float32]].
The supported types of overload function sub is: [Tensor, List], [Tensor, Tuple], [List, Tensor], [Tuple, Tensor], [Tensor, Number], [Number, Tensor], [Tensor, Tensor], [Number, Number].
The function call stack (See file '/home/xxxlab/zrj/mindspore/ConvLSTM-PyTorch/conv/rank_0/om/analyze_fail.dat' for more details):
0 In file /home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/wrap/cell_wrapper.py(373)
    loss = self.network(*inputs)
    ^
1 In file /home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/wrap/cell_wrapper.py(112)
    return self._loss_fn(out, label)
    ^
2 In file /home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/loss/loss.py(313)
    x = F.square(logits - labels)
2 Cause analysis and solution
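The call stack points straight at MindSpore's own loss function. At first I was puzzled: I cannot modify MindSpore's source code, and what kind of type is kMetaTypeNone anyway? Later, after reading a related article, I learned that MindSpore has both a static graph mode and a dynamic graph mode, and that the default (at least in the 1.7.0 version I was using) appears to be static graph mode. In that mode everything about the model has to be determined in advance, otherwise the static graph cannot be built.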
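To make the error message easier to read, here is a toy model of a type-based overload lookup. It is not MindSpore's implementation: the `Tensor` class, `SUPPORTED_SUB`, `type_name` and `check_sub` are hypothetical names made up for illustration, and only the list of supported type pairs is taken from the log above. My reading of `[kMetaTypeNone, Tensor[Float32]]` is that the left operand of `logits - labels` was a Python `None` when the graph was compiled.

```python
# Toy model of type-based overload lookup. NOT MindSpore's real code.

class Tensor:
    """Hypothetical stand-in for a framework tensor."""
    def __init__(self, values):
        self.values = list(values)

# Operand type pairs copied from the "supported types" line of the error log.
SUPPORTED_SUB = {
    ("Tensor", "List"), ("Tensor", "Tuple"),
    ("List", "Tensor"), ("Tuple", "Tensor"),
    ("Tensor", "Number"), ("Number", "Tensor"),
    ("Tensor", "Tensor"), ("Number", "Number"),
}

def type_name(value):
    """Map a Python value to the coarse type names used in the table."""
    if value is None:
        return "None"
    if isinstance(value, Tensor):
        return "Tensor"
    if isinstance(value, (int, float)):
        return "Number"
    if isinstance(value, list):
        return "List"
    if isinstance(value, tuple):
        return "Tuple"
    return type(value).__name__

def check_sub(lhs, rhs):
    """Return the matched type pair, or raise if no overload accepts it."""
    pair = (type_name(lhs), type_name(rhs))
    if pair not in SUPPORTED_SUB:
        raise TypeError(f"'sub' does not support the type pair {list(pair)}")
    return pair

labels = Tensor([1.0, 2.0])
print(check_sub(Tensor([0.5, 1.5]), labels))  # a supported pair

try:
    check_sub(None, labels)  # the network output is None
except TypeError as err:
    print(err)
```

Seen this way, the loss function itself is not at fault. The thing to look for is where the network's output (or an intermediate value) can be `None` at compile time.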
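For the difference between static and dynamic graphs, see the official MindSpore documentation. As I understand it, static graph mode builds the computation graph for the whole model up front, so that the same graph can be reused instead of being rebuilt on every call, which speeds up computation. The obvious drawback is reduced flexibility.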
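The trade-off can be sketched in plain Python. This is only a toy illustration of "decide once and reuse" versus "decide on every call", not a description of how MindSpore's graph compiler actually works. The function names `eager_scale` and `build_static_scale` are hypothetical.

```python
from typing import Callable, List

def eager_scale(xs: List[float]) -> List[float]:
    """Dynamic style: Python runs on every call, so logic can depend on the input."""
    factor = 1.0 / len(xs)  # decided from the current input
    return [x * factor for x in xs]

def build_static_scale(example: List[float]) -> Callable[[List[float]], List[float]]:
    """Static style (toy): decisions are made once from an example input, then frozen."""
    factor = 1.0 / len(example)  # decided at build time only

    def graph(xs: List[float]) -> List[float]:
        # Reuses the frozen factor, so there is no per-call setup cost,
        # but also no way to adapt to a different input.
        return [x * factor for x in xs]

    return graph

graph = build_static_scale([1.0, 2.0])  # built once with a length-2 example

data = [2.0, 4.0, 6.0, 8.0]
print(eager_scale(data))  # [0.5, 1.0, 1.5, 2.0]
print(graph(data))        # [1.0, 2.0, 3.0, 4.0], the build-time factor is stale
```

The frozen version is cheaper to call repeatedly, but anything that was decided at build time no longer follows the input.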
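My model, however, needs to adapt itself to its input. After I patched this particular error, the same kMetaTypeNone error showed up one after another in many other places, such as the Mul operation. Fixing them individually treats the symptoms rather than the cause, and as long as the model stays the same the problem cannot really be solved that way.
After reading the official MindSpore documentation I found that MindSpore supports dynamic graphs as well. Since the original framework, PyTorch, is a dynamic-graph framework, all I needed to do was switch MindSpore to dynamic graph (PyNative) mode by adding the following code: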
from mindspore import context
context.set_context(mode=context.PYNATIVE_MODE)  # run in dynamic-graph (PyNative) mode
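3 Summary
Read the official MindSpore documentation, take the time to understand how the frameworks work and how they differ from each other, and make good use of the community.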