see https://github.com/Soarkey/tr...

1.Environment

  • Ubuntu 20
  • Python 3.7
  • CUDA 11.4
  • PyTorch 1.9.0+cu111

2.Modification

In lib/models/dcn/src/deform_conv.py, replace every AT_CHECK with TORCH_CHECK and every .view call with .reshape, then recompile with python setup.py develop.
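
The replacement above can also be scripted rather than done by hand. The following is a minimal sketch, not part of the original instructions: the helper names patch_source and patch_file are hypothetical, and it assumes that a plain textual substitution of AT_CHECK and ".view(" is sufficient for the file in question.

```python
import re
from pathlib import Path


def patch_source(text: str) -> str:
    """Return text with AT_CHECK -> TORCH_CHECK and .view( -> .reshape(."""
    # \b keeps longer identifiers that merely contain AT_CHECK untouched.
    text = re.sub(r"\bAT_CHECK\b", "TORCH_CHECK", text)
    # Only direct .view( calls are rewritten; .view_as( is left alone.
    return text.replace(".view(", ".reshape(")


def patch_file(path: str) -> bool:
    """Patch one file in place, keeping a .bak copy. Returns True if changed."""
    source = Path(path)
    original = source.read_text(encoding="utf-8")
    patched = patch_source(original)
    if patched == original:
        return False
    backup = source.with_name(source.name + ".bak")
    backup.write_text(original, encoding="utf-8")
    source.write_text(patched, encoding="utf-8")
    return True
```

Calling patch_file on the file named above and then running python setup.py develop would reproduce the manual steps; the .bak copy makes it easy to revert if the build still fails.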

3.Possible Problems and Solutions

> ValueError: Unknown CUDA arch (8.6) or GPU not supported

  • Solution

    In the cpp_extension.py file inside the conda environment's folder, change the following:

    named_arches = collections.OrderedDict([
        ('Kepler+Tesla', '3.7'),
        ('Kepler', '3.5+PTX'),
        ('Maxwell+Tegra', '5.3'),
        ('Maxwell', '5.0;5.2+PTX'),
        ('Pascal', '6.0;6.1+PTX'),
        ('Volta', '7.0+PTX'),
        ('Turing', '7.5+PTX'),
    ])
    supported_arches = ['3.5', '3.7', '5.0', '5.2', '5.3', '6.0', '6.1', '6.2',
                        '7.0', '7.2', '7.5']

    to:

    named_arches = collections.OrderedDict([
        ('Kepler+Tesla', '3.7'),
        ('Kepler', '3.5+PTX'),
        ('Maxwell+Tegra', '5.3'),
        ('Maxwell', '5.0;5.2+PTX'),
        ('Pascal', '6.0;6.1+PTX'),
        ('Volta', '7.0+PTX'),
        ('Turing', '7.5+PTX'),
        ('Ampere', '8.0;8.6+PTX'),
    ])
    supported_arches = ['3.5', '3.7', '5.0', '5.2', '5.3', '6.0', '6.1', '6.2',
                        '7.0', '7.2', '7.5', '8.0', '8.6']

    The difference is the added support for 8.0 and 8.6 (the Ampere entry); the RTX 3090 belongs to the sm_86 architecture.

  • see solution: https://blog.csdn.net/ng323/a...
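
To see why extending the list removes the error, the check can be modelled in a few lines. This is a simplified illustration only, not PyTorch's actual source: the function check_arch and the constants SUPPORTED_BEFORE and SUPPORTED_AFTER are hypothetical names, and only the list contents and the error message are taken from the text above.

```python
# Architecture lists copied from the two versions of supported_arches above.
SUPPORTED_BEFORE = ['3.5', '3.7', '5.0', '5.2', '5.3', '6.0', '6.1', '6.2',
                    '7.0', '7.2', '7.5']
SUPPORTED_AFTER = SUPPORTED_BEFORE + ['8.0', '8.6']


def check_arch(arch, supported):
    """Raise the same style of error when arch is missing from the list."""
    if arch not in supported:
        raise ValueError(f"Unknown CUDA arch ({arch}) or GPU not supported")
    return arch


# With the original list an sm_86 card such as the 3090 is rejected:
try:
    check_arch('8.6', SUPPORTED_BEFORE)
except ValueError as err:
    print(err)

# With the extended list the same architecture is accepted:
print(check_arch('8.6', SUPPORTED_AFTER))
```

The edit to cpp_extension.py works the same way: the build only accepts architectures it finds in its list, so adding '8.0' and '8.6' lets the compilation proceed.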

> undefined symbol: THPVariableClass

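  • Cause: this error tends to occur when certain third-party packages (typically extensions built against PyTorch's compiled libraries) are imported before PyTorch itself. The correct approach is to import PyTorch first.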
  • see solution: https://blog.csdn.net/slow122...
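
The import-order rule can be enforced with a small wrapper. This is a sketch under assumptions: the helper name import_after_torch is hypothetical, and the code simply skips the PyTorch import when torch is not installed so that it stays runnable anywhere.

```python
import importlib
import sys


def import_after_torch(module_name):
    """Import torch first (when available), then the requested module."""
    if "torch" not in sys.modules:
        try:
            importlib.import_module("torch")
        except ImportError:
            # torch is not installed here, so there is no order to enforce.
            pass
    return importlib.import_module(module_name)
```

In a script, the equivalent is simply to place import torch above the import of the compiled extension.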