Artificial Intelligence: Code Implementations and a Comparison of CAM, Grad-CAM, and Grad-CAM++ for Visualizing CNNs

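When applying neural networks, we can evaluate a model's performance by its accuracy. In computer vision problems, however, the best accuracy alone is not enough: we also need interpretability and an understanding of which features or data points contribute to a decision. A model that focuses on the right features matters more than its accuracy.

The main methods for understanding CNNs are Class Activation Maps (CAM), Gradient-weighted Class Activation Mapping (Grad-CAM), and an optimized Grad-CAM (Grad-CAM++). They all share the same idea: if we take the output feature maps of the last convolutional layer and apply weights to them, we get a heatmap showing which parts of the input image carry high weight (that is, which parts represent the features of the whole image).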

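CAM is a method for visualizing what a CNN sees or attends to when it produces a class output.

Passing an image through a CNN yields low-resolution feature maps of that image.

The idea of CAM is to remove the fully connected layers and replace them with a global average pooling (GAP) layer. The average of all pixels in a feature map is its global average, so applying GAP to every feature map yields one scalar per map.

To these scalars we apply weights that indicate how important each feature map is for a particular class. The weights are learned by training a linear model.

The activation map is then the weighted combination of all these feature maps.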
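
The weighted combination can be sketched in NumPy as follows. This is a minimal illustration rather than the article's Keras code: the helper `cam_from_gap_weights` and the arrays `feature_maps` and `dense_weights` are hypothetical, and the sketch assumes a network whose last convolutional layer feeds a GAP layer followed by a single bias-free linear layer.

```python
import numpy as np

def cam_from_gap_weights(feature_maps, dense_weights, class_idx):
    """Return (class_score, cam) for a GAP + linear head.

    feature_maps:  (h, w, k) activations of the last conv layer (assumed).
    dense_weights: (k, n_classes) weights of the linear layer after GAP (assumed).
    """
    gap = feature_maps.mean(axis=(0, 1))              # one scalar per feature map, shape (k,)
    scores = gap @ dense_weights                      # class scores before softmax, shape (n_classes,)
    cam = feature_maps @ dense_weights[:, class_idx]  # weighted sum of the maps, shape (h, w)
    return scores[class_idx], cam

rng = np.random.default_rng(0)
feature_maps = rng.random((14, 14, 512))   # same shape as VGG16 block5_conv3 for a 224x224 input
dense_weights = rng.normal(size=(512, 10)) # 10 classes, chosen only for illustration
score, cam = cam_from_gap_weights(feature_maps, dense_weights, class_idx=3)
```

Because averaging is linear, the spatial mean of `cam` equals the class score, which is why the map can be read as a per-location breakdown of that score.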

    import numpy as np
    from scipy.ndimage import zoom
    from tensorflow import keras

    def generate_cam(input_model, image, layer_name='block5_conv3', H=224, W=224):
        cls = np.argmax(input_model.predict(image))  # Predicted class (not used below)
        conv_output = input_model.get_layer(layer_name).output  # Output tensor of the chosen conv layer

        # Sub-model that maps the input image to that layer's feature maps
        last_conv_layer_model = keras.Model(input_model.inputs, conv_output)

        # Kernel weights of the conv layer, shape (kh, kw, in_channels, out_channels).
        # Note: they are averaged per output channel and are not class-specific,
        # unlike the GAP + linear-model weights described above.
        class_weights = input_model.get_layer(layer_name).get_weights()[0]
        class_weights = class_weights[0, :, :, :]
        class_weights = np.mean(class_weights, axis=(0, 1))

        last_conv_output = last_conv_layer_model.predict(image)  # Feature maps for the image
        last_conv_output = last_conv_output[0, :]  # Drop the batch dimension
        cam = np.dot(last_conv_output, class_weights)  # Weighted sum over channels

        cam = zoom(cam, (H / cam.shape[0], W / cam.shape[1]))  # Interpolate to image size
        cam = cam / np.max(cam)  # Normalize by the maximum value
        return cam

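CAM's biggest drawback, however, is that the model must be retrained to obtain the weights that follow global average pooling. A linear model has to be learned for every class, so there are n weights (one per filter in the last layer) for each of the linear models (one per class). The network architecture also has to be modified to produce a CAM, which is too large a change for existing models. Grad-CAM addresses these drawbacks.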
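
To make that count concrete, here is a small arithmetic sketch. The numbers are assumptions chosen for illustration: 512 filters matches VGG16's `block5_conv3`, and 1000 classes matches ImageNet.

```python
n_filters = 512    # feature maps in the last conv layer (assumed: VGG16 block5_conv3)
n_classes = 1000   # number of classes (assumed: ImageNet)

# One weight per feature map for each class-specific linear model.
cam_weights_to_train = n_filters * n_classes
print(cam_weights_to_train)  # 512000
```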

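The idea behind Grad-CAM is to rely on the gradients with respect to the feature maps of the last convolutional layer rather than on network weights. These gradients are obtained through backpropagation.

This solves not only the retraining problem but also the architecture modification problem, because only gradients are used and no GAP layer is required.

We only need to compute the gradient of the top predicted class with respect to the feature maps of the last convolutional layer. We then apply global averaging to these gradients to obtain one weight per feature map. The dot product of these weights with the feature maps from that layer is the Grad-CAM output. Applying ReLU to it keeps only the parts of the image that contribute positively to the prediction.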
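
In symbols, these steps can be written as below. The notation is an assumption introduced here for illustration and does not come from the article's code: $A^k$ is the $k$-th feature map of the last convolutional layer, $y^c$ is the score of class $c$, $Z$ is the number of spatial locations in a feature map, and $w_k^c$ is the resulting weight of map $k$.

```latex
w_k^c = \frac{1}{Z} \sum_i \sum_j \frac{\partial y^c}{\partial A_{ij}^k},
\qquad
L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\!\left( \sum_k w_k^c \, A^k \right)
```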

Finally, the Grad-CAM map is resized to the image size and normalized so that it can be overlaid on the image.
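
The resize, normalize, and overlay step can be sketched with NumPy alone. This is only an illustration under stated assumptions: the helper `overlay_cam` is hypothetical, it uses nearest-neighbour repetition instead of the `zoom` interpolation used in the article's functions, and it assumes the image size is an integer multiple of the map size.

```python
import numpy as np

def overlay_cam(image, cam, alpha=0.4):
    """Blend a low-resolution class activation map onto an RGB image.

    image: (H, W, 3) float array in [0, 1].
    cam:   (h, w) non-negative array; H and W must be integer multiples of h and w.
    """
    H, W = image.shape[:2]
    # Nearest-neighbour upsampling: repeat every CAM cell along both axes.
    up = np.repeat(np.repeat(cam, H // cam.shape[0], axis=0), W // cam.shape[1], axis=1)
    up = up / (up.max() + 1e-8)   # normalize to [0, 1], guarding against an all-zero map
    heat = np.zeros_like(image)
    heat[..., 0] = up             # show the heat in the red channel
    return (1 - alpha) * image + alpha * heat

image = np.full((224, 224, 3), 0.5)   # uniform grey stand-in for a real image
cam = np.zeros((14, 14))
cam[3, 5] = 2.0                       # a single active cell
blended = overlay_cam(image, cam)
```

A larger `alpha` makes the heatmap dominate the blend, while a smaller one keeps more of the original image visible.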

    import numpy as np
    from scipy.ndimage import zoom
    from tensorflow.keras import backend as K

    # K.gradients needs graph mode; under TensorFlow 2.x call
    # tf.compat.v1.disable_eager_execution() before building the model.
    def grad_cam(input_model, image, layer_name='block5_conv3', H=224, W=224):
        cls = np.argmax(input_model.predict(image))  # Predicted class
        y_c = input_model.output[0, cls]  # Score (probability) of the predicted class
        conv_output = input_model.get_layer(layer_name).output  # Output tensor of the chosen conv layer
        grads = K.gradients(y_c, conv_output)[0]  # Gradient of the class score w.r.t. conv_output

        # Evaluate the feature maps and the gradients for this image
        get_output = K.function([input_model.input], [conv_output, grads])
        output, grads_val = get_output([image])
        output, grads_val = output[0, :], grads_val[0, :, :, :]  # Drop the batch dimension

        weights = np.mean(grads_val, axis=(0, 1))  # Spatial mean of the gradients: one weight per map
        cam = np.dot(output, weights)  # Grad-CAM output

        cam = np.maximum(cam, 0)  # ReLU
        cam = zoom(cam, (H / cam.shape[0], W / cam.shape[1]))  # Interpolate to image size
        cam = cam / cam.max()  # Normalize by the maximum value
        return cam

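Grad-CAM++ builds on the Grad-CAM technique and, in the spirit of guided backpropagation, passes back only the positive gradients of the class prediction.

The motivation for this refinement is that Grad-CAM has trouble identifying and attending to objects that appear multiple times in an image or that occupy little space in it.

Grad-CAM++ therefore gives more importance to the gradient pixels associated with the predicted class (the positive gradients), scaling them by a larger factor computed per spatial location instead of the constant factor used by Grad-CAM. This scaling factor is called alpha in the code.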
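
For reference, the per-location factor is commonly written as below. The notation is an assumption introduced here, following the usual Grad-CAM++ formulation: $Y^c$ is the class score, $A_{ij}^k$ is the activation of feature map $k$ at location $(i, j)$, and $w_k^c$ is the resulting weight of map $k$.

```latex
\alpha_{ij}^{kc} =
\frac{\dfrac{\partial^2 Y^c}{(\partial A_{ij}^k)^2}}
     {2\,\dfrac{\partial^2 Y^c}{(\partial A_{ij}^k)^2}
      + \sum_a \sum_b A_{ab}^k \,\dfrac{\partial^3 Y^c}{(\partial A_{ij}^k)^3}},
\qquad
w_k^c = \sum_i \sum_j \alpha_{ij}^{kc}\,
        \mathrm{ReLU}\!\left( \frac{\partial Y^c}{\partial A_{ij}^k} \right)
```

When $Y^c = \exp(S^c)$ for a score $S^c$ that is piecewise linear in the activations, the second and third derivatives reduce to $\exp(S^c)$ times the squared and cubed first-order gradient of $S^c$, so an implementation only needs first-order gradients.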

    import numpy as np
    from scipy.ndimage import zoom
    from tensorflow.keras import backend as K

    # K.gradients needs graph mode; under TensorFlow 2.x call
    # tf.compat.v1.disable_eager_execution() before building the model.
    def grad_cam_plus(input_model, image, layer_name='block5_conv3', H=224, W=224):
        cls = np.argmax(input_model.predict(image))  # Predicted class
        y_c = input_model.output[0, cls]  # Score of the predicted class
        conv_output = input_model.get_layer(layer_name).output
        grads = K.gradients(y_c, conv_output)[0]

        # First-, second- and third-order terms built from the first-order gradient
        first = K.exp(y_c) * grads
        second = K.exp(y_c) * grads * grads
        third = K.exp(y_c) * grads * grads * grads

        # Evaluate all tensors for this image
        get_output = K.function([input_model.input], [y_c, first, second, third, conv_output, grads])
        y_c, conv_first_grad, conv_second_grad, conv_third_grad, conv_output, grads_val = get_output([image])
        n_channels = conv_first_grad[0].shape[2]
        global_sum = np.sum(conv_output[0].reshape((-1, n_channels)), axis=0)  # Sum of each feature map

        # Alpha value for each spatial location
        alpha_num = conv_second_grad[0]
        alpha_denom = conv_second_grad[0] * 2.0 + conv_third_grad[0] * global_sum.reshape((1, 1, n_channels))
        alpha_denom = np.where(alpha_denom != 0.0, alpha_denom, np.ones(alpha_denom.shape))  # Avoid division by zero
        alphas = alpha_num / alpha_denom

        # Positive gradients act as weights; alphas are normalized per channel
        weights = np.maximum(conv_first_grad[0], 0.0)
        alpha_normalization_constant = np.sum(np.sum(alphas, axis=0), axis=0)
        alpha_normalization_constant = np.where(alpha_normalization_constant != 0.0,
                                                alpha_normalization_constant, 1.0)  # Avoid division by zero
        alphas /= alpha_normalization_constant.reshape((1, 1, n_channels))

        # Sum the alpha-scaled weights over all locations: one weight per feature map
        deep_linearization_weights = np.sum((weights * alphas).reshape((-1, n_channels)), axis=0)

        grad_CAM_map = np.sum(deep_linearization_weights * conv_output[0], axis=2)  # Grad-CAM++ map
        cam = np.maximum(grad_CAM_map, 0)  # ReLU
        cam = zoom(cam, (H / cam.shape[0], W / cam.shape[1]))  # Interpolate to image size
        cam = cam / np.max(cam)  # Normalize by the maximum value
        return cam

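Here we use VGG16 and compare the methods on a few images. The figure below shows how differently CAM, Grad-CAM, and Grad-CAM++ view the same input. Although all three concentrate mainly on the subject's upper body, Grad-CAM++ treats the subject as a whole as important, whereas CAM treats some parts as very important features and others as merely auxiliary to its prediction. Grad-CAM attends only to the crest and wings as the important features for the decision.

For this image of a kite, CAM shows the model attending to everything except the kite (that is, the sky), whereas Grad-CAM shows the model attending to the kite, and Grad-CAM++ reinforces this by adding further salient regions. Note that the model misclassified the image as a parachute, with the kite class a close second. In other words, CAM actually captures the reason for the error better.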

For more code and examples, see:

https://avoid.overfit.cn/post/2c9e1b0f993942c287d56df41325bc4f

Author: Tanishq Sardana

Published in: Artificial Intelligence, 2023-06-08
