GCN is one of the most common graph neural network models, and it is implemented in every major GNN library, such as DGL and PyG. However, the implementations in those libraries are essentially spatial-domain graph convolutions, which means the adjacency matrix $A$ in DGL and PyG must be hard-defined. By "hard" I mean $A_{ij}\in\{0,1\}$: every entry is either 0 or 1, so we have to know deterministically whether an edge exists between node $i$ and node $j$. If we instead want to use a soft adjacency matrix with $A_{ij}\in[0,1]$, whose entries are continuous real numbers, DGL and PyG can no longer handle it directly.
Yet according to the original definition of GCN, the adjacency matrix does not have to be hard, so I decided to go back to the original GCN and run the matrix operations directly.
The GCN layer is defined as:
$$H^{(l+1)} = \sigma\left(\hat D^{-0.5}\,\hat A\,\hat D^{-0.5}\, H^{(l)} W^{(l)}\right), \qquad \hat A = A + I_N, \quad \hat D_{ii} = \sum_j \hat A_{ij}$$
where $A$ is the adjacency matrix and $I_N$ is the identity matrix. Note that $\hat D^{-0.5}\hat A\hat D^{-0.5}$ is the symmetrically normalized (renormalized) adjacency, which the GCN paper derives as an approximation of spectral filtering with the normalized Laplacian.
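To make this concrete, here is a minimal sketch (my own code, not taken from any library) that evaluates the formula above with plain dense matrix operations, so $A$ is allowed to be a soft matrix with entries in $[0,1]$:

```python
import torch

def gcn_propagate(A_soft, H, W):
    """One GCN step sigma(D^-0.5 A^ D^-0.5 H W), written with dense matrix ops
    so that A_soft may contain continuous edge weights in [0, 1]."""
    N = A_soft.size(0)
    A_hat = A_soft + torch.eye(N, device=A_soft.device)   # A^ = A + I_N
    deg = A_hat.sum(dim=1)                                 # D^_ii = sum_j A^_ij
    d_inv_sqrt = deg.pow(-0.5)
    d_inv_sqrt[torch.isinf(d_inv_sqrt)] = 0.               # guard isolated nodes
    A_norm = d_inv_sqrt.unsqueeze(1) * A_hat * d_inv_sqrt.unsqueeze(0)  # D^-0.5 A^ D^-0.5
    return torch.relu(A_norm @ H @ W)

# toy usage: 4 nodes, soft symmetric edge weights, 8-d input, 16-d output
A_soft = torch.rand(4, 4)
A_soft = (A_soft + A_soft.t()) / 2
H = torch.randn(4, 8)
W = torch.randn(8, 16)
out = gcn_propagate(A_soft, H, W)   # shape (4, 16)
```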
The original GCN implementation can be found at: https://github.com/tkipf/pygcn
Here is the layer implementation excerpted from that repo:
import math

import torch
from torch.nn.modules.module import Module
from torch.nn.parameter import Parameter


class GraphConvolution(Module):
    """Simple GCN layer, similar to https://arxiv.org/abs/1609.02907"""

    def __init__(self, in_features, out_features, bias=True):
        super(GraphConvolution, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.weight = Parameter(torch.FloatTensor(in_features, out_features))
        if bias:
            self.bias = Parameter(torch.FloatTensor(out_features))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self):
        stdv = 1. / math.sqrt(self.weight.size(1))
        self.weight.data.uniform_(-stdv, stdv)
        if self.bias is not None:
            self.bias.data.uniform_(-stdv, stdv)

    def forward(self, input, adj):
        support = torch.mm(input, self.weight)   # X W
        output = torch.spmm(adj, support)        # A (X W)
        if self.bias is not None:
            return output + self.bias
        else:
            return output

    def __repr__(self):
        return self.__class__.__name__ + ' (' \
               + str(self.in_features) + ' -> ' \
               + str(self.out_features) + ')'
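For completeness, a hedged usage sketch (mine, not part of the pygcn repo) showing how a dense, soft adjacency could be fed to this layer. The only caveat is the torch.spmm call in forward(), which is meant for a sparse adjacency; the comment below states my assumption about it:

```python
import torch

N, F_in, F_out = 5, 16, 8
layer = GraphConvolution(F_in, F_out)

A_soft = torch.rand(N, N)                        # continuous edge weights in [0, 1]
adj = A_soft / A_soft.sum(dim=1, keepdim=True)   # e.g. a simple D^-1 A row normalization
X = torch.randn(N, F_in)

out = layer(X, adj)   # forward() calls torch.spmm(adj, support); spmm is intended
                      # for a sparse adj, so if your PyTorch version rejects a dense
                      # tensor here, change spmm to torch.mm in forward()
```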
The adjacency-matrix preprocessing is in https://github.com/tkipf/pygcn/blob/master/pygcn/utils.py:
import numpy as np
import scipy.sparse as sp


def normalize(mx):
    """Row-normalize sparse matrix"""
    rowsum = np.array(mx.sum(1))            # (soft) degree of each node
    r_inv = np.power(rowsum, -1).flatten()
    r_inv[np.isinf(r_inv)] = 0.             # isolated nodes get 0 instead of inf
    r_mat_inv = sp.diags(r_inv)
    mx = r_mat_inv.dot(mx)                  # D^-1 * mx
    return mx
Notice that what the source code actually implements is $\hat D^{-1}\hat A$, i.e. a row normalization, used in place of the symmetric $\hat D^{-0.5}\hat A\hat D^{-0.5}$ that appears in the formula.
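If we want the same row normalization for a soft adjacency stored as a dense torch tensor, a minimal equivalent (my sketch, not code from the repo) is:

```python
import torch

def row_normalize(A_hat):
    """Dense equivalent of normalize(): divide each row by its (soft) degree,
    i.e. compute D^-1 A^."""
    deg_inv = A_hat.sum(dim=1).pow(-1)
    deg_inv[torch.isinf(deg_inv)] = 0.   # rows of isolated nodes stay all-zero
    return deg_inv.unsqueeze(1) * A_hat
```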
A similar treatment also appears in Graph_Transformer_Networks: https://github.com/jmhIcoding/Graph_Transformer_Networks/blob/master/model.py
import torch
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
import math
from matplotlib import pyplot as plt
import pdb


class GTN(nn.Module):

    def __init__(self, num_edge, num_channels, w_in, w_out, num_class, num_layers, norm):
        super(GTN, self).__init__()
        self.num_edge = num_edge
        self.num_channels = num_channels
        self.w_in = w_in
        self.w_out = w_out
        self.num_class = num_class
        self.num_layers = num_layers
        self.is_norm = norm
        layers = []
        for i in range(num_layers):
            if i == 0:
                layers.append(GTLayer(num_edge, num_channels, first=True))   # GTLayer is defined elsewhere in the repo
            else:
                layers.append(GTLayer(num_edge, num_channels, first=False))
        self.layers = nn.ModuleList(layers)
        self.weight = nn.Parameter(torch.Tensor(w_in, w_out))
        self.bias = nn.Parameter(torch.Tensor(w_out))
        self.loss = nn.CrossEntropyLoss()
        self.linear1 = nn.Linear(self.w_out * self.num_channels, self.w_out)
        self.linear2 = nn.Linear(self.w_out, self.num_class)
        self.reset_parameters()

    def reset_parameters(self):
        nn.init.xavier_uniform_(self.weight)
        nn.init.zeros_(self.bias)

    def gcn_conv(self, X, H):
        # one GCN convolution with the (soft) channel adjacency H
        X = torch.mm(X, self.weight)
        H = self.norm(H, add=True)
        return torch.mm(H.t(), X)

    def normalization(self, H):
        # normalize each channel's adjacency separately
        for i in range(self.num_channels):
            if i == 0:
                H_ = self.norm(H[i, :, :]).unsqueeze(0)
            else:
                H_ = torch.cat((H_, self.norm(H[i, :, :]).unsqueeze(0)), dim=0)
        return H_

    def norm(self, H, add=False):
        # zero the diagonal, optionally add unit self-loops, then degree-normalize
        # (a D^-1 A^ style operation, applied via transposes)
        H = H.t()
        if add == False:
            H = H * ((torch.eye(H.shape[0]) == 0).type(torch.FloatTensor))
        else:
            H = H * ((torch.eye(H.shape[0]) == 0).type(torch.FloatTensor)) + torch.eye(H.shape[0]).type(torch.FloatTensor)
        deg = torch.sum(H, dim=1)
        deg_inv = deg.pow(-1)
        deg_inv[deg_inv == float('inf')] = 0
        deg_inv = deg_inv * torch.eye(H.shape[0]).type(torch.FloatTensor)
        H = torch.mm(deg_inv, H)
        H = H.t()
        return H

    def forward(self, A, X, target_x, target):
        A = A.unsqueeze(0).permute(0, 3, 1, 2)
        Ws = []
        for i in range(self.num_layers):
            if i == 0:
                H, W = self.layers[i](A)
            else:
                H = self.normalization(H)
                H, W = self.layers[i](A, H)
            Ws.append(W)
        for i in range(self.num_channels):
            if i == 0:
                X_ = F.relu(self.gcn_conv(X, H[i]))
            else:
                X_tmp = F.relu(self.gcn_conv(X, H[i]))
                X_ = torch.cat((X_, X_tmp), dim=1)
        X_ = self.linear1(X_)
        X_ = F.relu(X_)
        y = self.linear2(X_[target_x])
        loss = self.loss(y, target)
        return loss, y, Ws
Pay attention to the norm method here.
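In plain terms, norm(H, add=True) zeroes the diagonal of H, puts back unit self-loops, and then degree-normalizes, so it is again a $\hat D^{-1}\hat A$-style operation rather than the symmetric one. A small restatement on a dense tensor (my own sketch, not the repo's code; it keeps the transposes of the original, which decide whether rows or columns get normalized):

```python
import torch

def gtn_style_norm(H, add=True):
    """Restatement of GTN's norm(): work on H^T, reset self-loops to exactly 1
    when add=True, then row-normalize, i.e. a D^-1 A^ style normalization."""
    H = H.t()
    eye = torch.eye(H.shape[0], device=H.device)
    H = H * (eye == 0).float()            # zero out the diagonal
    if add:
        H = H + eye                       # add unit self-loops
    deg_inv = H.sum(dim=1).pow(-1)
    deg_inv[torch.isinf(deg_inv)] = 0.
    return (deg_inv.unsqueeze(1) * H).t()
```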
Question: what is the difference between $\hat D^{-0.5}\hat A\hat D^{-0.5}$ and $\hat D^{-1}\hat A$?
Answer: $\hat D^{-1}\hat A$ is the random-walk (transition) matrix; it updates a node with the arithmetic mean of its neighbors' feature values, weighting every neighbor by $1/d_i$, i.e. only by the receiving node's own degree.
$\hat D^{-0.5}\hat A\hat D^{-0.5}$ is the symmetrically normalized adjacency behind Laplacian smoothing: $I - \hat D^{-0.5}\hat A\hat D^{-0.5}$ is the normalized Laplacian, which measures the difference between a node's features and those of its neighbors, so multiplying by $\hat D^{-0.5}\hat A\hat D^{-0.5}$ smooths each node toward its neighbors while weighting the edge $(i,j)$ by $1/\sqrt{d_i d_j}$, i.e. by the degrees of both endpoints.
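A tiny numeric example (my own) makes the difference visible: on a 3-node path graph both matrices have the same sparsity pattern, but the random-walk version weights each incoming neighbor by $1/d_i$ so every row sums to 1, while the symmetric version weights the edge $(i,j)$ by $1/\sqrt{d_i d_j}$:

```python
import torch

A = torch.tensor([[0., 1., 0.],
                  [1., 0., 1.],
                  [0., 1., 0.]])
A_hat = A + torch.eye(3)          # add self-loops
deg = A_hat.sum(dim=1)            # [2., 3., 2.]

rw  = torch.diag(deg.pow(-1)) @ A_hat                                  # D^-1 A^
sym = torch.diag(deg.pow(-0.5)) @ A_hat @ torch.diag(deg.pow(-0.5))    # D^-0.5 A^ D^-0.5

print(rw)    # rows sum to 1: an averaging / random-walk operator
print(sym)   # symmetric; entry (i, j) equals 1 / sqrt(d_i * d_j)
```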