2019-08-01

MobileNet Depthwise Separable Convolution

This note records the characteristics of MobileNet's depthwise separable convolution and how to implement it.

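A depthwise separable convolution splits the input into its individual channels, convolves each channel on its own, and then merges the results. In other words, each kernel in an ordinary convolution layer scans all channels of the input, whereas each kernel in a depthwise separable convolution reads only one input channel. This is similar to the grouped convolution unit in Inception; the difference is that each kernel operates on just one channel of the input.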
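To make the "one kernel per channel" idea concrete, here is a small weight-count comparison. It is only a sketch: it assumes, as in the block implemented later in this note, that the per-channel step is followed by a 1x1 convolution, it ignores biases, and the layer sizes (3x3 kernels, 32 input channels, 64 output channels) are made up for illustration.

```python
def standard_conv_weights(kernel_size, in_channels, out_channels):
    """Weights in an ordinary convolution: every kernel spans all input channels."""
    return kernel_size * kernel_size * in_channels * out_channels


def separable_conv_weights(kernel_size, in_channels, out_channels):
    """Weights in the per-channel step (one kernel per input channel)
    plus the 1x1 convolution that mixes the channels afterwards."""
    depthwise = kernel_size * kernel_size * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
standard = standard_conv_weights(3, 32, 64)
separable = separable_conv_weights(3, 32, 64)
print(standard, separable)
```

Under these assumptions the separable form needs far fewer weights, because the 3x3 kernels no longer have to cover every input channel for every output channel.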

Structure of the Depthwise Separable Convolution

Assume the input to the depthwise separable layer has 3 channels. The figure shows how the module processes it:

Figure: a convolution is applied to each channel separately
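The per-channel step can also be sketched in plain Python, without any framework. This is an illustrative toy, not the note's implementation: the helper names, the 4x4 constant-valued channels, and the all-ones kernels are assumptions, and the activation and the final 1x1 convolution are left out.

```python
def conv2d_single_channel(channel, kernel):
    """Slide one 2-D kernel over one 2-D channel (stride 1, zero 'same' padding).

    Like deep learning frameworks, this computes a cross-correlation
    (the kernel is not flipped).
    """
    height, width = len(channel), len(channel[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            total = 0.0
            for di in range(k):
                for dj in range(k):
                    r, c = i + di - pad, j + dj - pad
                    # Positions outside the channel count as zeros.
                    if 0 <= r < height and 0 <= c < width:
                        total += channel[r][c] * kernel[di][dj]
            out[i][j] = total
    return out


def depthwise_conv(channels, kernels):
    """Apply kernels[i] to channels[i] only; no kernel sees another channel."""
    return [conv2d_single_channel(ch, k) for ch, k in zip(channels, kernels)]


# Three 4x4 input channels filled with the constants 1, 2 and 3 (toy data).
channels = [[[float(v)] * 4 for _ in range(4)] for v in (1, 2, 3)]
# One 3x3 all-ones kernel per channel (arbitrary values for the demo).
kernels = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]

outputs = depthwise_conv(channels, kernels)
print(len(outputs), len(outputs[0]), len(outputs[0][0]))
```

Each output channel depends only on the matching input channel, so the number of channels is unchanged by this step; mixing across channels is left to the 1x1 convolution that follows.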

The module below implements the structure shown in the figure above:

Implementation

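import tensorflow as tf  # TensorFlow 1.x API (tf.layers, tf.variable_scope)


def separable_conv_block(x, output_channel_number, name):
    """
    x: input tensor, channels last (NHWC)
    output_channel_number: number of output channels of the whole block,
                            which is also the number of kernels in the 1x1 conv layer
                            (number of kernels == number of output channels)
    name: variable scope name
    """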
    with tf.variable_scope(name):
        # get channel number:
        input_channel = x.get_shape().as_list()[-1]
        
        # split channels to a channel list:
        # channel_wise_x: [channel1, channel2, ...]
        channel_wise_x = tf.split(x, input_channel, axis = 3)
        
        # apply a 3x3 convolution to each channel separately
        output_channels = []
        for i in range(len(channel_wise_x)):      
            output_channel = tf.layers.conv2d(channel_wise_x[i],
                                              1,
                                              (3, 3),
                                              strides = (1,1),
                                              padding = 'same',
                                              activation = tf.nn.relu,
                                              name = 'conv_%d' % i)
            output_channels.append(output_channel)
        
        # concat along channel(index=3)
        concat_layer = tf.concat(output_channels, axis = 3)
        
        # pass the concatenated result through a 1x1 convolution
        conv1_1 = tf.layers.conv2d(concat_layer,
                                   output_channel_number,
                                   (1,1),
                                   strides = (1,1),
                                   padding = 'same',
                                   activation = tf.nn.relu,
                                   name = 'conv1_1')
    return conv1_1

Plugging this module into the MobileNet architecture is enough to build the complete MobileNet. For an example implementation, see here.