Fully-connected layer weight dimensions in TensorFlow ConvNet
I've been coding along with this example of a convolutional net in TensorFlow, and I'm mystified by the allocation of weights:
weights = {
    # 5x5 conv, 1 input, 32 outputs
    'wc1': tf.Variable(tf.random_normal([5, 5, 1, 32])),
    # 5x5 conv, 32 inputs, 64 outputs
    'wc2': tf.Variable(tf.random_normal([5, 5, 32, 64])),
    # fully connected, 7*7*64 inputs, 1024 outputs
    'wd1': tf.Variable(tf.random_normal([7*7*64, 1024])),
    # 1024 inputs, 10 outputs (class prediction)
    'out': tf.Variable(tf.random_normal([1024, n_classes]))
}
How does one know that the 'wd1' weight matrix should have 7 x 7 x 64 rows?
It's later used to reshape the output of the second convolutional layer:
# fully connected layer
# reshape conv2 output to fit dense layer input
dense1 = tf.reshape(conv2, [-1, _weights['wd1'].get_shape().as_list()[0]])
# relu activation
dense1 = tf.nn.relu(tf.add(tf.matmul(dense1, _weights['wd1']), _biases['bd1']))
By my math, pooling layer 2 (the conv2 output) should have 4 x 4 x 64 neurons.
Why is it reshaping with [-1, 7*7*64]?
Working through it from the start:
The input, _x, is of size [28x28x1] (ignoring the batch dimension), i.e. a 28x28 greyscale image.

The first convolutional layer uses padding=SAME, so it outputs a 28x28 layer, which is then passed to a max_pool with k=2, reducing each dimension by a factor of two and resulting in a 14x14 spatial layout. conv1 has 32 outputs, so the full per-example tensor is [14x14x32].
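You can check this with a minimal sketch (this assumes the TF 1.x API the question uses; the names x, wc1, conv1, and pool1 here are mine, chosen to mirror the tutorial's variables):

import tensorflow as tf

# 28x28 greyscale input; batch size left open
x = tf.placeholder(tf.float32, [None, 28, 28, 1])

# 5x5 conv, 1 input channel, 32 output channels
wc1 = tf.Variable(tf.random_normal([5, 5, 1, 32]))

# padding='SAME' preserves the 28x28 spatial size
conv1 = tf.nn.conv2d(x, wc1, strides=[1, 1, 1, 1], padding='SAME')

# 2x2 max pool with stride 2 halves each spatial dimension
pool1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

print(pool1.get_shape())  # (?, 14, 14, 32)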
This is repeated in conv2, which has 64 outputs, resulting in [7x7x64].
tl;dr: the image starts at 28x28, and each max pool reduces each dimension by a factor of 2. 28/2/2 = 7.
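Continuing the sketch above through the second stage makes the 7*7*64 figure concrete (again a sketch assuming the TF 1.x API; wc2 mirrors the tutorial's second conv weight):

# 5x5 conv, 32 -> 64 channels, then the same 2x2 max pool
wc2 = tf.Variable(tf.random_normal([5, 5, 32, 64]))
conv2 = tf.nn.conv2d(pool1, wc2, strides=[1, 1, 1, 1], padding='SAME')
pool2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

print(pool2.get_shape())  # (?, 7, 7, 64)

# flattening gives each example a vector of length 7*7*64 = 3136,
# which is exactly why 'wd1' needs 7*7*64 rows
flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
print(flat.get_shape())  # (?, 3136)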