4.2 - Convolutional Neural Networks¶
!wget -nc --no-cache -O init.py -q https://raw.githubusercontent.com/rramosp/2021.deeplearning/main/content/init.py
import init; init.init(force_download=False);
import tensorflow as tf
from time import time
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from local.lib import mlutils
%matplotlib inline
Image analytics tasks¶
from IPython.display import Image
Image(filename='local/imgs/imgs_tasks.jpeg', width=800)
Explore COCO Dataset¶
see also
Image APIs: Clarifai, Amazon Rekognition, Google Cloud Vision
Image Captioning (with CNN + RNN!): caption bot
Convolutional Neural Networks¶
see https://cloud.google.com/blog/big-data/2017/01/learn-tensorflow-and-deep-learning-without-a-phd
see the filter activation demo and the confusion matrix demo
see The 9 Deep Learning Papers You Should Know
RECOMMENDATION¶
Close all applications.
Install the Maxthon browser: http://www.maxthon.com
Open only VirtualBox and Maxthon.
First level filters and activations maps¶
The filters in the middle are applied to the image on the left. Observe, for instance, in which parts of the image the seventh filter of the first row (the one before the last in that row) is activated.
Image(filename='local/imgs/cnn_swan.png', width=800)
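To make this concrete, here is a minimal, self-contained sketch (the edge image and the Sobel-like kernel below are made up for illustration) applying a single hand-crafted 3x3 filter to an image and showing the resulting activation map, which is exactly what one unit of a Conv2D layer computes:

# a made-up 32x32 test image: black left half, white right half (a vertical edge)
img = np.zeros((1, 32, 32, 1), dtype=np.float32)
img[0, :, 16:, 0] = 1.0

# a hand-crafted Sobel-like vertical-edge filter, shaped (kH, kW, in_channels, out_channels)
kernel = np.array([[-1., 0., 1.],
                   [-2., 0., 2.],
                   [-1., 0., 1.]], dtype=np.float32).reshape(3, 3, 1, 1)

# convolution + ReLU: the activation map is bright where the filter "fires"
act = tf.nn.relu(tf.nn.conv2d(img, kernel, strides=1, padding="SAME"))

plt.subplot(1, 2, 1); plt.imshow(img[0, :, :, 0], cmap="gray"); plt.axis("off")
plt.subplot(1, 2, 2); plt.imshow(act[0, :, :, 0], cmap="gray"); plt.axis("off")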
Hierarchy of filters and activation maps¶
Image(filename='local/imgs/cnn_features.png', width=600)
Image(filename='local/imgs/conv1.jpg', width=800)
Image(filename='local/imgs/conv2.jpg', width=800)
other examples of first-level filters
Image(filename='local/imgs/cnn_features2.png', width=600)
We have a small image dataset based on CIFAR-10, where each image has size 32x32x3.
!wget -nc https://s3.amazonaws.com/rlx/mini_cifar.h5
File ‘mini_cifar.h5’ already there; not retrieving.
import h5py
with h5py.File('mini_cifar.h5','r') as h5f:
    x_cifar = h5f["x"][:]
    y_cifar = h5f["y"][:]
mlutils.show_labeled_image_mosaic(x_cifar, y_cifar)
print (np.min(x_cifar), np.max(x_cifar))
0.0 1.0
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x_cifar, y_cifar, test_size=.25)
print (x_train.shape, y_train.shape, x_test.shape, y_test.shape)
print ("\ndistribution of train classes")
print (pd.Series(y_train).value_counts())
print ("\ndistribution of test classes")
print (pd.Series(y_test).value_counts())
(2253, 32, 32, 3) (2253,) (751, 32, 32, 3) (751,)
distribution of train classes
2 778
0 750
1 725
dtype: int64
distribution of test classes
0 255
1 249
2 247
dtype: int64
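The three classes are roughly balanced here. If they were not, train_test_split can be asked to preserve the class proportions in both partitions; a minimal variant of the call above:

# optional: stratify keeps the train/test class proportions identical
# (useful for imbalanced datasets)
x_train, x_test, y_train, y_test = train_test_split(
    x_cifar, y_cifar, test_size=.25, stratify=y_cifar)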
Build a Keras model
def get_conv_model_A(num_classes, img_size=32, compile=True):
    tf.keras.backend.clear_session()
    print("using", num_classes, "classes")
    inputs = tf.keras.Input(shape=(img_size, img_size, 3), name="input_1")
    # a single convolutional layer followed by a small dense classifier
    layers = tf.keras.layers.Conv2D(15, (3, 3), activation="relu", padding="SAME")(inputs)
    layers = tf.keras.layers.Flatten()(layers)
    layers = tf.keras.layers.Dense(16, activation=tf.nn.relu)(layers)
    layers = tf.keras.layers.Dropout(0.2)(layers)
    predictions = tf.keras.layers.Dense(num_classes, activation=tf.nn.softmax, name="output_1")(layers)
    model = tf.keras.Model(inputs=inputs, outputs=predictions)
    if compile:
        model.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
    return model
num_classes = len(np.unique(y_cifar))
model = get_conv_model_A(num_classes)
using 3 classes
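A note on padding="SAME": it pads the input so the convolution preserves the 32x32 spatial size, which is why the Flatten layer sees 32*32*15 = 15360 values. A quick sanity check on a dummy batch:

# with padding="SAME", a 3x3 convolution keeps the spatial dimensions intact
dummy = np.zeros((1, 32, 32, 3), dtype=np.float32)
print(tf.keras.layers.Conv2D(15, (3, 3), padding="SAME")(dummy).shape)  # (1, 32, 32, 15)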
observe the initialized weights and their shapes
weights = model.get_weights()
for w in weights:
    print(w.shape)
(3, 3, 3, 15)
(15,)
(15360, 16)
(16,)
(16, 3)
(3,)
We keep the first-layer filters so that we can later compare them with the same filters after training.
initial_w0 = model.get_weights()[0].copy()
y_test.shape, y_train.shape, x_test.shape, x_train.shape
((751,), (2253,), (751, 32, 32, 3), (2253, 32, 32, 3))
num_classes = len(np.unique(y_cifar))
def train(model, batch_size, epochs, model_name=""):
    # log training curves to TensorBoard under a unique run name
    tensorboard = tf.keras.callbacks.TensorBoard(log_dir="logs/"+model_name+"_"+"{}".format(time()))
    model.reset_states()
    model.fit(x_train, y_train, epochs=epochs, callbacks=[tensorboard],
              batch_size=batch_size,
              validation_data=(x_test, y_test))
    metrics = model.evaluate(x_test, y_test)
    return {k: v for k, v in zip(model.metrics_names, metrics)}
Observe the shapes of the model weights obtained above and see how they relate to the output shapes and parameter counts in the summary below.
model = get_conv_model_A(num_classes)
model.summary()
using 3 classes
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 32, 32, 3) 0
_________________________________________________________________
conv2d (Conv2D) (None, 32, 32, 15) 420
_________________________________________________________________
flatten (Flatten) (None, 15360) 0
_________________________________________________________________
dense (Dense) (None, 16) 245776
_________________________________________________________________
dropout (Dropout) (None, 16) 0
_________________________________________________________________
output_1 (Dense) (None, 3) 51
=================================================================
Total params: 246,247
Trainable params: 246,247
Non-trainable params: 0
_________________________________________________________________
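As a sanity check, the parameter counts in the summary can be reproduced by hand: each filter has kH*kW*in_channels weights plus one bias, and each dense unit has one weight per input plus one bias:

conv_params  = (3*3*3 + 1) * 15      # 420
dense_params = (32*32*15 + 1) * 16   # 245776 (the conv output flattened to 15360 values)
out_params   = (16 + 1) * 3          # 51
print(conv_params + dense_params + out_params)  # 246247, matching the summary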
train(model, batch_size=32, epochs=10, model_name="model_A")
Train on 2253 samples, validate on 751 samples
Epoch 1/10
2253/2253 [==============================] - 1s 490us/step - loss: 0.9741 - acc: 0.5109 - val_loss: 0.8331 - val_acc: 0.6498
Epoch 2/10
2253/2253 [==============================] - 1s 430us/step - loss: 0.8445 - acc: 0.6130 - val_loss: 0.7991 - val_acc: 0.6618
Epoch 3/10
2253/2253 [==============================] - 1s 465us/step - loss: 0.7801 - acc: 0.6338 - val_loss: 0.7409 - val_acc: 0.6804
Epoch 4/10
2253/2253 [==============================] - 1s 371us/step - loss: 0.7336 - acc: 0.6755 - val_loss: 0.7352 - val_acc: 0.6818
Epoch 5/10
2253/2253 [==============================] - 1s 486us/step - loss: 0.7217 - acc: 0.6826 - val_loss: 0.6991 - val_acc: 0.7071
Epoch 6/10
2253/2253 [==============================] - 1s 538us/step - loss: 0.6801 - acc: 0.7057 - val_loss: 0.6940 - val_acc: 0.6991
Epoch 7/10
2253/2253 [==============================] - 1s 383us/step - loss: 0.6133 - acc: 0.7266 - val_loss: 0.6788 - val_acc: 0.7044
Epoch 8/10
2253/2253 [==============================] - 1s 446us/step - loss: 0.5957 - acc: 0.7483 - val_loss: 0.6807 - val_acc: 0.7137
Epoch 9/10
2253/2253 [==============================] - 2s 702us/step - loss: 0.5585 - acc: 0.7581 - val_loss: 0.6566 - val_acc: 0.7217
Epoch 10/10
2253/2253 [==============================] - 2s 720us/step - loss: 0.5489 - acc: 0.7386 - val_loss: 0.6471 - val_acc: 0.7270
751/751 [==============================] - 0s 251us/step
{'acc': 0.7270306261496918, 'loss': 0.6470887158586881}
test_preds = model.predict(x_test).argmax(axis=1)
mlutils.plot_confusion_matrix(y_test, test_preds, classes=np.r_[0,1,2], normalize=True)
Normalized confusion matrix
[[0.69411765 0.16862745 0.1372549 ]
[0.04016064 0.85140562 0.10843373]
[0.22267206 0.1417004 0.63562753]]
<matplotlib.axes._subplots.AxesSubplot at 0x7efe9e4a1dd0>
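mlutils.plot_confusion_matrix is a course helper; the same numbers can be reproduced directly with scikit-learn, for instance:

from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, test_preds)
print(cm / cm.sum(axis=1, keepdims=True))  # row-normalized: per-class recall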
observe the output in TensorBoard
tensorboard --logdir logs
first layer filters before training
mlutils.display_imgs(initial_w0)
and after training
w0 = model.get_weights()[0]
print (w0.shape)
mlutils.display_imgs(w0)
(3, 3, 3, 15)
idxs = np.random.permutation(len(x_test))[:5]
preds = model.predict(x_test[idxs])
mlutils.show_preds(x_test[idxs],y_test[idxs], preds)
Let’s try a more complex network¶
def get_conv_model_B(num_classes, img_size=32, compile=True):
    tf.keras.backend.clear_session()
    print("using", num_classes, "classes")
    inputs = tf.keras.Input(shape=(img_size, img_size, 3), name="input_1")
    layers = tf.keras.layers.Conv2D(15, (5, 5), activation="relu")(inputs)
    layers = tf.keras.layers.MaxPool2D((2, 2))(layers)
    layers = tf.keras.layers.Conv2D(60, (5, 5), activation="relu")(layers)
    layers = tf.keras.layers.Flatten()(layers)
    layers = tf.keras.layers.Dense(16, activation=tf.nn.relu)(layers)
    layers = tf.keras.layers.Dropout(0.2)(layers)
    predictions = tf.keras.layers.Dense(num_classes, activation=tf.nn.softmax, name="output_1")(layers)
    model = tf.keras.Model(inputs=inputs, outputs=predictions)
    if compile:
        model.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
    return model
model = get_conv_model_B(num_classes)
model.summary()
train(model, batch_size=32, epochs=10, model_name="model_B")
using 3 classes
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 32, 32, 3) 0
_________________________________________________________________
conv2d (Conv2D) (None, 28, 28, 15) 1140
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 15) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 10, 10, 60) 22560
_________________________________________________________________
flatten (Flatten) (None, 6000) 0
_________________________________________________________________
dense (Dense) (None, 16) 96016
_________________________________________________________________
dropout (Dropout) (None, 16) 0
_________________________________________________________________
output_1 (Dense) (None, 3) 51
=================================================================
Total params: 119,767
Trainable params: 119,767
Non-trainable params: 0
_________________________________________________________________
Train on 2253 samples, validate on 751 samples
Epoch 1/10
2253/2253 [==============================] - 3s 1ms/step - loss: 0.9930 - acc: 0.4918 - val_loss: 0.9041 - val_acc: 0.5726
Epoch 2/10
2253/2253 [==============================] - 2s 878us/step - loss: 0.8651 - acc: 0.5832 - val_loss: 0.7849 - val_acc: 0.6431
Epoch 3/10
2253/2253 [==============================] - 3s 2ms/step - loss: 0.7790 - acc: 0.6316 - val_loss: 0.7374 - val_acc: 0.6551
Epoch 4/10
2253/2253 [==============================] - 4s 2ms/step - loss: 0.7289 - acc: 0.6587 - val_loss: 0.6836 - val_acc: 0.7310
Epoch 5/10
2253/2253 [==============================] - 5s 2ms/step - loss: 0.6608 - acc: 0.7079 - val_loss: 0.6550 - val_acc: 0.7097
Epoch 6/10
2253/2253 [==============================] - 3s 2ms/step - loss: 0.5975 - acc: 0.7386 - val_loss: 0.6259 - val_acc: 0.7430
Epoch 7/10
2253/2253 [==============================] - 3s 1ms/step - loss: 0.5508 - acc: 0.7590 - val_loss: 0.6558 - val_acc: 0.7031
Epoch 8/10
2253/2253 [==============================] - 4s 2ms/step - loss: 0.5065 - acc: 0.7852 - val_loss: 0.6555 - val_acc: 0.7177
Epoch 9/10
2253/2253 [==============================] - 3s 1ms/step - loss: 0.4732 - acc: 0.7994 - val_loss: 0.6463 - val_acc: 0.7350
Epoch 10/10
2253/2253 [==============================] - 3s 2ms/step - loss: 0.4236 - acc: 0.8282 - val_loss: 0.5843 - val_acc: 0.7537
751/751 [==============================] - 0s 580us/step
{'acc': 0.7536617846844517, 'loss': 0.5843058094362444}
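The spatial sizes in the summary above follow from two simple rules: an unpadded ("valid") k x k convolution shrinks each side by k-1, and a 2x2 max-pool halves it:

size = 32
size = size - 5 + 1   # conv 5x5, valid padding -> 28
size = size // 2      # max-pool 2x2            -> 14
size = size - 5 + 1   # conv 5x5, valid padding -> 10
print(size * size * 60)  # 6000 flattened features, as in the summary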
w0 = model.get_weights()[0]
print (w0.shape)
mlutils.display_imgs(w0)
(5, 5, 3, 15)
Or with larger filters¶
def get_conv_model_C(num_classes, img_size=32, compile=True):
    tf.keras.backend.clear_session()
    print("using", num_classes, "classes")
    inputs = tf.keras.Input(shape=(img_size, img_size, 3), name="input_1")
    layers = tf.keras.layers.Conv2D(96, (11, 11), activation="relu")(inputs)
    layers = tf.keras.layers.MaxPool2D((2, 2))(layers)
    layers = tf.keras.layers.Conv2D(60, (11, 11), activation="relu")(layers)
    layers = tf.keras.layers.Flatten()(layers)
    layers = tf.keras.layers.Dense(16, activation=tf.nn.relu)(layers)
    layers = tf.keras.layers.Dropout(0.2)(layers)
    predictions = tf.keras.layers.Dense(num_classes, activation=tf.nn.softmax, name="output_1")(layers)
    model = tf.keras.Model(inputs=inputs, outputs=predictions)
    if compile:
        model.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
    return model
model = get_conv_model_C(num_classes)
model.summary()
train(model, batch_size=32, epochs=10, model_name="model_C")
using 3 classes
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 32, 32, 3) 0
_________________________________________________________________
conv2d (Conv2D) (None, 22, 22, 96) 34944
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 11, 11, 96) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 1, 1, 60) 697020
_________________________________________________________________
flatten (Flatten) (None, 60) 0
_________________________________________________________________
dense (Dense) (None, 16) 976
_________________________________________________________________
dropout (Dropout) (None, 16) 0
_________________________________________________________________
output_1 (Dense) (None, 3) 51
=================================================================
Total params: 732,991
Trainable params: 732,991
Non-trainable params: 0
_________________________________________________________________
Train on 2253 samples, validate on 751 samples
Epoch 1/10
2253/2253 [==============================] - 11s 5ms/step - loss: 1.0610 - acc: 0.4159 - val_loss: 0.9639 - val_acc: 0.5300
Epoch 2/10
2253/2253 [==============================] - 11s 5ms/step - loss: 0.9590 - acc: 0.5317 - val_loss: 0.9168 - val_acc: 0.5566
Epoch 3/10
2253/2253 [==============================] - 10s 4ms/step - loss: 0.8939 - acc: 0.5584 - val_loss: 0.9846 - val_acc: 0.4940
Epoch 4/10
2253/2253 [==============================] - 10s 4ms/step - loss: 0.8659 - acc: 0.5899 - val_loss: 0.9343 - val_acc: 0.5459
Epoch 5/10
2253/2253 [==============================] - 10s 4ms/step - loss: 0.8456 - acc: 0.6165 - val_loss: 0.8306 - val_acc: 0.6325
Epoch 6/10
2253/2253 [==============================] - 10s 5ms/step - loss: 0.8022 - acc: 0.6303 - val_loss: 0.8122 - val_acc: 0.6192
Epoch 7/10
2253/2253 [==============================] - 10s 4ms/step - loss: 0.7941 - acc: 0.6343 - val_loss: 0.7889 - val_acc: 0.6325
Epoch 8/10
2253/2253 [==============================] - 10s 4ms/step - loss: 0.7490 - acc: 0.6613 - val_loss: 0.7288 - val_acc: 0.6818
Epoch 9/10
2253/2253 [==============================] - 10s 4ms/step - loss: 0.7323 - acc: 0.6600 - val_loss: 0.7660 - val_acc: 0.6591
Epoch 10/10
2253/2253 [==============================] - 10s 4ms/step - loss: 0.7040 - acc: 0.6747 - val_loss: 0.7095 - val_acc: 0.6791
751/751 [==============================] - 1s 1ms/step
{'acc': 0.6790945410093518, 'loss': 0.7095216961103813}
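Tracing the spatial sizes again shows how aggressively the 11x11 kernels shrink the feature maps: after the second convolution only a 1x1 map per filter survives, discarding all spatial layout, which helps explain the weaker accuracy above:

size = 32
size = size - 11 + 1   # conv 11x11, valid padding -> 22
size = size // 2       # max-pool 2x2              -> 11
size = size - 11 + 1   # conv 11x11, valid padding -> 1 (no spatial information left)
print(size)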
w0 = model.get_weights()[0]
print (w0.shape)
mlutils.display_imgs(w0)
(11, 11, 3, 96)
i = np.random.randint(len(x_test))
plt.imshow(x_test[i])
plt.axis("off");
acts = mlutils.get_activations(model, x_test[i:i+1])["conv2d/Relu:0"][0]
plt.figure(figsize=(10,10))
for j in range(acts.shape[-1]):   # j indexes the filters; keep i as the image index
    plt.subplot(10, 10, j+1)
    plt.imshow(acts[:, :, j], cmap=plt.cm.Greys_r)
    plt.axis("off")
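mlutils.get_activations is a course helper; an equivalent sketch in plain Keras builds a sub-model that outputs the first convolutional layer's activations (the layer name "conv2d" comes from the model summary):

# sub-model from the original input to the first conv layer's output (post-ReLU)
act_model = tf.keras.Model(inputs=model.input,
                           outputs=model.get_layer("conv2d").output)
acts2 = act_model.predict(x_test[i:i+1])[0]   # same image index i as above
print(acts2.shape)                            # (22, 22, 96) for model C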
idxs = np.random.permutation(len(x_test))[:5]
preds = model.predict(x_test[idxs])
mlutils.show_preds(x_test[idxs],y_test[idxs], preds)
see Class activation maps: https://jacobgil.github.io/deeplearning/class-activation-maps