from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(100, activation='softmax')  # 100 classes for CIFAR-100
])

This code snippet defines a Convolutional Neural Network (CNN) model using TensorFlow and Keras. It’s a simple CNN architecture typically used for image classification tasks like those in the CIFAR-10 or CIFAR-100 datasets. Let’s break down each part of the model definition:

The complete code is available here: https://github.com/slidescope/machine-learning


1. Sequential Model

model = models.Sequential([
  • This defines a Sequential model, meaning that layers will be added one after another in a linear fashion (i.e., the output of one layer will be the input to the next).
  • It’s a simple and common way to build neural networks, especially for tasks where the data flows in a linear sequence.
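
For comparison, the same network can be built incrementally with add(); here is a minimal sketch of the first two layers in that style:

from tensorflow.keras import layers, models

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
# ...the remaining layers are appended the same way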

2. Conv2D Layer

layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
  • Conv2D: This is a 2D Convolutional Layer, which is the core building block for CNNs used to detect patterns in images.
  • 32: The number of filters (also called kernels) to use. Each filter detects a different feature (e.g., edges, corners).
  • (3, 3): The size of the filter. In this case, the filter is a 3×3 grid that will slide over the image to perform the convolution operation.
  • activation='relu': The activation function used here is ReLU (Rectified Linear Unit). It outputs the input directly if it's positive, or 0 if it's negative. It introduces non-linearity to help the network learn complex patterns.
  • input_shape=(32, 32, 3): Specifies the shape of the input image. For CIFAR-10 or CIFAR-100, images are 32×32 pixels with 3 color channels (RGB). This is the input to the model.
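
Conv2D uses padding='valid' by default, so a 3×3 filter shrinks each spatial dimension by 2: this first layer maps the 32×32×3 input to a 30×30×32 feature map. Its parameter count is also easy to verify by hand:

# Each 3x3 filter spans all 3 input channels, plus one bias per filter
params = (3 * 3 * 3 + 1) * 32
print(params)  # 896, matching the figure model.summary() reports for this layer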

3. MaxPooling2D Layer

layers.MaxPooling2D((2, 2)),
  • MaxPooling2D: This is a Max Pooling Layer, which reduces the spatial dimensions of the feature maps. It cuts down the number of parameters and computation in the model and helps control overfitting.
  • (2, 2): Specifies the size of the pooling window. It reduces the dimensions of the feature map by taking the maximum value from each 2×2 grid.
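
As a tiny worked example (not part of the model code), here is 2×2 max pooling with stride 2 applied to a 4×4 feature map using plain NumPy:

import numpy as np

x = np.array([[1, 3, 2, 0],
              [5, 6, 1, 4],
              [9, 2, 7, 8],
              [0, 3, 5, 6]])

# Split into non-overlapping 2x2 blocks and take the maximum of each block
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[6 4]
               #  [9 8]]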

4. Second Conv2D Layer

layers.Conv2D(64, (3, 3), activation='relu'),
  • This is another Convolutional Layer, but it uses 64 filters and the same 3×3 filter size.
  • The increase in the number of filters from 32 to 64 allows the model to learn more complex and abstract features as the network deepens.

5. Second MaxPooling2D Layer

layers.MaxPooling2D((2, 2)),
  • Another MaxPooling Layer follows the second convolutional layer, reducing the spatial dimensions again.

6. Third Conv2D Layer

layers.Conv2D(64, (3, 3), activation='relu'),
  • This layer again has 64 filters and a 3×3 filter size.
  • By now, the model is learning increasingly abstract features at higher levels of the network.

7. Flatten Layer

layers.Flatten(),
  • Flatten: This layer flattens the multi-dimensional feature maps into a 1D vector per image. It's necessary because the next layers (dense layers) expect 1D input, not a 2D grid of features.
  • For example, if the previous layer outputs a 4D tensor with shape (batch_size, height, width, channels), flattening turns this into shape (batch_size, height * width * channels).
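
Tracing the shapes through this particular model (valid padding for the convolutions, 2×2 pooling) makes the flattened size concrete:

# Output shape of each layer for a 32x32x3 input image:
# Conv2D(32, 3x3)   -> (30, 30, 32)
# MaxPooling2D(2x2) -> (15, 15, 32)
# Conv2D(64, 3x3)   -> (13, 13, 64)
# MaxPooling2D(2x2) -> (6, 6, 64)   # 13 // 2 = 6; the odd row/column is dropped
# Conv2D(64, 3x3)   -> (4, 4, 64)
# Flatten           -> 4 * 4 * 64 = 1024 values per image
model.summary()  # prints this layer-by-layer breakdown, plus parameter counts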

8. Dense Layer (Fully Connected)

layers.Dense(64, activation='relu'),
  • Dense: A fully connected layer where every neuron is connected to every neuron in the previous layer.
  • 64: The number of neurons in this layer.
  • activation='relu': The activation function used here is ReLU again. It adds non-linearity to the model.
  • This layer will help the model combine features extracted by the convolutional layers to make a decision.

9. Final Dense Layer

layers.Dense(100, activation='softmax')
  • Dense(100): The final fully connected layer has 100 neurons, corresponding to the 100 possible classes in the CIFAR-100 dataset.
  • activation='softmax': The softmax activation function is used for multi-class classification. It converts the output into a probability distribution over the 100 classes, meaning the model will output a vector of 100 probabilities that sum to 1. The class with the highest probability is considered the predicted class.
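
To see what softmax does numerically, here is a small standalone sketch with made-up logits (illustrative values, not actual model output):

import numpy as np

# Softmax: exponentiate each value, then normalize so the results sum to 1
logits = np.array([2.0, 1.0, 0.1])
probs = np.exp(logits) / np.sum(np.exp(logits))
print(probs)        # approximately [0.659, 0.242, 0.099]
print(probs.sum())  # 1.0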

Summary

  • The model is a Convolutional Neural Network (CNN) designed for image classification tasks.
  • It uses 3 convolutional layers, the first two each followed by max-pooling, to extract features from the input images.
  • After the convolutional layers, the model flattens the features and passes them through two fully connected layers to make the final class prediction.
  • The output layer uses softmax to generate probabilities for the 100 classes in CIFAR-100.

This model is a basic CNN architecture. In practice, for better performance, more advanced architectures (e.g., ResNet, VGG) or techniques like data augmentation and dropout might be used.
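
As one illustration of those techniques, here is a hedged sketch that adds built-in Keras augmentation layers and dropout to the same architecture; the flip/rotation settings and the 0.5 dropout rate are arbitrary illustrative choices, not tuned values:

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.RandomFlip('horizontal', input_shape=(32, 32, 3)),  # augmentation: random horizontal flips
    layers.RandomRotation(0.1),                                # augmentation: small random rotations
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.5),  # randomly zeroes half the activations during training
    layers.Dense(100, activation='softmax')
])

The augmentation layers are only active during training; at inference time they pass inputs through unchanged.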

After defining the model architecture with Sequential(), the next steps are to compile the model, train it, and evaluate its performance. Here is the typical flow:

1. Compile the Model

First, you need to compile the model. This step configures the model’s learning process, including specifying the optimizer, loss function, and evaluation metric. For multi-class classification, we use categorical cross-entropy as the loss function and accuracy as the metric.

model.compile(optimizer='adam', 
              loss='categorical_crossentropy', 
              metrics=['accuracy'])
  • optimizer='adam': Adam is an adaptive optimizer that adjusts the learning rate during training. It is widely used in practice.
  • loss='categorical_crossentropy': This loss function is used for multi-class classification problems where the labels are one-hot encoded.
  • metrics=['accuracy']: Accuracy is a common metric to evaluate model performance for classification tasks.
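
Note that categorical_crossentropy expects one-hot encoded labels. The CIFAR-100 labels as loaded by keras.datasets are integer class indices (0–99), so if you skip the one-hot encoding step you would compile with the sparse variant instead:

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',  # expects integer labels
              metrics=['accuracy'])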

2. Train the Model

After compiling, you can train the model using the fit() method. This requires the training data (X_train, y_train); here the test split (X_test, y_test) also serves as validation data.

history = model.fit(X_train, y_train, 
                    epochs=10, 
                    batch_size=64, 
                    validation_data=(X_test, y_test))
  • X_train and y_train: The training data and labels.
  • epochs=10: The number of times the model will iterate over the entire training dataset.
  • batch_size=64: The number of samples that will be processed before the model’s internal parameters are updated.
  • validation_data=(X_test, y_test): The validation data, used to evaluate the model during training after each epoch.
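
The fit() call above assumes X_train, y_train, X_test, and y_test already exist. A minimal sketch of one way to produce them, using the built-in CIFAR-100 loader and one-hot labels to match the categorical_crossentropy loss:

from tensorflow.keras.datasets import cifar100
from tensorflow.keras.utils import to_categorical

# Load CIFAR-100 and scale pixel values from [0, 255] to [0, 1]
(X_train, y_train), (X_test, y_test) = cifar100.load_data()
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0

# One-hot encode the integer labels: shape (n, 1) -> (n, 100)
y_train = to_categorical(y_train, 100)
y_test = to_categorical(y_test, 100)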

3. Evaluate the Model

Once the model is trained, you can evaluate its performance on the test dataset (usually a separate dataset from the training set).

test_loss, test_acc = model.evaluate(X_test, y_test, verbose=2)
print(f'Test accuracy: {test_acc}')
  • X_test and y_test: The test dataset and labels.
  • The function will return the test loss and test accuracy.

4. Make Predictions

After training, you can use the trained model to make predictions on new data.

predictions = model.predict(X_test)
print(predictions)  # Output: Array of probabilities for each class
  • predictions will be a NumPy array where each row is a vector of 100 probabilities, one for each CIFAR-100 class.
  • You can find the predicted class for each sample by taking the argmax of its probability vector:

import numpy as np

predicted_classes = np.argmax(predictions, axis=1)
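
Continuing from the snippets above, you can sanity-check the predictions against the true labels; assuming y_test is still one-hot encoded, its argmax recovers the integer class indices:

true_classes = np.argmax(y_test, axis=1)           # undo the one-hot encoding
print(predicted_classes[:5])                       # first five predicted classes
print(true_classes[:5])                            # first five ground-truth classes
print((predicted_classes == true_classes).mean())  # fraction correct = test accuracy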

5. (Optional) Visualize Training History

To visualize the training process (e.g., loss and accuracy over epochs), you can plot the training history.

import matplotlib.pyplot as plt

# Plot training & validation accuracy values
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

# Plot training & validation loss values
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

This helps in understanding how the model is learning and if it’s overfitting (i.e., high training accuracy but low validation accuracy).

6. Save the Model

You may want to save the trained model for later use, so you don’t need to retrain it every time.

model.save('cifar100_model.h5')
  • This will save the entire model (architecture, weights, and optimizer state) to a file named cifar100_model.h5.
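
Loading the saved model back later is a single call. (The .h5 extension selects the older HDF5 format; newer Keras versions also support the native .keras format with the same API.)

from tensorflow.keras import models

# Restores architecture, weights, and optimizer state in one step
restored = models.load_model('cifar100_model.h5')
restored.summary()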

Summary of the Steps

  1. Compile the model: Set up the optimizer, loss function, and evaluation metric.
  2. Train the model: Use fit() to train the model on the training data.
  3. Evaluate the model: Assess the model’s performance on the test data using evaluate().
  4. Make predictions: Use predict() to classify new data.
  5. Visualize performance: Plot training/validation loss and accuracy.
  6. Save the model: Save the trained model for later use.

If you need help with any specific part of the process, feel free to ask!