PyTorch vs TensorFlow on the MNIST Dataset

Aditya Mangal
4 min read · Feb 22, 2023

There is a long-running war over which deep learning framework to use for personal or industrial projects. Every framework has its own benefits depending on the use case, but one question always comes up: which framework should we start with?

Let’s compare two deep learning frameworks, TensorFlow and PyTorch, on the MNIST dataset and try to reach some conclusions. I will take PyTorch first.

First, import the libraries that will be used with PyTorch

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import matplotlib.pyplot as plt
from torchvision import datasets, transforms
from torch.optim.lr_scheduler import StepLR

Loading MNIST Dataset

# pre-processor: resize, convert to a tensor in [0, 1], then normalize
# with the MNIST mean (0.1307) and standard deviation (0.3081)
transform = transforms.Compose([
    transforms.Resize((28, 28)),
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),
])

# load the data
train_dataset = datasets.MNIST(
    'data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(
    'data', train=False, download=True, transform=transform)

# make loaders for data
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=32)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=32)
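
To make the pre-processing concrete: ToTensor scales each 8-bit pixel to [0, 1], and Normalize then computes (x - mean) / std. The following is a minimal pure-Python sketch of that arithmetic for a single pixel. The helper name `normalize_pixel` is made up for illustration and is not part of torchvision.

```python
# Sketch of what ToTensor followed by Normalize((0.1307,), (0.3081,))
# does to one pixel value. normalize_pixel is a hypothetical helper.
MEAN, STD = 0.1307, 0.3081

def normalize_pixel(raw):
    """raw is an 8-bit pixel value in the range 0..255."""
    scaled = raw / 255.0           # ToTensor: map to [0, 1]
    return (scaled - MEAN) / STD   # Normalize: subtract mean, divide by std

black = normalize_pixel(0)
white = normalize_pixel(255)
print(round(black, 4), round(white, 4))  # -0.4242 2.8215
```

So the network sees inputs roughly between -0.42 and 2.82 rather than between 0 and 1.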

Build the Model

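# Build a model
class CNNModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)    # 1 -> 32 channels, 3x3 kernel, stride 1
        self.conv2 = nn.Conv2d(32, 64, 3, 1)   # 32 -> 64 channels, 3x3 kernel, stride 1
        self.fc1 = nn.Linear(9216, 1024)       # 9216 = 64 channels * 12 * 12 after pooling
        self.fc = nn.Linear(1024, 10)          # one output per digit class

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = self.fc(x)
        # log-probabilities, to be paired with F.nll_loss during training
        output = F.log_softmax(x, dim=1)
        return output

net = CNNModel()
optimizer = optim.SGD(net.parameters(), lr=0.01)
# defined here, but the training code uses F.nll_loss on the log-softmax output;
# nn.CrossEntropyLoss expects raw logits
criterion = nn.CrossEntropyLoss()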
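
The 9216 input size of fc1 follows from the layer shapes. A small sketch of the arithmetic, assuming the standard output-size formula for a convolution without padding (the helper `conv_out` is written here for illustration only):

```python
# Trace the spatial size of a 28x28 MNIST image through the model.
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution on a square input."""
    return (size + 2 * padding - kernel) // stride + 1

side = 28
side = conv_out(side, 3)   # conv1: 28 -> 26
side = conv_out(side, 3)   # conv2: 26 -> 24
side = side // 2           # max_pool2d(x, 2): 24 -> 12

flat_features = 64 * side * side   # 64 channels after conv2
print(flat_features)  # 9216
```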

Visualizing the Model Architecture

Training code

train_losses = []
train_counter = []
test_losses = []
test_counter = [i*len(train_loader.dataset) for i in range(5 + 1)]

def train(epoch):
    net.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        optimizer.zero_grad()
        output = net(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        # log progress every 10 batches
        if batch_idx % 10 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
            # number of samples seen so far (uses the loader's batch size of 32, not 64)
            train_counter.append(
                (batch_idx * train_loader.batch_size)
                + ((epoch-1)*len(train_loader.dataset)))
            train_losses.append(loss.item())

def test():
    net.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            output = net(data)
            # sum per-sample losses (reduction='sum' replaces the deprecated size_average=False)
            test_loss += F.nll_loss(output, target, reduction='sum').item()
            # predicted class is the index of the largest log-probability
            pred = output.argmax(dim=1, keepdim=True)
            correct += pred.eq(target.view_as(pred)).sum().item()
    test_loss /= len(test_loader.dataset)
    test_losses.append(test_loss)
    print('\nTest set: Avg. loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
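
The model returns log_softmax and the loops above use F.nll_loss. Together these compute the usual cross-entropy on raw logits, which is why a softmax layer plus a cross-entropy loss (as in the Keras model) is the equivalent setup. A pure-Python sketch of that equivalence on a made-up three-class example; the function names mirror the PyTorch ones but are plain re-implementations for illustration:

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax of a list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def nll_loss(log_probs, target):
    """Negative log-likelihood of the target class."""
    return -log_probs[target]

def cross_entropy(logits, target):
    """Cross-entropy computed directly from raw logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    return -math.log(exps[target] / sum(exps))

logits = [2.0, 0.5, -1.0]   # illustrative values, not from the experiment
target = 0
via_nll = nll_loss(log_softmax(logits), target)
direct = cross_entropy(logits, target)
print(abs(via_nll - direct) < 1e-12)  # True
```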

Training the model

test()
for epoch in range(1, 6):
    train(epoch)
    test()

Total time to run 5 epochs: 745.189 sec. Test accuracy: 98%.
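
The code above does not show how the total time was measured. One common approach, sketched here as an assumption rather than the method actually used, is to wrap the loop with `time.perf_counter`. `run_epochs` is a hypothetical stand-in for the real train/test loop.

```python
import time

def run_epochs(n_epochs):
    # Hypothetical stand-in for the real loop: train(epoch); test()
    total = 0
    for epoch in range(1, n_epochs + 1):
        total += sum(i * i for i in range(10_000))
    return total

start = time.perf_counter()          # monotonic, high-resolution clock
run_epochs(5)
elapsed = time.perf_counter() - start
print(f"Total time to run 5 epochs: {elapsed:.3f} sec")
```

Timing both frameworks on the same machine and device, with the same amount of console logging, keeps the numbers comparable.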

Let's begin the war with TensorFlow

First, import the libraries that will be used with TensorFlow

import tensorflow
from tensorflow.keras.datasets import mnist  # needed for mnist.load_data() below
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.layers import Conv2D, Flatten, Dense, MaxPooling2D
from tensorflow.keras.models import Sequential
import matplotlib.pyplot as plt
from tensorflow.keras.utils import plot_model

MNIST Dataset

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

# scaling pixel values to [0, 1]
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train = x_train / 255.0
x_test = x_test / 255.0
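
`to_categorical` turns integer labels into one-hot vectors, which is the label format that the `categorical_crossentropy` loss expects. A pure-Python sketch of the idea; `one_hot` is a hypothetical helper written for illustration, not the Keras implementation:

```python
def one_hot(labels, num_classes=10):
    """Return one row per label with a 1.0 at the label's index."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows

encoded = one_hot([3, 0])
print(encoded[0])  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```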

Building the Model

model = Sequential()
model.add(Conv2D(32, (3,3), input_shape = (28,28,1), activation='relu'))
model.add(Conv2D(64,(3,3), activation='relu'))
model.add(MaxPooling2D((2,2)))
model.add(Flatten())
model.add(Dense(1024, activation='relu'))
model.add(Dense(10, activation='softmax'))
# compile
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
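
To check that the two models really have the same size, the trainable parameters can be counted by hand from the layer definitions (the same numbers `model.summary()` would be expected to report). A short sketch; the helper functions are written here for illustration:

```python
def conv2d_params(in_channels, out_channels, kernel):
    """Weights plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels

def dense_params(in_features, out_features):
    """Weights plus one bias per output unit."""
    return in_features * out_features + out_features

layers = {
    "conv 1 -> 32": conv2d_params(1, 32, 3),
    "conv 32 -> 64": conv2d_params(32, 64, 3),
    "dense 9216 -> 1024": dense_params(9216, 1024),
    "dense 1024 -> 10": dense_params(1024, 10),
}
total = sum(layers.values())
for name, count in layers.items():
    print(f"{name}: {count:,}")
print(f"total: {total:,}")  # total: 9,467,274
```

Almost all of the parameters sit in the first dense layer, so that layer dominates the cost of each training step in both frameworks.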

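The architecture is the same for both frameworks: two 3x3 convolutions (32 and 64 filters), 2x2 max pooling, a 1024-unit dense layer, and a 10-way output. One small difference is that the Keras dense layer applies ReLU, while fc1 in the PyTorch model has no activation.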

Training the model

history = model.fit(x_train, y_train, validation_split=0.3, epochs=5,batch_size=32)
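
Note what `validation_split=0.3` does: Keras holds out the last 30% of the training arrays for validation and trains on the rest, so each epoch here sees fewer images than the PyTorch loop, which iterates over all 60,000. A quick sketch of the arithmetic (variable names are illustrative):

```python
import math

n_samples = 60000        # MNIST training images
validation_split = 0.3
batch_size = 32

n_val = round(n_samples * validation_split)      # held out for validation
n_train = n_samples - n_val                      # actually used for training
keras_steps = math.ceil(n_train / batch_size)    # batches per epoch in model.fit
pytorch_steps = math.ceil(n_samples / batch_size)  # batches per epoch in train_loader

print(n_train, n_val)              # 42000 18000
print(keras_steps, pytorch_steps)  # 1313 1875
```

To score the Keras model on the same 10,000 test images as the PyTorch model, one would call `model.evaluate(x_test, y_test)` after training.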

Total time to run 5 epochs: 40.925 sec. Test accuracy: 97%.

Now we have reached the end of the war, and we can draw some conclusions about coding effort and training time. In TensorFlow, building a model is simpler than in PyTorch, where many people find it more complex to define the layers and activation functions by hand. On training time, TensorFlow is clearly the winner in this experiment: 40.925 sec versus 745.189 sec for 5 epochs. Every framework has its own advantages, though. I am not claiming TensorFlow is the best in general, but for my use cases it is the best framework in terms of training, maintenance, and scalability.

Don’t forget to share your opinion on the advantages and disadvantages of these frameworks. You can also comment if you know any tips or tricks for them.
