[Paper Review] Implementing VGGNet in Code (with PyTorch)
I kept telling myself I should read more papers. Having finally worked up the courage to read one, I'll now implement what I read in code.

VGGNet Review

My review of the paper is at the link below.

 

[Paper Review] VGGnet Review

A write-up of the VGGNet paper (2014), "Very Deep Convolutional Networks for Large-Scale Image Recognition".

daehyun-bigbread.tistory.com


VGGNet Architecture 

Now let's implement VGGNet in code. I implemented the model in column D of the paper's configuration table (VGG16).

  • Image input - 224 x 224 RGB
  • Convolution stride - fixed at 1 pixel
  • 3 x 3 convolution x 2 (64 channels)
  • Max pooling - 2 x 2 pixel window, stride = 2
  • 3 x 3 convolution x 2 (128 channels)
  • Max pooling - 2 x 2 pixel window, stride = 2
  • 3 x 3 convolution x 3 (256 channels)
  • Max pooling - 2 x 2 pixel window, stride = 2
  • 3 x 3 convolution x 3 (512 channels)
  • Max pooling - 2 x 2 pixel window, stride = 2
  • 3 x 3 convolution x 3 (512 channels)
  • Max pooling - 2 x 2 pixel window, stride = 2
  • FC (fully connected layer) - 4096, ReLU
  • FC (fully connected layer) - 4096, ReLU
  • FC (fully connected layer) - 1000, softmax
Why use 3 x 3 filters here? In short: stacking small filters lets the network go deeper and adds more non-linearity (one ReLU per layer), while covering the same receptive field as a single larger filter with fewer weights.
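To make the 3 x 3 argument concrete, here is a small sketch that compares three stacked 3 x 3 convolutions with a single 7 x 7 convolution. The helper names `conv_weights` and `receptive_field` and the channel count `C = 256` are illustrative choices of mine, not something from the paper's code. Biases are ignored, and input and output channels are assumed equal.

```python
def conv_weights(kernel_size, channels, num_layers=1):
    """Weight count (biases ignored) for stacked convs with `channels` in and out."""
    return num_layers * (kernel_size * kernel_size * channels * channels)


def receptive_field(kernel_size, num_layers):
    """Receptive field of `num_layers` stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel_size - 1)


C = 256  # illustrative channel count
stacked_3x3 = conv_weights(3, C, num_layers=3)  # 3 * (3*3*C*C) = 27 * C^2
single_7x7 = conv_weights(7, C)                 # 7*7*C*C     = 49 * C^2

print(receptive_field(3, 3), receptive_field(7, 1))  # both cover a 7 x 7 region
print(stacked_3x3, single_7x7, stacked_3x3 / single_7x7)
```

Both options see a 7 x 7 region of the input, but the stack uses 27 C² weights instead of 49 C² and applies three non-linearities instead of one.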

What's Needed to Implement VGG16

Below I lay out the pieces needed to implement VGG16.

To implement VGG16, you define the network architecture, set up the loss function and optimizer (the "compile" step, in Keras terms), and then set up the procedures for training and evaluation.

1. Data preparation

Split the data into training, validation, and test sets, and preprocess it into the form the model expects as input.

2. Defining the model architecture

  • Input layer: the size of the input image (e.g. 224x224x3)
  • Convolution layers: several convolution layers (3x3 filters)
  • Pooling layers: mainly max pooling layers (2x2 pooling)
  • Fully connected (FC) layers: typically 2 to 3 fully connected layers
  • Output layer: an output layer with a softmax activation (e.g. one neuron per class)

3. Compiling the model

  • Loss function: classification problems usually use categorical cross-entropy (categorical_crossentropy in Keras, nn.CrossEntropyLoss in PyTorch).
  • Optimizer: for example Adam, RMSprop, or SGD.
  • Metrics: set evaluation metrics such as accuracy.

4. Training the model

Train the model on the training data. This includes setting hyperparameters such as the batch size and the number of epochs.

5. Model evaluation and prediction

Use the trained model to make predictions on new data, and evaluate its performance on the validation and test data.


VGG16 Implementation with PyTorch

Now let's implement it in PyTorch.

Loading Libraries and Data

The paper trains on a dataset with 1000 classes (ImageNet). Because I am running the model code locally, I used the CIFAR-10 dataset instead, which has 10 classes.

 

CIFAR-10 and CIFAR-100 datasets

www.cs.toronto.edu

 

Let's load the required libraries and the dataset, and run the preprocessing.

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np
# Dataset preprocessing: upscale to 224 x 224, convert to a tensor, normalize each channel to [-1, 1]
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# CIFAR-10 Dataset download & load
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

 

The input images are resized to 224 x 224 because the network in the paper takes 224 x 224 RGB images as input, so the images have to be brought to that size before they are fed in.
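The input size matters because the first fully connected layer has a fixed number of inputs. As a rough sketch (the function names `feature_map_size` and `flatten_dim` are mine, for illustration), the arithmetic below traces the spatial size through the five 2 x 2, stride-2 max-pooling layers. It assumes the 3 x 3 convolutions use padding 1 and stride 1, so they leave the spatial size unchanged.

```python
def feature_map_size(input_size, num_pools=5):
    """Spatial size after `num_pools` 2x2 max-pool layers with stride 2."""
    size = input_size
    for _ in range(num_pools):
        size = size // 2  # each pooling layer halves height and width
    return size


def flatten_dim(input_size, channels=512):
    """Number of features entering the first fully connected layer."""
    side = feature_map_size(input_size)
    return channels * side * side


print(feature_map_size(224), flatten_dim(224))  # 224 x 224 input
print(feature_map_size(32), flatten_dim(32))    # raw 32 x 32 CIFAR-10 input
```

A 224 x 224 input ends up as 512 x 7 x 7 = 25088 features. A raw 32 x 32 CIFAR-10 image would shrink to 512 x 1 x 1 = 512 features, which would not match a first FC layer built for 25088 inputs.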

 

# Inspect a sample
index = 1  # index of the sample to inspect
image, label = trainset[index]  # split into image and label

# Convert to a numpy array for visualization
image_np = image.numpy().transpose((1, 2, 0))  # (C, H, W) -> (H, W, C)
image_np = image_np * 0.5 + 0.5  # undo Normalize(mean=0.5, std=0.5) so values are back in [0, 1]

# Display the image
plt.imshow(image_np)
plt.title(classes[label])
plt.show()

Because the 32 x 32 image has been upscaled to 224 x 224, it looks blurry.

 


VGG16 Model Code

Below is the model code.
import torch.nn as nn

class VGG16(nn.Module):
  def __init__(self):
    super(VGG16, self).__init__()
    
    self.features = nn.Sequential(
        # Block 1 (two 3x3 convolutions, 64 filters)
        nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1), # Input: (3, 224, 224) -> Output: (64, 224, 224)
        nn.ReLU(inplace=True),
        nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=2, stride=2), # Max pooling (2x2) with stride 2 -> Output: (64, 112, 112)

        # Block 2 (two 3x3 convolutions, 128 filters)
        nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1), # Input (64, 112, 112) -> Output (128, 112, 112)
        nn.ReLU(inplace=True),
        nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=2, stride=2), # Max pooling (2x2) with stride 2 -> Output (128, 56, 56)

        # Block 3 (three 3x3 convolutions, 256 filters)
        nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1), # Input (128, 56, 56) -> Output (256, 56, 56)
        nn.ReLU(inplace=True),
        nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1),
        nn.ReLU(inplace=True),
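        nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1), # NOTE: no ReLU follows this conv, unlike the paper's VGG16, which applies ReLU after every conv layer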
        nn.MaxPool2d(kernel_size=2, stride=2), # Max pooling (2x2) with stride 2 -> Output (256, 28, 28)

        # Block 4 (three 3x3 convolutions, 512 filters)
        nn.Conv2d(256, 512, kernel_size=3, stride=1, padding=1), # Input (256, 28, 28) -> Output (512, 28, 28)
        nn.ReLU(inplace=True),
        nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1),
        nn.ReLU(inplace=True),
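        nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1), # NOTE: no ReLU follows this conv, unlike the paper's VGG16, which applies ReLU after every conv layer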
        nn.MaxPool2d(kernel_size=2, stride=2), # Max pooling (2x2) with stride 2 -> Output (512, 14, 14)

        # Block 5 (three 3x3 convolutions, 512 filters)
        nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1), # Input (512, 14, 14) -> Output (512, 14, 14)
        nn.ReLU(inplace=True),
        nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1),
        nn.ReLU(inplace=True),
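        nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1), # NOTE: no ReLU follows this conv, unlike the paper's VGG16, which applies ReLU after every conv layer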
        nn.MaxPool2d(kernel_size=2, stride=2), # Max pooling (2x2) with stride 2 -> Output (512, 7, 7)
        )
    
    self.classifier = nn.Sequential(
        nn.Flatten(),
        nn.Linear(512 * 7 * 7, 4096), # First FC Layer (4096)
        nn.ReLU(inplace=True),
        nn.Linear(4096, 4096), # Second FC Layer (4096)
        nn.ReLU(inplace=True),
        nn.Linear(4096, 10), # Third FC Layer: the paper uses 1000 outputs for its 1000 classes, but this dataset has 10 classes, so it is set to 10
    )

  def forward(self, x):
    x = self.features(x)
    x = self.classifier(x)
    return x

model = VGG16()
print(model)
VGG16(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (16): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (17): ReLU(inplace=True)
    (18): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (19): ReLU(inplace=True)
    (20): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (21): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (22): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (23): ReLU(inplace=True)
    (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (25): ReLU(inplace=True)
    (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (27): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (classifier): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=25088, out_features=4096, bias=True)
    (2): ReLU(inplace=True)
    (3): Linear(in_features=4096, out_features=4096, bias=True)
    (4): ReLU(inplace=True)
    (5): Linear(in_features=4096, out_features=10, bias=True)
  )
)
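In the printout above, the third convolution of blocks 3 to 5 (indices 14, 20, and 26) is followed directly by MaxPool2d, with no ReLU in between. The paper applies ReLU after every convolution layer. One way to avoid this kind of slip is to build the feature extractor from a configuration list. The following is a minimal sketch of that idea, not the code used for the results in this post. `VGG16_CFG` and `make_features` are hypothetical names introduced here.

```python
import torch
import torch.nn as nn

# Column D of the paper's configuration table: integers are conv output
# channels, "M" marks a 2x2 max-pooling layer with stride 2.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def make_features(cfg, in_channels=3):
    """Build the convolutional part, adding a ReLU after every conv layer."""
    layers = []
    for v in cfg:
        if v == "M":
            layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
        else:
            layers.append(nn.Conv2d(in_channels, v, kernel_size=3, stride=1, padding=1))
            layers.append(nn.ReLU(inplace=True))
            in_channels = v
    return nn.Sequential(*layers)


features = make_features(VGG16_CFG)
num_convs = sum(isinstance(m, nn.Conv2d) for m in features)
num_relus = sum(isinstance(m, nn.ReLU) for m in features)
print(num_convs, num_relus, len(features))
```

With this builder, every one of the 13 convolutions gets its own ReLU, so the Sequential has 31 entries instead of the 28 printed above.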

 

 

In the paper, softmax is applied after the last FC layer. I did not add it here because loss functions such as nn.CrossEntropyLoss already include softmax (more precisely, log-softmax), so there is no need to apply it in the model's final layer.

The model outputs logits, and the loss function applies softmax internally to turn them into class probabilities.

 


Model Compile

Define the loss function, the optimizer, and so on.
# ๋ชจ๋ธ ์ดˆ๊ธฐํ™”
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = VGG16().to(device)
# ์†์‹ค ํ•จ์ˆ˜์™€ ์˜ตํ‹ฐ๋งˆ์ด์ € ์ •์˜
criterion = nn.CrossEntropyLoss()  # ๊ต์ฐจ ์—”ํŠธ๋กœํ”ผ ์†์‹ค ํ•จ์ˆ˜
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)  # SGD ์˜ตํ‹ฐ๋งˆ์ด์ €

Defining the Training and Evaluation Functions

Train the model on the training data. This includes setting hyperparameters such as the batch size and the number of epochs.

def train(model, device, train_loader, optimizer, epoch):
    model.train()  # put the model in training mode
    train_loss = 0
    correct = 0
    total = 0
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)  # move the batch to the device
        optimizer.zero_grad()  # reset gradients from the previous step
        output = model(data)  # forward pass
        loss = criterion(output, target)  # compute the loss
        loss.backward()  # backpropagate to compute gradients
        optimizer.step()  # update the weights

        train_loss += loss.item()  # accumulate the batch loss

        # Compute training accuracy
        pred = output.argmax(dim=1, keepdim=True)
        correct += pred.eq(target.view_as(pred)).sum().item()
        total += target.size(0)

        if batch_idx % 100 == 0:  # log every 100 batches
            print(f'Train Epoch: {epoch} [{batch_idx * len(data)}/{len(train_loader.dataset)} '
                  f'({100. * batch_idx / len(train_loader):.0f}%)]\tLoss: {loss.item():.6f}')

    train_loss /= len(train_loader)  # average loss per batch
    train_accuracy = 100. * correct / total
    return train_loss, train_accuracy
# ๋ชจ๋ธ ํ‰๊ฐ€ ํ•จ์ˆ˜ ์ •์˜
def test(model, device, test_loader):
    model.eval()  # ๋ชจ๋ธ์„ ํ‰๊ฐ€ ๋ชจ๋“œ๋กœ ์„ค์ •
    test_loss = 0
    correct = 0
    with torch.no_grad():  # ํ‰๊ฐ€ ์‹œ์—๋Š” ๊ธฐ์šธ๊ธฐ๋ฅผ ๊ณ„์‚ฐํ•˜์ง€ ์•Š์Œ
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += criterion(output, target).item()  # ์†์‹ค ํ•ฉ์‚ฐ
            pred = output.argmax(dim=1, keepdim=True)  # ๊ฐ€์žฅ ๋†’์€ ํ™•๋ฅ ์„ ๊ฐ€์ง„ ํด๋ž˜์Šค ์˜ˆ์ธก
            correct += pred.eq(target.view_as(pred)).sum().item()  # ๋งž์ถ˜ ๊ฐœ์ˆ˜ ํ•ฉ์‚ฐ

    test_loss /= len(test_loader.dataset)
    test_accuracy = 100. * correct / len(test_loader.dataset)
    print(f'\nTest set: Average loss: {test_loss:.4f}, Accuracy: {correct}/{len(test_loader.dataset)} '
          f'({test_accuracy:.0f}%)\n')
    return test_loss, test_accuracy

 

Ideally I would have trained for as many epochs as the paper does (it reports 74 epochs), but because of time constraints I will train for only 10 epochs.

 

epochs = 10
train_losses, test_losses, train_accuracies, test_accuracies = [], [], [], []
# ๋ชจ๋ธ ํ•™์Šต
for epoch in range(1, epochs + 1):
    train_loss, train_accuracy = train(model, device, trainloader, optimizer, epoch)
    test_loss, test_accuracy = test(model, device, testloader)
    train_losses.append(train_loss)
    test_losses.append(test_loss)
    train_accuracies.append(train_accuracy)
    test_accuracies.append(test_accuracy)

๋ชจ๋ธ ํ‰๊ฐ€ ๋ฐ ์˜ˆ์ธก

๋ชจ๋ธ์˜ ์ „๋ฐ˜์ ์ธ Architecture & ์–ผ๋งˆ๋‚˜ Over, Underfitting์ด ๋˜์—ˆ๋Š”์ง€ ๊ทธ๋ž˜ํ”„๋ฅผ ๊ทธ๋ ค ํ•œ๋ฒˆ ํ™•์ธํ•ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.
from torchsummary import summary
summary(model, input_size=(3, 224, 224))
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 224, 224]           1,792
              ReLU-2         [-1, 64, 224, 224]               0
            Conv2d-3         [-1, 64, 224, 224]          36,928
              ReLU-4         [-1, 64, 224, 224]               0
         MaxPool2d-5         [-1, 64, 112, 112]               0
            Conv2d-6        [-1, 128, 112, 112]          73,856
              ReLU-7        [-1, 128, 112, 112]               0
            Conv2d-8        [-1, 128, 112, 112]         147,584
              ReLU-9        [-1, 128, 112, 112]               0
        MaxPool2d-10          [-1, 128, 56, 56]               0
           Conv2d-11          [-1, 256, 56, 56]         295,168
             ReLU-12          [-1, 256, 56, 56]               0
           Conv2d-13          [-1, 256, 56, 56]         590,080
             ReLU-14          [-1, 256, 56, 56]               0
           Conv2d-15          [-1, 256, 56, 56]         590,080
        MaxPool2d-16          [-1, 256, 28, 28]               0
           Conv2d-17          [-1, 512, 28, 28]       1,180,160
             ReLU-18          [-1, 512, 28, 28]               0
           Conv2d-19          [-1, 512, 28, 28]       2,359,808
             ReLU-20          [-1, 512, 28, 28]               0
           Conv2d-21          [-1, 512, 28, 28]       2,359,808
        MaxPool2d-22          [-1, 512, 14, 14]               0
           Conv2d-23          [-1, 512, 14, 14]       2,359,808
             ReLU-24          [-1, 512, 14, 14]               0
           Conv2d-25          [-1, 512, 14, 14]       2,359,808
             ReLU-26          [-1, 512, 14, 14]               0
           Conv2d-27          [-1, 512, 14, 14]       2,359,808
        MaxPool2d-28            [-1, 512, 7, 7]               0
          Flatten-29                [-1, 25088]               0
           Linear-30                 [-1, 4096]     102,764,544
             ReLU-31                 [-1, 4096]               0
           Linear-32                 [-1, 4096]      16,781,312
             ReLU-33                 [-1, 4096]               0
           Linear-34                   [-1, 10]          40,970
================================================================
Total params: 134,301,514
Trainable params: 134,301,514
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 208.76
Params size (MB): 512.32
Estimated Total Size (MB): 721.65
----------------------------------------------------------------
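The parameter counts in the summary can be reproduced by hand: a convolution has k·k·C_in·C_out weights plus C_out biases, and a linear layer has N_in·N_out weights plus N_out biases (ReLU, MaxPool2d, and Flatten have none). The sketch below redoes that arithmetic. The helper names `conv_params` and `linear_params` are mine.

```python
def conv_params(c_in, c_out, k=3):
    """Weights plus biases of a k x k convolution."""
    return k * k * c_in * c_out + c_out


def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# (in_channels, out_channels) of the 13 conv layers, in order
conv_shapes = [(3, 64), (64, 64), (64, 128), (128, 128),
               (128, 256), (256, 256), (256, 256),
               (256, 512), (512, 512), (512, 512),
               (512, 512), (512, 512), (512, 512)]
# (in_features, out_features) of the 3 FC layers
fc_shapes = [(512 * 7 * 7, 4096), (4096, 4096), (4096, 10)]

conv_total = sum(conv_params(i, o) for i, o in conv_shapes)
fc_total = sum(linear_params(i, o) for i, o in fc_shapes)
total = conv_total + fc_total

print(f"{conv_total:,}  {fc_total:,}  {total:,}")
print(f"FC share of parameters: {fc_total / total:.1%}")
```

The total matches the 134,301,514 reported above. Roughly 89% of the parameters sit in the three fully connected layers, and the first one alone accounts for 102,764,544.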

 

# Plot the accuracy curves
plt.plot(range(1, epochs + 1), train_accuracies, label='Train Accuracy')
plt.plot(range(1, epochs + 1), test_accuracies, label='Test Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.title('Train and Test Accuracy over Epochs')
plt.legend()
plt.show()

Looking at the resulting graph, we can see that the model overfits.

That said, I think this is because, with a batch size of 64 and about 10 epochs, the model was trained for too long. The model also has about 134 million parameters and is trained here without the dropout and weight decay that the paper uses, which likely contributes as well.
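To put a number on the overfitting instead of reading it off the plot, one option is to compute the gap between training and test accuracy per epoch and find the epoch where test accuracy peaks. The sketch below does this for lists shaped like `train_accuracies` and `test_accuracies` from the training loop. The accuracy values are hypothetical placeholders, not results from this run, and the helper names are mine.

```python
def generalization_gap(train_acc, test_acc):
    """Per-epoch difference between training and test accuracy (in points)."""
    return [tr - te for tr, te in zip(train_acc, test_acc)]


def best_epoch(test_acc):
    """1-based epoch with the highest test accuracy (first one on ties)."""
    best_index = max(range(len(test_acc)), key=lambda i: test_acc[i])
    return best_index + 1


# Hypothetical accuracies for illustration only
example_train_acc = [45.0, 62.0, 74.0, 83.0, 90.0, 95.0]
example_test_acc = [48.0, 61.0, 68.0, 70.0, 69.5, 69.0]

gaps = generalization_gap(example_train_acc, example_test_acc)
best = best_epoch(example_test_acc)
print(gaps)
print(best)
```

A gap that keeps growing while test accuracy stalls or drops is the usual sign of overfitting. Stopping at (or restoring a checkpoint from) the best epoch is one simple mitigation.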