[DL] Implementing Convolution & Pooling Layers
This time, let's implement the Convolution Layer and the Pooling Layer.

 

Implementing Convolution & Pooling Layers

4-Dimensional Arrays

In a Convolutional Neural Network (CNN), the data flowing between layers is 4-dimensional.
  • For example, what does a data shape of (10, 1, 28, 28) mean?
  • It means there are 10 pieces of data, each with Height 28, Width 28, and 1 Channel.
  • In Python, this looks like the code below.
import numpy as np

x = np.random.rand(10, 1, 28, 28)  # generate random data
x[0, 0]  # or x[0][0]: access the spatial data of the first channel of the first sample
  • ์—ฌ๊ธฐ์—์„œ 10๊ฐœ์˜ ๋ฐ์ดํ„ฐ์ค‘ ์ฒซ ๋ฒˆ์งธ ๋ฐ์ดํ„ฐ์— ์ ‘๊ทผํ•˜๋ ค๋ฉด? ๋‹จ์ˆœํžˆ x[0]์ด๋ผ๊ณ  ์”๋‹ˆ๋‹ค.
  • Python์˜ index๋Š” 0๋ถ€ํ„ฐ ์‹œ์ž‘ํ•ฉ๋‹ˆ๋‹ค. ๋งˆ์ฐฌ๊ฐ€์ง€๋กœ ๋‘ ๋ฒˆ์งธ ๋ฐ์ดํ„ฐ๋Š” x[1] ์œ„์น˜์— ์žˆ์Šต๋‹ˆ๋‹ค.
x[0].shape # (1, 28, 28)
x[1].shape # (1, 28, 28)
  • To access the spatial data of the first channel of the first sample, write the following.
x[0, 0] # or x[0][0]

 

 

Expanding Data with im2col

im2col is a function that expands (unrolls) the Input Data into a form that is convenient for filtering, i.e., for the Weight computation.
  • As shown in the figure below, applying im2col to 3-dimensional input data turns it into a 2-dimensional array.
    • More precisely, it converts the 4-dimensional data, including the number of samples in the batch, into 2 dimensions.

๋Œ€๋žต์ ์ธ im2col์˜ ๋™์ž‘

  • im2col expands the data so that it is easy to apply the filter.
  • Concretely, as shown in the figure below, each region where the filter is applied (a 3-dimensional block) is stretched out into a single row.
  • Performing this expansion for every filter-application region is what im2col does.

ํ•„ํ„ฐ ์ ์šฉ ์˜์—ญ์„ ์•ž์œผ๋กœ๋ถ€ํ„ฐ ์ˆœ์„œ๋Œ€๋กœ 1์ค„๋กœ ํŽผ์นœ๋‹ค.

  • After expanding the Input Data with im2col, you unroll the Convolution Layer's filters (weights) into single columns and compute the product of the two arrays.
  • This is almost exactly what the Affine layer of a Fully-Connected (FC) network does.

Detailed filter-processing steps of the convolution operation.

  • ์œ„์˜ ๊ทธ๋ฆผ์€ filter๋ฅผ ์„ธ๋กœ๋กœ 1์—ด๋กœ ์ „๊ฐœํ•˜๊ณ , im2col์ด ์ „๊ฐœํ•œ ๋ฐ์ดํ„ฐ์™€ ํ–‰๋ ฌ๊ณฑ์„ ๊ณ„์‚ฐํ›„, ์ถœ๋ ฅ ๋ฐ์ดํ„ฐ๋ฅผ ๋ณ€ํ˜•(Reshape) ํ•ฉ๋‹ˆ๋‹ค.

 

Implementing the Convolutional Layer

Now let's implement the Convolutional Layer.
  • The interface of the im2col function is as follows (a minimal reference sketch appears after the parameter list).
im2col(input_data, filter_h, filter_w, stride=1, pad=0)
  • input_data: input data as a 4-dimensional array of (number of samples, number of channels, height, width)
  • filter_h: filter height
  • filter_w: filter width
  • stride: stride
  • pad: padding
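The im2col used below is imported from common.util, so its body is not shown in this post. As a reference, a minimal sketch consistent with the interface above might look like the following (the name im2col_sketch and the details are my own; the book's version may differ slightly).

import numpy as np

def im2col_sketch(input_data, filter_h, filter_w, stride=1, pad=0):
    # Minimal sketch of an im2col consistent with the interface above (not the book's code).
    N, C, H, W = input_data.shape
    out_h = (H + 2*pad - filter_h) // stride + 1
    out_w = (W + 2*pad - filter_w) // stride + 1

    # Zero-pad only the spatial dimensions
    img = np.pad(input_data, [(0, 0), (0, 0), (pad, pad), (pad, pad)], 'constant')
    col = np.zeros((N, C, filter_h, filter_w, out_h, out_w))

    # Copy each filter offset (y, x) across all output positions at once
    for y in range(filter_h):
        y_max = y + stride * out_h
        for x in range(filter_w):
            x_max = x + stride * out_w
            col[:, :, y, x, :, :] = img[:, :, y:y_max:stride, x:x_max:stride]

    # Rearrange so each row holds one filter-application region: (N*out_h*out_w, C*filter_h*filter_w)
    return col.transpose(0, 4, 5, 1, 2, 3).reshape(N * out_h * out_w, -1)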

 

  • im2col expands the input data into 2 dimensions, taking the filter size, stride, and padding into account.
import sys, os
sys.path.append(os.pardir)
import numpy as np
from common.util import im2col

x1 = np.random.rand(1, 3, 7, 7)  # (number of samples, channels, height, width)
col1 = im2col(x1, 5, 5, stride=1, pad=0)
print(col1.shape)  # (9, 75)

x2 = np.random.rand(10, 3, 7, 7)  # 10 samples
col2 = im2col(x2, 5, 5, stride=1, pad=0)
print(col2.shape)  # (90, 75)
  • Two examples are shown here. The first (x1) has batch size 1 (one sample), 3 channels, and height × width of 7 × 7.
  • The second (x2) has batch size 10 (ten samples) and is otherwise identical to the first (x1).
  • In both cases, the second dimension of the im2col result has 75 elements. This equals the number of elements in one filter (3 channels × 5 × 5 = 75).
  • The number of rows is the number of filter positions: a 5 × 5 filter slides over a 7 × 7 input in 3 × 3 = 9 positions per sample, so the result is (9, 75) for batch size 1 and (90, 75) for batch size 10.
  • Now, using im2col, let's implement the Convolutional Layer.
class Convolution:
    def __init__(self, W, b, stride=1, pad=0):
        # Initialization
        self.W = W  # filter weights, a 4-dimensional array: (number of filters, channels, filter height, filter width)
        self.b = b  # filter biases, a 1-dimensional array: (number of filters,)
        self.stride = stride  # interval at which the filter is applied
        self.pad = pad  # number of zeros padded around the input

    def forward(self, x):
        # Forward pass
        FN, C, FH, FW = self.W.shape  # FN: number of filters, C: channels, FH: filter height, FW: filter width
        N, C, H, W = x.shape  # N: number of samples, C: channels, H: height, W: width
        out_h = int(1 + (H + 2*self.pad - FH) / self.stride)  # output height
        out_w = int(1 + (W + 2*self.pad - FW) / self.stride)  # output width
        
        # Expand the input data and the filters into 2-dimensional arrays and take their dot product
        col = im2col(x, FH, FW, self.stride, self.pad)  # im2col converts the input into a form convenient for filtering
        col_W = self.W.reshape(FN, -1).T  # reshape the filter weights into a 2-dimensional array and transpose
        out = np.dot(col, col_W) + self.b  # dot product of the expanded input and filter weights, plus the bias
        
        # Reshape the result into the output format
        # -1 in reshape lets NumPy infer that dimension from the number of elements
        out = out.reshape(N, out_h, out_w, -1).transpose(0, 3, 1, 2)
        # transpose reorders the axes to (batch size, number of filters, height, width)
        
        return out
  • The Convolutional Layer is initialized with the filters (weights), biases, stride, and padding as arguments.
  • The filters have a 4-dimensional shape (FN, C, FH, FW), where FN is the number of filters, C the number of channels, FH the filter height, and FW the filter width; a quick usage sketch follows below.
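As a quick sanity check, here is a hypothetical usage sketch (the weights are just random numbers) that applies the layer to the x1 array from the im2col example above.

W = np.random.rand(10, 3, 5, 5)   # 10 filters of shape (C=3, FH=5, FW=5)
b = np.zeros(10)                  # one bias per filter
conv = Convolution(W, b, stride=1, pad=0)
out = conv.forward(x1)            # x1 has shape (1, 3, 7, 7)
print(out.shape)                  # (1, 10, 3, 3): N, FN, out_h, out_w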

 

Implementing the Pooling Layer

The Pooling Layer is also implemented using im2col to expand the input data, just like the Convolutional Layer.
  • However, unlike the Convolutional Layer, pooling treats each channel independently.
  • Concretely, as shown in the figure below, the pooling regions are expanded independently for each channel.

Expanding the pooling regions of the input data (example of 2x2 pooling)

  • After this expansion, all that remains is to take the maximum of each row of the expanded matrix and reshape the result into the appropriate form.

Flow of the Pooling Layer implementation; the largest element in each pooling region is shown in gray.

  • ์ด๊ฒƒ์ด Pooling Layer์˜ forward ์ฒ˜๋ฆฌ ํ๋ฆ„์ž…๋‹ˆ๋‹ค. ์ด๊ฑธ Python Code๋กœ ํ•œ๋ฒˆ ๊ตฌํ˜„ํ•ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.
class Pooling:
    def __init__(self, pool_h, pool_w, stride=1, pad=0):
        # Initialization
        self.pool_h = pool_h  # pooling window height
        self.pool_w = pool_w  # pooling window width
        self.stride = stride  # interval at which pooling is applied
        self.pad = pad        # number of zeros padded around the input

    def forward(self, x):
        # Forward pass
        N, C, H, W = x.shape  # N: number of samples, C: channels, H: height, W: width
        out_h = int(1 + (H - self.pool_h) / self.stride)  # output height
        out_w = int(1 + (W - self.pool_w) / self.stride)  # output width
        
        # Expand the input data into a form suitable for pooling
        col = im2col(x, self.pool_h, self.pool_w, self.stride, self.pad)  # convert the input into a form convenient for filtering
        col = col.reshape(-1, self.pool_h * self.pool_w)  # one row per pooling region

        # Max operation
        out = np.max(col, axis=1)  # maximum within each pooling window

        # Reshape the result into the output format
        out = out.reshape(N, out_h, out_w, C).transpose(0, 3, 1, 2)
        # transpose reorders the axes to (batch size, channels, height, width)

        return out
  • The Pooling Layer implementation consists of three steps.
  1. Expand the input data.
  2. Take the row-wise maximum.
  3. Reshape into the appropriate form.
  • With these steps, the forward pass of the Pooling Layer is implemented; a quick usage sketch follows below.
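Here is a small, hypothetical usage sketch of 2x2 max pooling with stride 2, which halves each spatial dimension while leaving the channel count unchanged.

x = np.random.rand(1, 3, 4, 4)            # (N, C, H, W)
pool = Pooling(pool_h=2, pool_w=2, stride=2)
out = pool.forward(x)
print(out.shape)                          # (1, 3, 2, 2): channels preserved, H and W halved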

Implementing a Convolutional Neural Network (CNN)

Now that the Convolutional and Pooling Layers are implemented, let's combine them to build a CNN that recognizes handwritten digits.
  • We will implement a CNN with the structure shown below.

Architecture of a simple CNN

  • This CNN flows in the order "Convolution - ReLU - Pooling - Affine - ReLU - Affine - Softmax".
  • The initializer takes the following arguments.

 

  • input_dim: dimensions of the input data (channels, height, width)
  • conv_param: hyperparameters of the convolution layer (dictionary)
    • filter_num: number of filters
    • filter_size: filter size
    • stride: stride
    • pad: padding
  • hidden_size: number of neurons in the (fully-connected) hidden layer
  • output_size: number of neurons in the (fully-connected) output layer
  • weight_init_std: standard deviation of the weights at initialization

 

Here, the hyperparameters of the Convolutional Layer are given as a dictionary (conv_param).
  • This means the required hyperparameter values are stored like {'filter_num': 30, 'filter_size': 5, 'pad': 0, 'stride': 1}.
  • The code is fairly long, so I will explain it in three parts.
class SimpleConvNet:
    def __init__(self, input_dim=(1, 28, 28), 
                 conv_param={'filter_num': 30, 'filter_size': 5, 'pad': 0, 'stride': 1},
                 hidden_size=100, output_size=10, weight_init_std=0.01):
        # Parameters of the convolution layer
        filter_num = conv_param['filter_num']  # number of filters
        filter_size = conv_param['filter_size']  # size of each filter
        filter_pad = conv_param['pad']  # padding around the image
        filter_stride = conv_param['stride']  # filter stride
        input_size = input_dim[1]  # input image size (assumed square)

        # Compute the output size of the convolution layer
        conv_output_size = (input_size - filter_size + 2 * filter_pad) / filter_stride + 1
        conv_output_size = int(conv_output_size)
        
        # Assuming a 2x2 pooling window with stride 2,
        # each spatial dimension is halved
        pool_output_size = int(filter_num * (conv_output_size / 2) * (conv_output_size / 2))
        
        # Initialize the network weights
        self.params = {}
  • The Convolutional Layer parameters passed to the initializer (__init__) are pulled out of the dictionary, and the output sizes are computed; a worked example with the default arguments follows below.
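For example, with the default arguments above (a 28x28 input, 30 filters of size 5, pad 0, stride 1), the sizes work out as follows.

conv_output_size = (28 - 5 + 2*0) / 1 + 1      # 24
pool_output_size = 30 * (24 / 2) * (24 / 2)    # 30 * 12 * 12 = 4320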

 

self.params = {}
# Initialize the weights of the first convolution layer
self.params['W1'] = weight_init_std * np.random.randn(filter_num, input_dim[0], filter_size, filter_size)
# filter_num: number of filters, input_dim[0]: number of input channels (e.g. 1 for grayscale, 3 for color)
# filter_size: filter height and width
# np.random.randn draws from a standard normal distribution; weight_init_std scales its standard deviation

self.params['b1'] = np.zeros(filter_num)
# Initialize the biases of the first convolution layer
# One bias per filter, all initialized to 0

# Initialize the weights of the second layer (fully-connected layer)
self.params['W2'] = weight_init_std * np.random.randn(pool_output_size, hidden_size)
# pool_output_size: size of the pooling-layer output (number of filters * reduced height * reduced width)
# hidden_size: number of neurons in the hidden layer

self.params['b2'] = np.zeros(hidden_size)
# Initialize the biases of the second layer
# One bias per hidden neuron, all initialized to 0

# Initialize the weights of the third layer (output layer)
self.params['W3'] = weight_init_std * np.random.randn(hidden_size, output_size)
# hidden_size: number of neurons in the hidden layer
# output_size: number of output neurons, i.e. the number of classes

self.params['b3'] = np.zeros(output_size)
# Initialize the biases of the third layer
# One bias per output neuron, all initialized to 0
  • ์œ„์˜ ์ฝ”๋“œ๋Š” Weight Parameter(๊ฐ€์ค‘์น˜ ๋งค๊ฐœ๋ณ€์ˆ˜)๋ฅผ ์ดˆ๊ธฐํ™” ํ•˜๋Š” ๋ถ€๋ถ„์ž…๋‹ˆ๋‹ค.

 

self.layers = OrderedDict()
# Use an OrderedDict so the layers are kept in order

# First convolution layer
self.layers['Conv1'] = Convolution(self.params['W1'], self.params['b1'],
                                   conv_param['stride'], conv_param['pad'])
# Create a Convolution instance, passing the filter weights, biases, stride, and padding

# First activation function: ReLU
self.layers['Relu1'] = Relu()
# ReLU (Rectified Linear Unit) activation; clips negatives to 0, adding non-linearity

# First pooling layer
self.layers['Pool1'] = Pooling(pool_h=2, pool_w=2, stride=2)
# Create a Pooling instance with a 2x2 pooling window and stride 2

# First fully-connected layer
self.layers['Affine1'] = Affine(self.params['W2'], self.params['b2'])
# Affine (fully-connected) layer; pass the weights and biases

# Second activation function: ReLU
self.layers['Relu2'] = Relu()
# Second ReLU activation instance

# Second fully-connected layer
self.layers['Affine2'] = Affine(self.params['W3'], self.params['b3'])
# Second Affine layer, the last layer before the output

# Loss layer: Softmax-with-Loss
self.last_layer = SoftmaxWithLoss()
# SoftmaxWithLoss instance; computes the output probabilities and the loss for classification
  • The layers are added in order to self.layers, an ordered dictionary (OrderedDict).
  • Only the SoftmaxWithLoss layer is stored separately, in the variable last_layer.

 

  • With SimpleConvNet initialized this way, let's implement the predict method, which performs inference, and the loss method, which computes the value of the loss function.
    def predict(self, x):
        for layer in self.layers.values():
            x = layer.forward(x)
        return x

    def loss(self, x, t):
        # x: input data
        # t: ground-truth labels
        y = self.predict(x)
        return self.last_layer.forward(y, t)

 

  • The implementation that computes the gradients via backpropagation is as follows.
def gradient(self, x, t):
    """
    Returns a dictionary holding the gradient of each layer:
    grads['W1'], grads['W2'], ... gradients of each layer's weights
    grads['b1'], grads['b2'], ... gradients of each layer's biases
    """
    # Forward pass
    self.loss(x, t)
    # The loss method runs the forward pass and computes the loss; this is the starting point for backpropagation.

    # Backward pass
    dout = 1
    dout = self.last_layer.backward(dout)
    # Start from the final loss layer; the initial gradient is set to 1.

    layers = list(self.layers.values())
    layers.reverse()
    # Reverse the layer order, since backpropagation runs from the output layer back to the input layer.

    for layer in layers:
        dout = layer.backward(dout)
    # Run the backward pass of each layer in turn; this computes the gradients of each layer's parameters.

    # Store the results
    grads = {}
    grads['W1'], grads['b1'] = self.layers['Conv1'].dW, self.layers['Conv1'].db
    grads['W2'], grads['b2'] = self.layers['Affine1'].dW, self.layers['Affine1'].db
    grads['W3'], grads['b3'] = self.layers['Affine2'].dW, self.layers['Affine2'].db
    # Store the computed gradients in the grads dictionary, one entry per weight and bias.

    return grads
  • To obtain the gradients by backpropagation, a forward pass and a backward pass are carried out in turn.
  • Finally, the gradient of each weight parameter is stored in the dictionary variable grads.
  • That completes the implementation of SimpleConvNet; a hypothetical training-step sketch follows below.

Visualizing the CNN

Let's visualize the convolution layer to see what the CNN is actually looking at.
  • The code below compares the weights before and after training.
  • The results are shown in the image below.
# coding: utf-8
import numpy as np
import matplotlib.pyplot as plt
from simple_convnet import SimpleConvNet

def filter_show(filters, nx=8, margin=3, scale=10):
    """
    c.f. https://gist.github.com/aidiary/07d530d5e08011832b12#file-draw_weight-py
    """
    FN, C, FH, FW = filters.shape
    ny = int(np.ceil(FN / nx))

    fig = plt.figure()
    fig.subplots_adjust(left=0, right=1, bottom=0, top=1, hspace=0.05, wspace=0.05)

    for i in range(FN):
        ax = fig.add_subplot(ny, nx, i+1, xticks=[], yticks=[])
        ax.imshow(filters[i, 0], cmap=plt.cm.gray_r, interpolation='nearest')
    plt.show()


network = SimpleConvNet()
# Weights after random initialization
filter_show(network.params['W1'])

# ํ•™์Šต๋œ ๊ฐ€์ค‘์น˜
network.load_params("params.pkl")
filter_show(network.params['W1'])

Weights of the first convolution layer before and after training

  • The weight elements are real numbers, but in the images they are normalized so that the smallest value maps to black (0) and the largest to white (255).
  • Before training the filters are random and show no pattern, but after training they show regularities such as stripes and blobs.
  • Such filters respond to edges (boundaries where the color changes) and blobs (locally clustered regions).

Filters responding to horizontal and vertical edges

  • In output image 1, white pixels appear along vertical edges, and in output image 2 they appear mostly along horizontal edges.
  • This is the result of selecting two trained filters and convolving them with the input image.
  • 'filter 1' responds to vertical edges, while 'filter 2' responds to horizontal edges.
  • In this way, convolution filters can extract primitive information such as edges and blobs.

 

Layer ๊นŠ์ด์— ๋”ฐ๋ฅธ ์ถ”์ถœ ์ •๋ณด์˜ ๋ณ€ํ™”

As the layers get deeper, the extracted information (more precisely, what the strongly responding neurons capture) becomes more abstract.
  • The first convolution layer extracts low-level information such as edges and blobs, and the deeper the layer, the more abstract the extracted information becomes.
    • (edges -> textures -> object parts, and so on)
  • The network stacks several convolution & pooling layers and finally passes through Fully-Connected (FC) layers to produce the output.

CNN์˜ ํ•ฉ์„ฑ๊ณฑ ๊ณ„์ธต์—์„œ ์ถ”์ถœ๋˜๋Š” ์ •๋ณด.

  • Neurons in the first layer respond to edges & blobs, the third layer to textures, the fifth layer to object parts, and the final Fully-Connected (FC) layer to object classes.