Day6 : CNN
Study Language

📌 What is a CNN?


A CNN (Convolutional Neural Network) is an artificial neural network specialized for image recognition and classification. Much like human visual processing, it extracts and learns features from local regions of the input. It reflects the spatial structure of an image better than a conventional MLP and is widely used in computer vision (CV).


<Convolution Layer>

  • Applies filters (kernels) to the input image to produce feature maps
  • 3×3 filters are the usual choice (e.g. VGGNet) → stacking small filters extracts varied features with fewer parameters than one large filter
  • The filter depth automatically matches the number of input channels (e.g. RGB → 3)
  • Stride: the step size of the filter; a small stride samples the input more finely, a large stride shrinks the output and runs faster
  • Padding: set padding=same to keep the output feature map the same size as the input (at stride 1)
  • After the convolution, an activation function (ReLU) is applied to introduce non-linearity
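
The effect of stride and padding on the output size can be checked with a little arithmetic. The sketch below is a minimal illustration rather than part of the original notes: the helper names conv_output_size and same_padding are hypothetical, and it assumes a square input, symmetric zero padding and an odd kernel size.

```python
def conv_output_size(n, k, stride=1, padding=0):
    """Spatial output size for an n x n input and a k x k filter.

    Uses the standard formula floor((n + 2*padding - k) / stride) + 1.
    """
    return (n + 2 * padding - k) // stride + 1


def same_padding(k):
    """Padding per side that preserves the input size at stride 1 (odd k)."""
    return (k - 1) // 2


# 10x10 input, 3x3 filter, no padding: the map shrinks by 2 in each dimension
print(conv_output_size(10, 3))                            # 8
# "same" padding keeps the size at stride 1
print(conv_output_size(10, 3, padding=same_padding(3)))   # 10
# a larger stride produces a smaller map
print(conv_output_size(10, 3, stride=2))                  # 4
```

With no padding and stride 1, a 3×3 filter always removes two rows and two columns, which is why the hands-on code later allocates a feature map of shape (H - 2, W - 2).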

<Pooling Layer>

  • MaxPooling: takes the maximum of each pooling region → keeps only the strongest features
  • AveragePooling: takes the mean of each region
  • GlobalAveragePooling: reduces each feature map to its overall mean, with no Flatten step (GoogLeNet)
  • Reduces computation + helps prevent overfitting + summarizes spatial structure
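
The three pooling variants above can be compared on one small map. This is a minimal NumPy sketch under stated assumptions: the function names are hypothetical, the window is fixed at 2×2 with stride 2, and the input height and width are assumed to be even.

```python
import numpy as np


def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2 (assumes even height and width)."""
    h, w = fmap.shape
    # Split each axis into (block index, offset inside block), reduce over offsets
    return fmap.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


def average_pool_2x2(fmap):
    """2x2 average pooling with stride 2 (assumes even height and width)."""
    h, w = fmap.shape
    return fmap.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def global_average_pool(fmaps):
    """Reduce a stack of maps (channels, H, W) to one mean per channel."""
    return fmaps.mean(axis=(1, 2))


fmap = np.array([[1., 2., 5., 6.],
                 [3., 4., 7., 8.],
                 [0., 0., 1., 1.],
                 [0., 4., 1., 1.]])

print(max_pool_2x2(fmap))       # [[4. 8.] [4. 1.]]
print(average_pool_2x2(fmap))   # [[2.5 6.5] [1.  1. ]]
print(global_average_pool(np.stack([fmap, 2 * fmap])))  # [2.75 5.5 ]
```

Max pooling keeps the single 4 in the bottom-left block, while average pooling dilutes it to 1.0; global average pooling collapses each whole map to one number, so no Flatten layer is needed afterwards.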

<Fully Connected Layer>

  • A Flatten layer converts the feature maps into a 1D vector
  • The vector then passes through fully connected layers to produce one output value per class
  • softmax is typically used as the output activation to turn those values into probabilities
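
The Flatten → fully connected → softmax path can be sketched in a few lines of NumPy. Everything concrete here is an assumption for illustration: the 4×4×3 feature-map shape, the two output classes and the random weights stand in for values a trained network would provide.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1D vector of class scores."""
    z = z - np.max(z)          # shifting does not change the result
    e = np.exp(z)
    return e / e.sum()


rng = np.random.default_rng(0)

feature_maps = rng.random((4, 4, 3))      # hypothetical pooled output (H, W, C)
x = feature_maps.reshape(-1)              # Flatten: 4 * 4 * 3 = 48 values

n_classes = 2                             # assumed number of classes
W = rng.normal(scale=0.1, size=(n_classes, x.size))   # untrained stand-in weights
b = np.zeros(n_classes)

scores = W @ x + b                        # fully connected layer
probs = softmax(scores)                   # probabilities that sum to 1

print(x.shape, probs, probs.sum())
```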

🧮 Example flow

  • Example: if the final output is (0.7, 0) and the target is (1, 0), the error is 0.3
  • This error is propagated backwards through the network (backpropagation) to update the weights
  • An optimization algorithm such as gradient descent performs the update
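
To make the update step concrete, here is a minimal sketch of one gradient descent step on a single linear layer. The input x, the weights W, the learning rate and the squared-error loss are all assumptions chosen so that the output reproduces the (0.7, 0) example above; a real CNN would backpropagate through every layer.

```python
import numpy as np

x = np.array([1.0, 0.5])            # hypothetical input to the last layer
W = np.array([[0.7, 0.0],
              [0.0, 0.0]])          # chosen so that W @ x == (0.7, 0)
target = np.array([1.0, 0.0])
lr = 0.1                            # assumed learning rate


def forward(weights):
    """Linear output layer without activation, for simplicity."""
    return weights @ x


y = forward(W)
error_before = np.abs(target - y).sum()           # 0.3, as in the example

# Loss L = 0.5 * ||y - target||^2, so dL/dW = outer(y - target, x)
grad_W = np.outer(y - target, x)
W_new = W - lr * grad_W                            # one gradient descent step

error_after = np.abs(target - forward(W_new)).sum()
print(error_before, error_after)                   # error shrinks after the step
```

With these numbers the error falls from 0.3 to about 0.2625 after one step; repeating the step keeps shrinking it.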

👁️ Similarity between CNN models and human visual processing

  • The human visual cortex also processes simple visual information first, then increasingly complex features
  • A CNN likewise extracts more complex features as its layers get deeper
  • The progression from low-level edges → high-level patterns resembles how visual information is processed

🧠 Representative CNN architectures

  • AlexNet (2012): the architecture that first made CNNs famous, 8 layers
  • VGGNet (2014): repeated 3×3 filters, simple and effective structure
  • GoogLeNet: Inception modules + Global Average Pooling
  • ResNet: uses residual blocks → performance holds up even as the network gets deeper

🔁 Transfer Learning

  • Reuses the weights of a pretrained model, so training is possible even with little data
  • Feature Extraction: keep the existing structure and train only a new output layer
  • Fine-Tuning: freeze some layers and retrain the rest
  • Can deliver strong performance when data is scarce
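
The feature-extraction variant can be illustrated without a deep learning framework. The sketch below is only a toy under loud assumptions: a fixed random matrix W_backbone stands in for pretrained layers, the dataset is synthetic, and a single logistic output unit plays the role of the new output layer. It shows the mechanics (frozen weights, trainable head), not real transfer learning performance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained backbone (hypothetical: random instead of learned)
W_backbone = rng.normal(size=(8, 4))


def extract_features(x):
    """Frozen layers: W_backbone is never updated during training."""
    return np.maximum(x @ W_backbone.T, 0.0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Tiny synthetic dataset: 20 samples, binary labels
X = rng.normal(size=(20, 4))
y = (X[:, 0] > 0).astype(float)

feats = extract_features(X)          # computed once, since the backbone is frozen
W_head = np.zeros(8)                 # new output layer, the only trainable part
b_head = 0.0


def loss(w, b):
    """Binary cross-entropy of the output layer on the frozen features."""
    p = sigmoid(feats @ w + b)
    eps = 1e-12
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))


backbone_before = W_backbone.copy()
loss_before = loss(W_head, b_head)

for _ in range(200):                 # plain gradient descent on the head only
    p = sigmoid(feats @ W_head + b_head)
    W_head -= 0.05 * (feats.T @ (p - y) / len(y))
    b_head -= 0.05 * np.mean(p - y)

loss_after = loss(W_head, b_head)
print(loss_before, loss_after)
```

Fine-tuning would differ only in that some of the backbone weights would also receive gradient updates, usually with a small learning rate.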

👨‍💻 Hands-on practice


💡 Code : CNN Layer implementation

import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

# Convolution at one position: element-wise product of patch and filter, then sum
def conv(a, b):
    c = np.array(a) * np.array(b)
    return np.sum(c)

# MaxPooling for a single 2D map (2x2 window, stride 2)
def MaxPooling(nimg):  # 2d input
    nimg = np.array(nimg)
    i0, j0 = nimg.shape
    # Output size is ceil(size / 2), so odd-sized maps keep their last row/column
    i1 = (i0 + 1) // 2
    j1 = (j0 + 1) // 2
    output = np.zeros((i1, j1))

    # Pad an odd number of rows with one row of zeros.
    # Zero padding is only safe for non-negative input (here: values after ReLU).
    if i0 % 2 == 1:
        i0 += 1
        tmp = np.zeros((1, j0))
        nimg = np.concatenate([nimg, tmp], axis=0)

    # Pad an odd number of columns with one column of zeros
    if j0 % 2 == 1:
        j0 += 1
        tmp = np.zeros((i0, 1))
        nimg = np.concatenate([nimg, tmp], axis=1)

    # Take the maximum of each non-overlapping 2x2 window
    for i in range(output.shape[0]):
        for j in range(output.shape[1]):
            a = nimg[2*i:2*i+2, 2*j:2*j+2]
            output[i, j] = a.max()

    return output

# Feature map for a single filter (assumes a 2D input and a 3x3 filter, stride 1, no padding)
def featuring(nimg, filters):
    feature = np.zeros((nimg.shape[0] - 2, nimg.shape[1] - 2))
    for i in range(feature.shape[0]):
        for j in range(feature.shape[1]):
            a = nimg[i:i+3, j:j+3]
            feature[i, j] = conv(a, filters)
    return feature

# MaxPooling applied to several maps
def Pooling(nimg):
    nimg = np.array(nimg)
    pool0 = []
    for i in range(len(nimg)):
        pool0.append(MaxPooling(nimg[i]))
    return pool0

# Convert arrays to PIL images (values are rounded and cast to uint8)
def to_img(nimg):
    nimg = np.array(nimg)
    nimg = np.uint8(np.round(nimg))
    fimg = []
    for i in range(len(nimg)):
        fimg.append(Image.fromarray(nimg[i]))
    return fimg

# Build feature maps for several filters
def ConvD(nimg, filters):
    nimg = np.array(nimg)
    feat0 = []
    for i in range(len(filters)):
        feat0.append(featuring(nimg, filters[i]))
    return feat0

# ReLU activation function: negative values become 0
def ReLU(fo):
    fo = np.array(fo)
    return np.maximum(fo, 0)

# CNN layer: Conv + ReLU + MaxPooling
def ConvMax(nimg, filters):
    nimg = np.array(nimg)
    f0 = ConvD(nimg, filters)
    f0 = ReLU(f0)
    fg = Pooling(f0)
    return f0, fg

# Plotting: show the maps after convolution (top row) and after MaxPooling (bottom row)
def draw(f0, fg0, size=(12, 8), k=-1):  # size and k have default values
    plt.figure(figsize=size)

    for i in range(len(f0)):
        plt.subplot(2, len(f0), i + 1)
        plt.gca().set_title('Conv' + str(k) + '-' + str(i))
        plt.imshow(f0[i])

    for i in range(len(fg0)):
        plt.subplot(2, len(fg0), len(f0) + i + 1)
        plt.gca().set_title('MaxP' + str(k) + '-' + str(i))
        plt.imshow(fg0[i])

    if k != -1:  # save the figure unless k == -1
        plt.savefig('conv' + str(k) + '.png')

# Merge the activation maps: stack the pooled maps into one (H, W, C) array
def join(mm):
    mm = np.array(mm)
    # Move the map index from axis 0 to the last axis: (C, H, W) -> (H, W, C)
    return np.transpose(mm, (1, 2, 0)).astype(float)

# Run the CNN layer and plot the results
def ConvDraw(p0, filters, size=(12, 8), k=-1):
    f0, fg0 = ConvMax(p0, filters)
    f0_img = to_img(f0)
    fg1_img = to_img(fg0)
    draw(f0, fg0, size, k)
    p1 = join(fg0)
    return p1

# Test run
nimg31 = np.random.rand(10, 10)
filters = [np.ones((3, 3))] * 3

m0 = ConvDraw(nimg31, filters, (12, 10), 0)

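✅ Result : CNN Layer implementation

[Figure: conv0.png, showing the three Conv0-* feature maps in the top row and the three MaxP0-* pooled maps in the bottom row]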


🍓 Raspberry Pi environment setup


✅ Installation and image setup

  • sudo apt install rpi-imager : installs the Raspberry Pi imaging tool
  • rpi-imager : launches the GUI, which can download an OS image and write it to storage

⚙️ Configuration

  • Operating system: Raspberry Pi OS (64-bit)
  • Storage: Mass Storage Device - 62.5 GB