Day4 : Deep Learning

🧠 Deep Learning Overview

Deep learning is a machine learning technique built on artificial neural networks (ANNs), which are modeled on the human brain. Its multi-layer networks are used to learn complex patterns and make predictions.


๐Ÿ” ์ธ๊ณต์‹ ๊ฒฝ๋ง(ANN)์˜ ๊ฐœ๋…

  • ์ƒ๋ฌผํ•™์  ๋‰ด๋Ÿฐ ๊ตฌ์กฐ์—์„œ ์ฐฉ์•ˆ.
  • ์ž…๋ ฅ โ†’ ๊ฐ€์ค‘์น˜ โ†’ ํ™œ์„ฑํ™” ํ•จ์ˆ˜ โ†’ ์ถœ๋ ฅ ํ๋ฆ„์œผ๋กœ ๋™์ž‘.
  • ๊ฐ ์‹ ํ˜ธ์˜ ๊ฐ•๋„๋Š” ๊ฐ€์ค‘์น˜(Weight)๋กœ ํ‘œํ˜„๋จ.

๐Ÿงฌ ๋”ฅ๋Ÿฌ๋‹(Deep Learning)์ด๋ž€?

  • ์€๋‹‰์ธต์ด ์—ฌ๋Ÿฌ ๊ฐœ์ธ ์‹ฌ์ธต ์‹ ๊ฒฝ๋ง(Deep Neural Network, DNN)์„ ํ†ตํ•ด ํ•™์Šตํ•˜๋Š” ๋ฐฉ์‹.
  • ์‹ฌ์ธต ํ•™์Šต(Deep Learning)์ด๋ผ๊ณ ๋„ ํ•จ.

๐Ÿ› ๏ธ ์‹ ๊ฒฝ๋ง ๊ตฌ์„ฑ ์š”์†Œ

๊ตฌ์„ฑ ์š”์†Œ ์„ค๋ช…
์ž…๋ ฅ์ธต ํ•™์Šต ๋ฐ์ดํ„ฐ๊ฐ€ ๋“ค์–ด์˜ค๋Š” ์ธต
์€๋‹‰์ธต ๊ฐ€์ค‘ํ•ฉ ๊ณ„์‚ฐ ๋ฐ ๋น„์„ ํ˜• ๋ณ€ํ™˜ ์ˆ˜ํ–‰
์ถœ๋ ฅ์ธต ์ตœ์ข… ์˜ˆ์ธก๊ฐ’์„ ์ถœ๋ ฅ
๊ฐ€์ค‘์น˜ ์ž…๋ ฅ์˜ ์ค‘์š”๋„๋ฅผ ๊ฒฐ์ •
ํŽธํ–ฅ ๊ฐ€์ค‘ํ•ฉ์— ๋”ํ•ด์ง€๋Š” ์ƒ์ˆ˜๋กœ ์ถœ๋ ฅ ์กฐ์ ˆ

➕ Weighted Sum

  • Each input value × its weight, plus a bias.
  • Formula: z = w₁x₁ + w₂x₂ + … + b

โš™๏ธ ํ™œ์„ฑํ™” ํ•จ์ˆ˜ (Activation Function)

ํ•จ์ˆ˜๋ช… ํŠน์ง•
Sigmoid S์ž ํ˜•ํƒœ, ์ถœ๋ ฅ๊ฐ’ [0, 1], ๊ธฐ์šธ๊ธฐ ์†Œ์‹ค ๋ฌธ์ œ
Tanh ์ถœ๋ ฅ [-1, 1], ํ‰๊ท  0, sigmoid๋ณด๋‹ค ์šฐ์ˆ˜
ReLU 0 ์ดํ•˜ โ†’ 0, 0 ์ดˆ๊ณผ โ†’ ๊ทธ๋Œ€๋กœ ์ถœ๋ ฅ, ๋น ๋ฅธ ํ•™์Šต
LeakyReLU ReLU์˜ ์Œ์ˆ˜ ์ž…๋ ฅ ๋ฌด๋ฐ˜์‘ ๋ฌธ์ œ ํ•ด๊ฒฐ
Softmax ํ™•๋ฅ  ๋ถ„ํฌ ์ถœ๋ ฅ, ๋‹ค์ค‘ ํด๋ž˜์Šค ๋ถ„๋ฅ˜์— ์‚ฌ์šฉ
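Each function in the table is a one-liner in NumPy; a sketch for comparison:

```python
import numpy as np

def sigmoid(z):                 # S-shaped, outputs in [0, 1]
    return 1 / (1 + np.exp(-z))

def tanh(z):                    # outputs in [-1, 1], zero-centered
    return np.tanh(z)

def relu(z):                    # 0 for z <= 0, identity for z > 0
    return np.maximum(0, z)

def leaky_relu(z, alpha=0.01):  # small slope for negatives instead of 0
    return np.where(z > 0, z, alpha * z)

def softmax(z):                 # probability distribution over classes
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

z = np.array([-2.0, 0.0, 2.0])
print(relu(z))     # [0. 0. 2.]
print(softmax(z))  # three probabilities that sum to 1
```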

🧭 Training Flow

1️⃣ Forward Propagation

  • Input → hidden layers → output layer, producing a prediction.
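A minimal forward pass through one hidden layer, with randomly initialized weights purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(4)                         # input vector
W1, b1 = rng.random((3, 4)), np.zeros(3)  # hidden layer: 4 inputs → 3 units
W2, b2 = rng.random((2, 3)), np.zeros(2)  # output layer: 3 units → 2 classes

h = np.maximum(0, W1 @ x + b1)  # hidden activations (ReLU)
e = np.exp(W2 @ h + b2)
y = e / e.sum()                 # output probabilities (softmax)
print(y)                        # two class probabilities summing to 1
```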

2๏ธโƒฃ ์†์‹ค ํ•จ์ˆ˜ (Loss Function)

  • ์˜ˆ์ธก๊ฐ’๊ณผ ์‹ค์ œ๊ฐ’์˜ ์ฐจ์ด๋ฅผ ๊ณ„์‚ฐ
    • ํšŒ๊ท€: MSE
    • ๋ถ„๋ฅ˜: Cross Entropy
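Both losses are short NumPy expressions; a sketch with made-up predictions:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: the regression loss."""
    return np.mean((y_true - y_pred) ** 2)

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross entropy for a one-hot y_true (eps guards against log(0))."""
    return -np.sum(y_true * np.log(y_pred + eps))

print(mse(np.array([1.0, 2.0]), np.array([1.5, 2.0])))  # 0.125
print(cross_entropy(np.array([0, 1, 0]),
                    np.array([0.1, 0.8, 0.1])))         # ≈ 0.223 (= -ln 0.8)
```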

3๏ธโƒฃ ์˜ตํ‹ฐ๋งˆ์ด์ € (Optimizer)

  • ๊ฒฝ์‚ฌํ•˜๊ฐ•๋ฒ• ๊ธฐ๋ฐ˜์œผ๋กœ ๊ฐ€์ค‘์น˜ ์ตœ์ ํ™”
  • ์ „์ฒด/๋ฏธ๋‹ˆ ๋ฐฐ์น˜ ๋ฐฉ์‹ ์‚ฌ์šฉ
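The core of gradient descent, on a toy loss L(w) = (w − 3)² whose minimum is at w = 3 (the learning rate is an illustrative choice):

```python
w = 0.0    # initial weight
lr = 0.1   # learning rate

for _ in range(100):
    grad = 2 * (w - 3)  # dL/dw for L(w) = (w - 3)^2
    w -= lr * grad      # step against the gradient

print(w)  # converges toward 3
```

Optimizers like Adam used later in this note refine this same update with per-parameter adaptive step sizes.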

4๏ธโƒฃ ์—ญ์ „ํŒŒ (Backpropagation)

  • ์˜ค์ฐจ๋ฅผ ์—ญ๋ฐฉํ–ฅ ์ „ํŒŒํ•ด ๊ฐ€์ค‘์น˜ ์—…๋ฐ์ดํŠธ
  • ๊ฐ ์ธต์˜ ๊ฐ€์ค‘์น˜์— ๋Œ€ํ•ด ๋ฏธ๋ถ„๊ฐ’ ๊ธฐ๋ฐ˜ ๋ณด์ •

🧱 Types of Deep Learning Models

| Type | Description |
|------|-------------|
| DFN (feed-forward network) | Basic architecture; processes fixed-size inputs |
| RNN (recurrent network)    | Handles sequential data; carries past information forward |
| LSTM                       | Improved RNN; maintains long-term memory |
| CNN                        | Specialized for image analysis; uses convolution and pooling |

๐Ÿง  CNN์˜ ๊ตฌ์กฐ

  • ํ•ฉ์„ฑ๊ณฑ์ธต: ํ•„ํ„ฐ๋ฅผ ํ†ตํ•ด ํŠน์ง• ์ถ”์ถœ
  • ํ’€๋ง์ธต: ๋ฐ์ดํ„ฐ ํฌ๊ธฐ ์ถ•์†Œ, ํ•ต์‹ฌ ์ •๋ณด ๋ณด์กด
  • ์™„์ „์—ฐ๊ฒฐ์ธต: ์ตœ์ข… ๋ถ„๋ฅ˜ ์ˆ˜ํ–‰
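The convolution and pooling operations above can be illustrated in plain NumPy; this is a sketch of the math on a toy 6×6 "image" with a hand-written vertical-edge filter, not the tf.keras layer API:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D convolution (cross-correlation, as CNNs actually compute)."""
    kh, kw = kernel.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)  # filter response
    return out

def max_pool(img, size=2):
    """Non-overlapping max pooling: shrink the map, keep strongest responses."""
    h, w = img.shape[0] // size, img.shape[1] // size
    return img[:h*size, :w*size].reshape(h, size, w, size).max(axis=(1, 3))

img = np.arange(36, dtype=float).reshape(6, 6)  # toy 6×6 "image"
edge = np.array([[1., 0., -1.]] * 3)            # vertical-edge filter
feat = conv2d(img, edge)                        # 4×4 feature map
print(max_pool(feat).shape)                     # (2, 2)
```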

🔄 Comparison (DFN vs RNN vs CNN)

| Aspect     | DFN               | RNN                   | CNN            |
|------------|-------------------|-----------------------|----------------|
| Input      | Static            | Sequential            | Image/sequence |
| Key trait  | Feed-forward only | Recurrent connections | Local features |
| Training   | Easy              | Hard                  | Moderate       |
| Efficiency | Low               | Low                   | High           |

💬 Word Embedding

| Method | Description |
|--------|-------------|
| One-hot Encoding | Sparse vectors, simple structure |
| Word2Vec | Predicts a center word from its context (CBOW) or the context from a word (Skip-gram) |
| TF-IDF | Weights words by their importance |
| FastText | Subword-based; solves the out-of-vocabulary (OOV) problem |
| GloVe | Based on word co-occurrence statistics |
| ELMo | Contextual embedding: a word's vector changes with its context |
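One-hot encoding, the simplest method in the table, takes only a few lines (toy vocabulary for illustration):

```python
import numpy as np

vocab = ['cat', 'dog', 'fish']  # toy vocabulary
index = {word: i for i, word in enumerate(vocab)}

def one_hot(word):
    v = np.zeros(len(vocab))
    v[index[word]] = 1.0        # a single 1 at the word's index
    return v

print(one_hot('dog'))  # [0. 1. 0.]
```

Its sparsity (one dimension per vocabulary word, and no notion of similarity between words) is exactly what motivates the dense embeddings in the rest of the table.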

๐ŸŽจ ์ ๋Œ€์  ์ƒ์„ฑ ์‹ ๊ฒฝ๋ง (GAN)

  • ๋‘ ๋„คํŠธ์›Œํฌ๊ฐ€ ๊ฒฝ์Ÿ:
    • Generator: ์ง„์งœ ๊ฐ™์€ ๊ฐ€์งœ ๋ฐ์ดํ„ฐ ์ƒ์„ฑ
    • Discriminator: ์ง„์งœ์™€ ๊ฐ€์งœ ๊ตฌ๋ณ„
  • ์˜ˆ์ˆ , ์ด๋ฏธ์ง€ ์ƒ์„ฑ ๋“ฑ์—์„œ ๊ฐ•๋ ฅํ•œ ์„ฑ๋Šฅ

✅ Summary

  • Deep learning extends artificial neural networks and can be applied to a wide range of problems.
  • Performance hinges on the activation functions, the training algorithm, and the model architecture.
  • CNNs, RNNs, GANs, and word embeddings are the basis for choosing the right deep learning technique for a given real-world problem.

๐Ÿ› ๏ธ ์ž‘์—…ํ•  ๋””๋ ‰ํ† ๋ฆฌ ์ƒ์„ฑ ๋ฐ ํ™˜๊ฒฝ ์„ค์ •


# 1. ์ž‘์—… ๋””๋ ‰ํ† ๋ฆฌ ์ƒ์„ฑ
mkdir F_MNIST                  # ๋””๋ ‰ํ† ๋ฆฌ ์ด๋ฆ„: F_MNIST
cd F_MNIST                     # ํ•ด๋‹น ๋””๋ ‰ํ† ๋ฆฌ๋กœ ์ด๋™

# 2. ๊ฐ€์ƒ ํ™˜๊ฒฝ ์ƒ์„ฑ ๋ฐ ํ™œ์„ฑํ™”
python3 -m venv .fmnist        # ๊ฐ€์ƒ ํ™˜๊ฒฝ ์ƒ์„ฑ (ํด๋” ์ด๋ฆ„: .fmnist)
source .fmnist/bin/activate    # ๊ฐ€์ƒ ํ™˜๊ฒฝ ํ™œ์„ฑํ™”

# 3. ํŒจํ‚ค์ง€ ์„ค์น˜
pip install -U pip             # pip ์ตœ์‹  ๋ฒ„์ „์œผ๋กœ ์—…๊ทธ๋ ˆ์ด๋“œ
pip install tensorflow         # TensorFlow (๋”ฅ๋Ÿฌ๋‹ ํ”„๋ ˆ์ž„์›Œํฌ)
pip install matplotlib         # Matplotlib (์‹œ๊ฐํ™” ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ)
pip install PyQt5              # PyQt5 (Matplotlib GUI ๋ฐฑ์—”๋“œ์šฉ)
pip install scikit_learn       # scikit-learn (๋จธ์‹ ๋Ÿฌ๋‹ ๋ฐ ํ‰๊ฐ€ ๋„๊ตฌ)

# 4. Qt GUI ๋ฐฑ์—”๋“œ ์„ค์ • (Wayland ํ™˜๊ฒฝ์—์„œ ํ•„์ˆ˜)
export QT_QPA_PLATFORM=wayland # Qt GUI๋ฅผ Wayland์—์„œ ์ •์ƒ ๋™์ž‘ํ•˜๊ฒŒ ์„ค์ •

๐Ÿ‘จโ€๐Ÿ’ป ์‹ค์Šต


๐Ÿ’ก Code : Fashion MNIST

import tensorflow as tf
from tensorflow import keras

import numpy as np
import matplotlib
import matplotlib.pyplot as plt

# load the Fashion MNIST dataset
fashion_mnist = keras.datasets.fashion_mnist

# split the data into train / test sets
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

print(train_images.shape)
print(train_labels.shape)
print(test_images.shape)
print(test_labels.shape)

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
               
matplotlib.use('Qt5Agg')
NUM=20
plt.figure(figsize=(15,15))
plt.subplots_adjust(hspace=1)
for idx in range(NUM):
    sp = plt.subplot(5,5,idx+1)
    plt.imshow(train_images[idx])
    plt.title(f'{class_names[train_labels[idx]]}')
plt.show()

plt.figure()
plt.imshow(train_images[0])
plt.colorbar()
plt.grid(False)
plt.show()

# simple image preprocessing (for the ANN): scale pixel values to [0, 1]
train_images = train_images / 255.0
test_images = test_images / 255.0


plt.figure(figsize=(10,8))
for i in range(20):
    plt.subplot(4,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[train_labels[i]])
plt.show()

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28,28)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10, activation='softmax'),
])

model.summary()

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_images, train_labels, epochs=20)

predictions = model.predict(test_images)

print(predictions[0])             # class probabilities for the first test image
print(np.argmax(predictions[0]))  # predicted class index
print(test_labels[0])             # true class index

def plot_image(i, predictions_array, true_label, img):
  predictions_array, true_label, img = predictions_array[i], true_label[i], img[i]
  plt.grid(False)
  plt.xticks([])
  plt.yticks([])

  plt.imshow(img, cmap=plt.cm.binary)

  predicted_label = np.argmax(predictions_array)
  if predicted_label == true_label:
    color = 'blue'
  else:
    color = 'red'

  plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label],
                                100*np.max(predictions_array),
                                class_names[true_label]),
                                color=color)

def plot_value_array(i, predictions_array, true_label):
  predictions_array, true_label = predictions_array[i], true_label[i]
  plt.grid(False)
  plt.xticks([])
  plt.yticks([])
  thisplot = plt.bar(range(10), predictions_array, color="#777777")
  plt.ylim([0, 1])
  predicted_label = np.argmax(predictions_array)

  thisplot[predicted_label].set_color('red')
  thisplot[true_label].set_color('blue')

num_rows = 5
num_cols = 3
num_images = num_rows*num_cols
plt.figure(figsize=(2*2*num_cols, 2*num_rows))
for i in range(num_images):
  plt.subplot(num_rows, 2*num_cols, 2*i+1)
  plot_image(i, predictions, test_labels, test_images)
  plt.subplot(num_rows, 2*num_cols, 2*i+2)
  plot_value_array(i, predictions, test_labels)
plt.show()

from sklearn.metrics import accuracy_score
print('accuracy score : ', accuracy_score(test_labels, np.argmax(predictions, axis=-1)))