In this post, we look at hyperparameters, one of the key factors when training a machine learning model.
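What Is a Hyperparameter?
A hyperparameter is a value that must be set before a machine learning model is trained. Unlike model parameters such as weights, it is not learned from the data during training.
Hyperparameters matter because they strongly affect both the model's performance and its training speed.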
Hyperparameter Examples
- Maximum depth of a decision tree: determines how deep the tree may grow, which controls the model's complexity.
- Kernel type of an SVM: the kernel used by a Support Vector Machine defines how the data is transformed.
- Learning rate of a neural network: the learning rate used when updating the weights affects the model's convergence speed and the quality of training.
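As a rough illustration of the three examples above, the sketch below sets each hyperparameter through a scikit-learn constructor. The specific values (`3`, `"rbf"`, `0.01`) are arbitrary choices for illustration, not recommendations.

```python
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hyperparameters are passed to the constructor, before any call to fit().
# The values below are illustrative assumptions only.
tree = DecisionTreeClassifier(max_depth=3)       # maximum depth of the tree
svm = SVC(kernel="rbf")                          # kernel type
mlp = MLPClassifier(learning_rate_init=0.01)     # initial learning rate

# get_params() returns the hyperparameters currently set on an estimator.
print(tree.get_params()["max_depth"])
print(svm.get_params()["kernel"])
print(mlp.get_params()["learning_rate_init"])
```

None of these values change when `fit()` is called. Only the model's internal parameters (tree splits, support vectors, network weights) are learned from the data.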
Hyperparameter Tuning
Hyperparameter tuning is the process of finding the hyperparameter values that optimize a model's performance.
It plays an important role in maximizing predictive performance, preventing overfitting, and optimizing training time.
Goals of Hyperparameter Tuning
- Optimizing model performance: choosing appropriate hyperparameters maximizes the model's predictive performance.
- Preventing overfitting: setting hyperparameters appropriately keeps the model from overfitting the training data, which helps improve its generalization performance.
- Optimizing computational cost: efficient hyperparameter settings can reduce the time needed to train the model.
Hyperparameter Tuning and Model Performance
- A model's training speed and performance can vary greatly with its hyperparameter values. For example, a learning rate that is too low slows training down, while one that is too high can keep the model from reaching the optimal solution.
- Poorly chosen hyperparameters can cause the model to overfit or underfit.
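The learning-rate effect described above can be seen on a toy problem. The sketch below is an illustrative assumption, not part of the original example: it minimizes f(w) = w² with plain gradient descent and compares three learning rates.

```python
def gradient_descent(learning_rate, steps=50, start=5.0):
    """Minimize f(w) = w**2 with plain gradient descent and return the final w."""
    w = start
    for _ in range(steps):
        gradient = 2 * w                      # derivative of w**2
        w = w - learning_rate * gradient      # standard update rule
    return w


# Too low, reasonable, and too high learning rates (values chosen for illustration).
results = {lr: gradient_descent(lr) for lr in (0.001, 0.1, 1.1)}
for lr, w in results.items():
    print(f"learning_rate={lr}: final w = {w:.4g}")
```

The minimum is at w = 0. With the lowest rate, w has barely moved from its starting point after 50 steps. With the middle rate it ends up very close to 0. With the highest rate each step overshoots, so the magnitude of w grows instead of shrinking.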
Hyperparameter Tuning Methods
- Grid Search
- Random Search
Grid Search
Grid search finds the best combination by evaluating every possible combination of the candidate hyperparameter values.
How Grid Search Works
- Define the hyperparameter space
  - Define the range of values to search for each hyperparameter.
- Search the combinations
  - Train and evaluate the model for every possible hyperparameter combination.
- Select the best combination
  - Select the hyperparameter combination with the best evaluation score.
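The three steps above can be sketched as a hand-written loop. This is a minimal illustration, assuming an RBF-kernel `SVC` on the Iris dataset with 5-fold cross-validation. The candidate values in `search_space` are arbitrary, and scikit-learn's `GridSearchCV` handles the same loop in practice.

```python
from itertools import product

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Step 1: define the search space (illustrative values only).
search_space = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}
names = list(search_space)

best_score, best_params = -1.0, None
# Step 2: train and evaluate the model for every combination in the grid.
for values in product(*search_space.values()):
    params = dict(zip(names, values))
    score = cross_val_score(SVC(kernel="rbf", **params), X, y, cv=5).mean()
    # Step 3: keep the combination with the best mean cross-validation score.
    if score > best_score:
        best_score, best_params = score, params

print(best_params, round(best_score, 3))
```

`itertools.product` enumerates the Cartesian product of the candidate lists, which is exactly the set of grid points.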
Advantages of Grid Search
- Because every combination is evaluated, it is likely to find the best hyperparameters among the candidate values.
Disadvantages of Grid Search
- It is computationally expensive and time-consuming.
- In particular, when there are many hyperparameters or the value ranges are wide, the search time can grow exponentially.
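To make the growth concrete, the sketch below counts the grid points and model fits for a full grid. The helper `grid_size` and the choice of 5 candidate values per hyperparameter are assumptions made for illustration.

```python
from math import prod


def grid_size(values_per_param):
    """Number of combinations in a full grid, given the candidate count per hyperparameter."""
    return prod(values_per_param)


cv_folds = 5  # each combination is trained once per cross-validation fold
for n_params in (1, 2, 4, 6):
    combos = grid_size([5] * n_params)   # 5 candidate values per hyperparameter
    print(f"{n_params} hyperparameters: {combos} combinations, {combos * cv_folds} fits")
```

The same arithmetic applies to the grid in this post's example code: 4 values of `C`, 4 values of `gamma`, and 1 kernel give 4 × 4 × 1 = 16 candidates, or 80 fits with 5-fold cross-validation.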
Random Search
Random search explores the hyperparameter space by selecting combinations at random.
How Random Search Works
- Define the hyperparameter space
  - Define the range of values to search for each hyperparameter.
- Select random combinations
  - Select combinations at random from the defined hyperparameter space.
- Select the best combination
  - Among the randomly selected combinations, select the one with the best evaluation score.
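The steps above can be sketched by hand as follows. This is a minimal illustration, assuming an RBF-kernel `SVC` on the Iris dataset. The ranges, the log-uniform sampling, and the budget of 10 trials are all arbitrary choices, and `sample_params` is a hypothetical helper.

```python
import random

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
rng = random.Random(0)  # fixed seed so the sampled combinations are reproducible


def sample_params():
    """Draw one random combination; uniform exponents give a log-uniform spread."""
    return {"C": 10 ** rng.uniform(-1, 2), "gamma": 10 ** rng.uniform(-3, 0)}


n_trials = 10  # evaluation budget, independent of the number of hyperparameters
trials = []
for _ in range(n_trials):
    params = sample_params()
    score = cross_val_score(SVC(kernel="rbf", **params), X, y, cv=5).mean()
    trials.append((score, params))

# Select the sampled combination with the best mean cross-validation score.
best_score, best_params = max(trials, key=lambda trial: trial[0])
print(best_params, round(best_score, 3))
```

Unlike a grid, the number of evaluations here is set directly by `n_trials`, so adding another hyperparameter does not multiply the cost.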
Advantages of Random Search
- It costs less computation than grid search and searches faster. It is especially useful when there are many hyperparameters.
Disadvantages of Random Search
- Because combinations are chosen at random, it may fail to find the best hyperparameters.
- However, with an appropriately chosen search range, it can in practice achieve performance similar to grid search.
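For comparison with grid search, here is a hedged sketch of random search using scikit-learn's `RandomizedSearchCV`. The log-uniform ranges for `C` and `gamma`, the `n_iter=10` budget, and the train/test split are assumptions chosen for illustration.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Distributions to sample from, instead of fixed lists of candidate values.
param_distributions = {
    "C": loguniform(1e-1, 1e2),
    "gamma": loguniform(1e-3, 1e0),
    "kernel": ["rbf"],
}

# n_iter sets how many random combinations are evaluated.
search = RandomizedSearchCV(
    SVC(), param_distributions, n_iter=10, cv=5, random_state=42
)
search.fit(X_train, y_train)

print(f"Best Parameters: {search.best_params_}")
print(f"Test Accuracy: {search.score(X_test, y_test)}")
```

With `n_iter=10` and 5 folds, this runs 50 fits regardless of how wide the ranges are. A full grid's cost grows with every added candidate value.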
Hyperparameter Tuning - Grid Search Example Code
# Import the required libraries
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
import matplotlib.pyplot as plt
import seaborn as sns
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Split into training and test data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Create the SVM model
svc = SVC()
# Define the hyperparameter grid
param_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': [1, 0.1, 0.01, 0.001],
    'kernel': ['rbf']
}
# Grid Search (refit=True retrains the best combination on the full training set)
grid = GridSearchCV(svc, param_grid, refit=True, verbose=2)
grid.fit(X_train, y_train)
# Print the best hyperparameters
print(f'Best Parameters: {grid.best_params_}')
# Predict on the test data
y_pred = grid.predict(X_test)
# Compute the accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
# Visualize the confusion matrix
cm = confusion_matrix(y_test, y_pred)
plt.figure(figsize=(10, 7))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=iris.target_names, yticklabels=iris.target_names)
plt.xlabel('Predicted')
plt.ylabel('True')
plt.title('Confusion Matrix')
plt.show()
# Print the classification report
print(classification_report(y_test, y_pred, target_names=iris.target_names))
Fitting 5 folds for each of 16 candidates, totalling 80 fits
[CV] END .........................C=0.1, gamma=1, kernel=rbf; total time= 0.0s
[CV] END .........................C=0.1, gamma=1, kernel=rbf; total time= 0.0s
[CV] END .........................C=0.1, gamma=1, kernel=rbf; total time= 0.0s
[CV] END .........................C=0.1, gamma=1, kernel=rbf; total time= 0.0s
[CV] END .........................C=0.1, gamma=1, kernel=rbf; total time= 0.0s
[CV] END .......................C=0.1, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END .......................C=0.1, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END .......................C=0.1, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END .......................C=0.1, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END .......................C=0.1, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END ......................C=0.1, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END ......................C=0.1, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END ......................C=0.1, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END ......................C=0.1, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END ......................C=0.1, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END .....................C=0.1, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END .....................C=0.1, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END .....................C=0.1, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END .....................C=0.1, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END .....................C=0.1, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END ...........................C=1, gamma=1, kernel=rbf; total time= 0.0s
[CV] END ...........................C=1, gamma=1, kernel=rbf; total time= 0.0s
[CV] END ...........................C=1, gamma=1, kernel=rbf; total time= 0.0s
[CV] END ...........................C=1, gamma=1, kernel=rbf; total time= 0.0s
[CV] END ...........................C=1, gamma=1, kernel=rbf; total time= 0.0s
[CV] END .........................C=1, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END .........................C=1, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END .........................C=1, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END .........................C=1, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END .........................C=1, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END ........................C=1, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END ........................C=1, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END ........................C=1, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END ........................C=1, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END ........................C=1, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END .......................C=1, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END .......................C=1, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END .......................C=1, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END .......................C=1, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END .......................C=1, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END ..........................C=10, gamma=1, kernel=rbf; total time= 0.0s
[CV] END ..........................C=10, gamma=1, kernel=rbf; total time= 0.0s
[CV] END ..........................C=10, gamma=1, kernel=rbf; total time= 0.0s
[CV] END ..........................C=10, gamma=1, kernel=rbf; total time= 0.0s
[CV] END ..........................C=10, gamma=1, kernel=rbf; total time= 0.0s
[CV] END ........................C=10, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END ........................C=10, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END ........................C=10, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END ........................C=10, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END ........................C=10, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END .......................C=10, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END .......................C=10, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END .......................C=10, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END .......................C=10, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END .......................C=10, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END ......................C=10, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END ......................C=10, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END ......................C=10, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END ......................C=10, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END ......................C=10, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END .........................C=100, gamma=1, kernel=rbf; total time= 0.0s
[CV] END .........................C=100, gamma=1, kernel=rbf; total time= 0.0s
[CV] END .........................C=100, gamma=1, kernel=rbf; total time= 0.0s
[CV] END .........................C=100, gamma=1, kernel=rbf; total time= 0.0s
[CV] END .........................C=100, gamma=1, kernel=rbf; total time= 0.0s
[CV] END .......................C=100, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END .......................C=100, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END .......................C=100, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END .......................C=100, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END .......................C=100, gamma=0.1, kernel=rbf; total time= 0.0s
[CV] END ......................C=100, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END ......................C=100, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END ......................C=100, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END ......................C=100, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END ......................C=100, gamma=0.01, kernel=rbf; total time= 0.0s
[CV] END .....................C=100, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END .....................C=100, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END .....................C=100, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END .....................C=100, gamma=0.001, kernel=rbf; total time= 0.0s
[CV] END .....................C=100, gamma=0.001, kernel=rbf; total time= 0.0s
Best Parameters: {'C': 100, 'gamma': 0.01, 'kernel': 'rbf'}
Accuracy: 1.0
              precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        19
  versicolor       1.00      1.00      1.00        13
   virginica       1.00      1.00      1.00        13

    accuracy                           1.00        45
   macro avg       1.00      1.00      1.00        45
weighted avg       1.00      1.00      1.00        45