[Hongong Machine Learning] Logistic Regression

Logistic Regression

Lucky Bag Probabilities

Because the k-nearest neighbors algorithm finds the samples closest to a new sample, the share of each class among those neighbors can be reported as a probability.

  • Suppose the 10 samples nearest to sample X are marked: 5 triangles, 3 squares, and 2 circles.
  • Treating the neighbors' class shares as probabilities, sample X is a triangle with probability 50%, a square with probability 30%, and a circle with probability 20%.
  • Scikit-learn's k-nearest neighbors classifier computes class probabilities in the same way. Let's load some data and try it.
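The neighbor-share idea above can be sketched in a few lines of plain Python. The label list is a hypothetical stand-in for the 10 neighbors in the example, not output from scikit-learn.

```python
from collections import Counter

# Hypothetical neighbor labels matching the example above:
# 5 triangles, 3 squares, 2 circles around sample X.
neighbors = ["triangle"] * 5 + ["square"] * 3 + ["circle"] * 2

counts = Counter(neighbors)
# Class probability = share of the k neighbors that belong to the class.
proba = {label: count / len(neighbors) for label, count in counts.items()}
print(proba)
```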

Preparing the Data

import pandas as pd
fish = pd.read_csv('https://bit.ly/fish_csv_data')
fish.head()
# Species (7 fish species) is the target; the other columns are the input features (fish_input)
print(pd.unique(fish['Species']))
['Bream' 'Roach' 'Whitefish' 'Parkki' 'Perch' 'Pike' 'Smelt']
  • From this DataFrame, the Species column becomes the target and the remaining 5 columns are used as input data.
fish_input = fish[['Weight','Length','Diagonal','Height','Width']].to_numpy()
print(fish_input[:5])
[[242. 25.4 30. 11.52 4.02 ]
 [290. 26.3 31.2 12.48 4.3056]
 [340. 26.5 31.1 12.3778 4.6961]
 [363. 29. 33.5 12.73 4.4555]
 [430. 29. 34. 12.444 5.134 ]]
  • Next, create the target data.
# Species (7 fish species) is the target
fish_target = fish['Species'].to_numpy()
  • As covered earlier, machine learning needs two datasets by default: a training set and a test set.
from sklearn.model_selection import train_test_split
train_input, test_input, train_target, test_target = train_test_split(fish_input, fish_target, random_state=42)
  • Next, standardize the training and test sets with scikit-learn's StandardScaler class.
  • Note that the test set must be transformed with the statistics computed from the training set.
from sklearn.preprocessing import StandardScaler
ss = StandardScaler()
ss.fit(train_input)
train_scaled = ss.transform(train_input)
test_scaled = ss.transform(test_input)
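To make the "training statistics only" rule concrete, here is a minimal plain-Python sketch of the same standardization for a single feature. The numbers and variable names are hypothetical, and the population standard deviation is used because that is what StandardScaler computes.

```python
from statistics import fmean, pstdev

# Hypothetical one-feature data (e.g. weights in grams); not read from the fish CSV.
train_weights = [242.0, 290.0, 340.0, 363.0, 430.0]
test_weights = [300.0, 500.0]

# The statistics come from the training set only.
mean = fmean(train_weights)
std = pstdev(train_weights)  # population std (ddof=0)

train_std = [(x - mean) / std for x in train_weights]
test_std = [(x - mean) / std for x in test_weights]  # reuse the training mean/std

print(round(fmean(train_std), 6), round(pstdev(train_std), 6))
print([round(x, 3) for x in test_std])
```

After scaling, the training values have mean 0 and standard deviation 1 by construction, while the test values are simply placed on the same scale.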

Multiclass Classification with K-Nearest Neighbors

Create a scikit-learn KNeighborsClassifier object, train the model on the training set, and check the scores on the training and test sets. The number of nearest neighbors, k, is set to 3.
from sklearn.neighbors import KNeighborsClassifier
kn = KNeighborsClassifier(n_neighbors=3)
kn.fit(train_scaled, train_target)
print(kn.score(train_scaled, train_target))
print(kn.score(test_scaled, test_target))
0.8907563025210085
0.85
  • One point is worth noting here. Because the target data was built from fish['Species'], both the training and test sets contain 7 fish species. A problem whose target contains three or more classes like this is called multiclass classification.
  • Let's print the classes the model learned.
# Inspect the attribute; the trailing underscore marks an attribute learned from the data
print(kn.classes_)
['Bream' 'Parkki' 'Perch' 'Pike' 'Roach' 'Smelt' 'Whitefish']
  • 'Bream' is the first class, 'Parkki' the second, and so on.
  • The predict() method returns its predictions as the original target values (here, the species names). Let's make some predictions.
print(kn.predict(test_scaled[:5]))
['Perch' 'Smelt' 'Pike' 'Perch' 'Perch']
  • With what probabilities were these 5 predictions made?
  • Scikit-learn classifiers return per-class probabilities from the predict_proba() method.
  • Let's print the probabilities for the first 5 samples in the test set.
  • NumPy's round() function rounds to the nearest integer by default; the decimals parameter sets the number of decimal places to keep.
# Use the predict_proba method to get probabilities
# 5 samples, 7 fish species
import numpy as np
proba = kn.predict_proba(test_scaled[:5])
print(np.round(proba, decimals=4))
[[0.     0.     1.     0.     0.     0.     0.    ]
 [0.     0.     0.     0.     0.     1.     0.    ]
 [0.     0.     0.     1.     0.     0.     0.    ]
 [0.     0.     0.6667 0.     0.3333 0.     0.    ]
 [0.     0.     0.6667 0.     0.3333 0.     0.    ]]
  • The columns of the predict_proba() output follow the order of the classes_ attribute shown earlier.
  • That is, the first column is the probability of 'Bream', the second the probability of 'Parkki', and so on.

  • Let's check that the probabilities the model computed really come from the nearest neighbors, by looking at the classes of the fourth sample's nearest neighbors.
distances, indexes = kn.kneighbors(test_scaled[3:4])
print(train_target[indexes])
[['Roach' 'Perch' 'Perch']]
  • This sample's neighbors are one 'Roach' (the fifth class) and two 'Perch' (the third class).
  • So the probability of the fifth class is 1/3 = 0.3333 and the probability of the third class is 2/3 = 0.6667.
  • These match the class probabilities printed above for the fourth sample.
  • Scikit-learn does this tedious calculation for us, so calling predict_proba() is all that is needed.
  • However, with k = 3 the only probabilities k-nearest neighbors can produce are 0/3, 1/3, 2/3, and 3/3. That is rather coarse for a probability.
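The calculation for the fourth sample can be reproduced by hand. The helper below, knn_proba, is a hypothetical name for illustration and assumes every neighbor counts equally (uniform weights); it is a sketch of the idea, not scikit-learn's implementation.

```python
def knn_proba(neighbor_labels, classes):
    """Return one probability per class: the fraction of neighbors with that label."""
    k = len(neighbor_labels)
    return [neighbor_labels.count(c) / k for c in classes]

classes = ["Bream", "Parkki", "Perch", "Pike", "Roach", "Smelt", "Whitefish"]
neighbors = ["Roach", "Perch", "Perch"]  # neighbors of the fourth test sample

proba = knn_proba(neighbors, classes)
print([round(p, 4) for p in proba])
# With k = 3, every entry can only be 0/3, 1/3, 2/3, or 3/3.
```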

Logistic Regression

Logistic regression has "regression" in its name, but it is a classification model. Like linear regression, it learns a linear equation.

Example linear equation for logistic regression:

z = a × Weight + b × Length + c × Diagonal + d × Height + e × Width + f

  • Here a, b, c, d, and e are the weights (coefficients) and f is the intercept. There are more features, but this is the same kind of linear equation used for multiple regression.
  • z can take any value. To represent a probability, however, the value must lie between 0 and 1 (0% to 100%).
  • So is there a way to map z to 0 when it is a large negative number and to 1 when it is a large positive number?
The sigmoid function (also called the logistic function) does exactly this.

  • The sigmoid function takes the z from the linear equation above, negates it, raises the natural constant e to that power, adds 1, and takes the reciprocal: phi = 1 / (1 + e^(-z)).
  • As z becomes a very large negative number the output approaches 0, and as z becomes a very large positive number it approaches 1. When z is 0 the output is 0.5.
  • Because the sigmoid function can never leave the range 0 to 1, its output can be read as a probability from 0% to 100%.
  • Let's plot it with NumPy and Matplotlib: create an array z from -5 to 5 in steps of 0.1, then compute the sigmoid function at each z.
import numpy as np
import matplotlib.pyplot as plt
z = np.arange(-5, 5, 0.1)
phi = 1 / (1+ np.exp(-z))
plt.plot(z, phi)
plt.xlabel('z')
plt.ylabel('phi')
plt.show()
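The limiting behaviour of the sigmoid function can also be checked numerically without a plot. This is a small sketch using only the standard library; the function name sigmoid is chosen here for illustration.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

# A large negative z gives a value near 0, z = 0 gives 0.5, a large positive z gives a value near 1.
for z in (-5.0, 0.0, 5.0):
    print(z, round(sigmoid(z), 4))
```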

  • The plot confirms that the output of the sigmoid function varies between 0 and 1.
  • Now let's train a logistic regression model, using scikit-learn's LogisticRegression class.
  • We start with binary classification: a sigmoid output above 0.5 is treated as the positive class, otherwise as the negative class. To pick out two classes, we first try boolean indexing on a small array.
# Boolean indexing, used below to set up binary classification with logistic regression
char_arr = np.array(['A','B','C','D','E'])
print(char_arr[[True, False, True, False, False]])
['A' 'C']
  • That works as expected. Using the same technique, we select only the bream and smelt rows with comparison operators.
bream_smelt_indexes = (train_target == 'Bream') | (train_target == 'Smelt')
train_bream_smelt = train_scaled[bream_smelt_indexes]
target_bream_smelt = train_target[bream_smelt_indexes]
  • Comparison operators mark the bream and smelt rows as True.
  • Specifically, (train_target == 'Bream') | (train_target == 'Smelt') returns an array that is True where the target is Bream or Smelt and False elsewhere. The OR operator | is required, and each comparison must be parenthesized because | binds more tightly than ==.
  • As the code shows, the bream_smelt_indexes array therefore holds True for bream and smelt and False for everything else.
  • Applying this array as a boolean index to train_scaled and train_target selects just the bream and smelt data.
  • Now train a logistic regression model on this data. Because LogisticRegression is a linear model, the class lives in the sklearn.linear_model package.
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression()
lr.fit(train_bream_smelt, target_bream_smelt)
  • Use the trained model to predict the first 5 samples in train_bream_smelt.
print(lr.predict(train_bream_smelt[:5]))
['Bream' 'Smelt' 'Bream' 'Bream' 'Bream']
  • Every sample except the second is predicted as bream.
  • As with KNeighborsClassifier, the predicted probabilities are available from the predict_proba() method.
  • Let's print the predicted probabilities for the first 5 samples in train_bream_smelt.
# Left column: negative class, right column: positive class; the predictions were 'Bream' 'Smelt' 'Bream' 'Bream' 'Bream'
print(lr.predict_proba(train_bream_smelt[:5]))
[[0.99759855 0.00240145]
 [0.02735183 0.97264817]
 [0.99486072 0.00513928]
 [0.98584202 0.01415798]
 [0.99767269 0.00232731]]
Two probabilities are printed for each sample. The left column is the probability of the negative class (0) and the right column the probability of the positive class (1).
  • So which of Bream and Smelt is the positive class?
  • As with the k-nearest neighbors classifier, scikit-learn sorts the target values alphabetically. The classes_ attribute shows the order.
print(lr.classes_)
['Bream' 'Smelt']
  • Smelt is the positive class. In the array returned by predict_proba(), only the second sample has a high probability for smelt, the positive class. All the others are predicted as bream.
  • Now that logistic regression has performed binary classification, let's inspect the coefficients it learned, as we did for linear regression.
# z = -0.404 x Weight - 0.576 x Length - 0.663 x Diagonal - 1.013 x Height - 0.732 x Width - 2.161
print(lr.coef_, lr.intercept_)
[[-0.4037798 -0.57620209 -0.66280298 -1.01290277 -0.73168947]] [-2.16155132]
  • The equation learned by the logistic regression model is:

z = -0.404 × Weight - 0.576 × Length - 0.663 × Diagonal - 1.013 × Height - 0.732 × Width - 2.161

The equation shows that logistic regression closely resembles linear regression. Can the LogisticRegression model compute the z values for us?
  • The LogisticRegression class exposes z through the decision_function() method.
  • Let's print the z values for the first 5 samples in train_bream_smelt.
decisions = lr.decision_function(train_bream_smelt[:5])
print(decisions)
[-6.02927744 3.57123907 -5.26568906 -4.24321775 -6.0607117 ]
  • Passing these z values through the sigmoid function gives the probabilities.
  • Python's SciPy library also provides the sigmoid function, as the function expit().
  • It is more convenient and safer than writing the fraction yourself with np.exp(). Let's convert the values in the decisions array to probabilities.
# Only the z value of the positive class is computed: a binary model has a single linear function
from scipy.special import expit
print(expit(decisions))
[0.00240145 0.97264817 0.00513928 0.01415798 0.00232731]
  • The printed values are identical to the second column of the predict_proba() output (for example, 0.97264817 for the second sample).
  • In other words, decision_function() returns the z value for the positive class.
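The same check can be done with the standard library alone. This sketch hard-codes the z values printed by decision_function() above and applies the sigmoid formula directly; the negative-class probability is simply the remainder.

```python
import math

# z values printed by decision_function() above
decisions = [-6.02927744, 3.57123907, -5.26568906, -4.24321775, -6.0607117]

# The sigmoid gives P(positive class); the negative-class probability is the rest.
positive = [1.0 / (1.0 + math.exp(-z)) for z in decisions]
negative = [1.0 - p for p in positive]

for n, p in zip(negative, positive):
    print(round(n, 8), round(p, 8))
```

Each printed pair should line up with one row of the predict_proba() output shown earlier.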

Logistic Regression (Multiclass Classification)

By default, LogisticRegression penalizes the squared coefficients, just like ridge regression. This kind of regularization is called L2 regularization.
  • Ridge regression controls the amount of regularization with the alpha parameter: the larger alpha is, the stronger the regularization.
  • In LogisticRegression, the parameter that controls regularization is C.
  • C works in the opposite direction to alpha: the smaller it is, the stronger the regularization. The default is 1; to relax the regularization we raise it to 20. The code also sets max_iter=1000 so that the iterative solver has enough iterations to converge (the default is 100).
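To see why a small C means strong regularization, here is a toy sketch. It assumes an objective of the form w^2 / (2C) plus the summed log loss (equivalent to the common form with C multiplying the loss), uses hypothetical one-dimensional data, and fits with plain gradient descent. It is an illustration only, not scikit-learn's solver.

```python
import math

# Hypothetical 1-D data with labels in {-1, +1}; not the fish dataset.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [-1, -1, 1, -1, 1, 1]

def fit_weight(C, lr=0.005, steps=5000):
    """Gradient descent on  w^2 / (2C) + sum(log(1 + exp(-y * (w*x + b))))."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w, grad_b = w / C, 0.0  # the L2 penalty acts on the weight only
        for x, y in zip(xs, ys):
            s = 1.0 / (1.0 + math.exp(y * (w * x + b)))  # sigmoid(-y * z)
            grad_w -= y * x * s
            grad_b -= y * s
        w -= lr * grad_w
        b -= lr * grad_b
    return w

# Smaller C -> stronger penalty -> the learned weight is pulled toward 0.
for C in (0.01, 1.0, 20.0):
    print(C, round(fit_weight(C), 3))
```

The learned weight grows as C increases, which is the opposite of how alpha behaves in ridge regression.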
lr = LogisticRegression(C=20, max_iter=1000)
lr.fit(train_scaled, train_target)

print(lr.score(train_scaled, train_target))
print(lr.score(test_scaled, test_target))
0.9327731092436975
0.925
  • This time the scores on both the training and test sets are high, and the model does not seem to lean toward either overfitting or underfitting.
  • Next, let's print the predictions for the first 5 samples in the test set.
print(lr.predict(test_scaled[:5]))
['Perch' 'Smelt' 'Pike' 'Roach' 'Perch']
  • Let's print the predicted probabilities for the first 5 samples in the test set, rounded to three decimal places.
proba = lr.predict_proba(test_scaled[:5])
print(np.round(proba, decimals=3))
[[0.    0.014 0.841 0.    0.136 0.007 0.003]
 [0.    0.003 0.044 0.    0.007 0.946 0.   ]
 [0.    0.    0.034 0.935 0.015 0.016 0.   ]
 [0.011 0.034 0.306 0.007 0.567 0.    0.076]
 [0.    0.    0.904 0.002 0.089 0.002 0.001]]
  • Five rows are printed because these are predictions for 5 samples.
  • From top to bottom, the rows correspond to the predictions 'Perch', 'Smelt', 'Pike', 'Roach', 'Perch'.
print(lr.classes_)
['Bream' 'Parkki' 'Perch' 'Pike' 'Roach' 'Smelt' 'Whitefish']
  • Printing the classes_ attribute shows the class information.
  • The columns are in the order 'Bream', 'Parkki', 'Perch', 'Pike', 'Roach', 'Smelt', 'Whitefish'.
For example, the first sample is predicted as 'Perch' with the highest probability (84.1%), and the third sample as 'Pike' (93.5%).
  • So what does the linear equation look like for multiclass classification? Let's print the shapes of coef_ and intercept_.
print(lr.coef_.shape, lr.intercept_.shape)
(7, 5) (7,)
  • This data uses 5 features, so the coef_ array has 5 columns. It has 7 rows, and intercept_ has 7 values as well.
  • In other words, the z value we saw in binary classification is computed 7 times.
  • Multiclass classification computes one z value per class, and the class with the largest z becomes the predicted class.
  • How are the probabilities computed, then? Binary classification uses the sigmoid function to convert z into a probability between 0 and 1.
  • Multiclass classification instead uses the softmax function, which converts all 7 z values into probabilities at once.

The Softmax Function

The softmax function applies the exponential function to the z values. Let's walk through the calculation.
  • Name the seven z values z1, z2, ..., z7.
  • Apply the exponential function to each of z1 to z7 and add the results: e_sum = e^z1 + e^z2 + ... + e^z7.
  • Then divide each of e^z1, ..., e^z7 by e_sum. The seven results are the class probabilities, and they add up to 1.

  • As in the binary case, we first obtain z1 to z7 with the decision_function() method and then convert them to probabilities with the softmax function.
  • Let's compute z1 to z7 for the first 5 samples in the test set.
# Print the z values: 7 linear functions (multiclass), so 7 decision values (z) for each of the 5 samples
decision = lr.decision_function(test_scaled[:5])
print(np.round(decision, decimals=2))
[[ -6.5    1.03   5.16  -2.73   3.34   0.33  -0.63]
 [-10.86   1.93   4.77  -2.4    2.98   7.84  -4.26]
 [ -4.34  -6.23   3.17   6.49   2.36   2.42  -3.87]
 [ -0.68   0.45   2.65  -1.19   3.26  -5.75   1.26]
 [ -6.4   -1.99   5.82  -0.11   3.5   -0.11  -0.71]]
  • SciPy also provides the softmax function. We import softmax() from scipy.special.
from scipy.special import softmax
proba = softmax(decision, axis=1)
print(np.round(proba, decimals=3))
[[0.    0.014 0.841 0.    0.136 0.007 0.003]
 [0.    0.003 0.044 0.    0.007 0.946 0.   ]
 [0.    0.    0.034 0.935 0.015 0.016 0.   ]
 [0.011 0.034 0.306 0.007 0.567 0.    0.076]
 [0.    0.    0.904 0.002 0.089 0.002 0.001]]
  • The decision array computed above is passed to the softmax() function.
  • The axis parameter of softmax() sets the axis along which softmax is computed. Here axis=1 computes softmax for each row, that is, for each sample.
  • If the axis parameter is not given, softmax is computed over the entire array.
  • The result matches the proba array printed earlier, so the calculation is correct.
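The softmax steps can also be written out by hand for a single row. This sketch uses the rounded z values of the fourth test sample from the output above, so the probabilities agree with the printed ones only to about two decimal places. Subtracting the maximum before exponentiating is a common stability trick added here; it does not change the result.

```python
import math

def softmax(zs):
    """Exponentiate each z and divide by the sum so the results add up to 1."""
    shift = max(zs)  # subtracting the max avoids overflow without changing the result
    exps = [math.exp(z - shift) for z in zs]
    e_sum = sum(exps)
    return [e / e_sum for e in exps]

classes = ["Bream", "Parkki", "Perch", "Pike", "Roach", "Smelt", "Whitefish"]
# Rounded z values of the fourth test sample, copied from the output above
z = [-0.68, 0.45, 2.65, -1.19, 3.26, -5.75, 1.26]

proba = softmax(z)
print([round(p, 3) for p in proba])
# The class with the largest z is also the class with the largest probability.
print(classes[z.index(max(z))], classes[proba.index(max(proba))])
```

Both lookups give 'Roach', which agrees with the model's prediction for the fourth sample.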

Keywords

  • Logistic regression is a classification algorithm that uses a linear equation. Unlike linear regression, it can output class probabilities by applying the sigmoid or softmax function.
  • Multiclass classification is a classification problem with three or more target classes. For multiclass problems, logistic regression uses the softmax function to predict the class.
  • The sigmoid function squashes the output of a linear equation into a value between 0 and 1 and is used for binary classification.
  • The softmax function normalizes the outputs of several linear equations in multiclass classification so that they sum to 1.

Key Packages and Functions

scikit-learn

  • LogisticRegression is the class for logistic regression, a linear classification algorithm.
  • The solver parameter selects the algorithm to use. The default is 'lbfgs'. 'sag', added in scikit-learn 0.17, is a stochastic average gradient descent algorithm that is fast when both the number of features and the number of samples are large. scikit-learn 0.19 added 'saga', an improved version of 'sag'.
  • The penalty parameter selects between L2 regularization (ridge style) and L1 regularization (lasso style). The default is 'l2', meaning L2 regularization.
  • The C parameter controls the strength of regularization. The default is 1.0, and smaller values mean stronger regularization.
  • The predict_proba() method returns predicted probabilities. For binary classification it returns, for each sample, the probabilities of the negative and positive classes. For multiclass classification it returns, for each sample, the probabilities of all classes.
  • decision_function() returns the output of the linear equation the model learned. For binary classification it returns the z value of the positive class, not a probability: if the value is greater than 0 the positive class is predicted, and if it is less than or equal to 0 the negative class is predicted. For multiclass classification a linear equation is computed for each class, and the class with the largest value becomes the predicted class.
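The sign rule for binary decision_function() output can be sketched directly. The z values and class order are copied from the binary bream/smelt example earlier in this post; the list comprehension is an illustration of the rule, not scikit-learn's code.

```python
classes = ["Bream", "Smelt"]  # classes_ of the binary model; "Smelt" is the positive class
decisions = [-6.02927744, 3.57123907, -5.26568906, -4.24321775, -6.0607117]

# z > 0 -> positive class (index 1); z <= 0 -> negative class (index 0)
predictions = [classes[1] if z > 0 else classes[0] for z in decisions]
print(predictions)
```

The resulting list matches what predict() returned for the same five samples.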