[DL] Neural Networks
This time we will look at neural networks.
  • A neural network is one of the computing approaches used in artificial intelligence and machine learning.
  • Its design was inspired by the biological neural networks in human and animal brains.
  • It mimics the way biological neurons send signals to one another.

Perceptron (ํผ์…‰ํŠธ๋ก )๊ณผ Neural Network(์‹ ๊ฒฝ๋ง)

Perceptron(ํผ์…‰ํŠธ๋ก )๊ณผ Neural Network(์‹ ๊ฒฝ๋ง)์€ ๊ณตํ†ต์ ์ด ๋งŽ์Šต๋‹ˆ๋‹ค. ๊ทธ๋ž˜์„œ ๋‹ค๋ฅธ์ ์„ ์ค‘์ ์œผ๋กœ ๋ณด๋ฉด์„œ ์„ค๋ช…ํ•ด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

Neural Network(์‹ ๊ฒธ๋ง) ์˜ˆ์‹œ ๊ทธ๋ฆผ

  • ์‹ ๊ฒธ๋ง๋ฅผ ๊ทธ๋ฆผ์œผ๋กœ ๋‚˜ํƒ€๋‚ด๋ฉด ์œ„์˜ ๊ทธ๋ฆผ์ฒ˜๋Ÿผ ๋‚˜์˜ต๋‹ˆ๋‹ค.
  • ๋งจ ์™ผ์ชฝ์€ Input Layer(์ž…๋ ฅ์ธต), ์ค‘๊ฐ„์ธต์€ Hidden layer(์€๋‹‰์ธต), ์˜ค๋ฅธ์ชฝ์€ Output Layer(์ถœ๋ ฅ์ธต)์ด๋ผ๊ณ  ํ•ฉ๋‹ˆ๋‹ค.
  • Hidden layer(์€๋‹‰์ธต)์˜ Neuron(๋‰ด๋Ÿฐ)์€ ์‚ฌ๋žŒ ๋ˆˆ์—๋Š” ๋ณด์ด์ง€ ์•Š์•„์„œ '์€๋‹‰' ์ด๋ผ๋Š” ํ‘œํ˜„์„ ์”๋‹ˆ๋‹ค.
  • ์œ„์˜ ๊ทธ๋ฆผ์—๋Š” ์ž…๋ ฅ์ธต -> ์ถœ๋ ฅ์ธต ๋ฐฉํ–ฅ์œผ๋กœ 0~2์ธต์ด๋ผ๊ณ  ํ•ฉ๋‹ˆ๋‹ค. ์ด์œ ๋Š” Python ๋ฐฐ์—ด Index๋„ 0๋ถ€ํ„ฐ ์‹œ์ž‘ํ•˜๋ฉฐ, ๋‚˜์ค‘์— ๊ตฌํ˜„ํ•  ๋•Œ ์ง์ง“๊ธฐ ํŽธํ•˜๊ธฐ ๋•Œ๋ฌธ์ž…๋‹ˆ๋‹ค.
์œ„์˜ ์‹ ๊ฒธ๋ง์€ 3๊ฐœ์˜ Layer(์ธต)์œผ๋กœ ๊ตฌ์„ฑ๋˜์ง€๋งŒ, Weight(๊ฐ€์ค‘์น˜)๋ฅผ ๊ฐ€์ง€๋Š” Layer(์ธต)๋Š” 2๊ฐœ์—ฌ์„œ '2์ธต ์‹ ๊ฒฝ๋ง' ์ด๋ผ๊ณ  ํ•ฉ๋‹ˆ๋‹ค.
* ์ด ๊ธ€์—์„œ๋Š” ์‹ค์ œ๋กœ Weight(๊ฐ€์ค‘์น˜)๋ฅผ ๊ฐ€์ง€๋Š” Layer(์ธต)์˜ ๊ฐœ์ˆ˜ [Input Layer(์ž…๋ ฅ์ธต), Hidden layer(์€๋‹‰์ธต), Output Layer(์ถœ๋ ฅ์ธต)
  • ์ด๋ ‡๊ฒŒ ์‹ ๊ฒฝ๋ง์— ๋ฐํ•œ ์„ค๋ช…์„ ๋๋‚ด๊ณ , ํ•œ๋ฒˆ ์‹ ๊ฒฝ๋ง์—์„œ ์‹ ํ˜ธ๋ฅผ ์–ด๋–ป๊ฒŒ ๋ณด๋‚ด๋Š”์ง€ ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

Perceptron(ํผ์…‰ํŠธ๋ก ) ๋ณต์Šต

Neural Network(์‹ ๊ฒธ๋ง)์˜ ์‹ ํ˜ธ ์ „๋‹ฌ ๋ฐฉ๋ฒ•์„ ๋ณด๊ธฐ ์ „์— ํ•œ๋ฒˆ ๋‹ค์‹œ Perceptron(ํผ์…‰ํŠธ๋ก )์„ ๋ณด๊ฒŸ์Šต๋‹ˆ๋‹ค.

Perceptron(ํผ์…‰ํŠธ๋ก )

  • ์œ„์˜ Perceptron(ํผ์…‰ํŠธ๋ก )์€ x1, x2๋ผ๋Š” ๋‘ ์‹ ํ˜ธ๋ฅผ ์ž…๋ ฅ๋ฐ›์•„์„œ y๋ฅผ ์ถœ๋ ฅํ•˜๋Š” Perceptron(ํผ์…‰ํŠธ๋ก ) ์ž…๋‹ˆ๋‹ค.
  • ์ˆ˜์‹ํ™” ํ•˜๋ฉด ์•„๋ž˜์˜ ์‹๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

Perceptron์„ ์ˆ˜์‹ํ™”

  • Here b is the bias parameter, which controls how easily the neuron activates.
  • w1 and w2 are the weight parameters of each signal, controlling how much influence each signal has.
  • If we show the bias explicitly, the perceptron can be drawn as in the figure below.

Bias(ํŽธํ–ฅ)์„ ๋ช…์‹œํ•œ ํผ์…‰ํŠธ๋ก 

  • ์œ„์˜ Perceptron(ํผ์…‰ํŠธ๋ก )์€ Weight(๊ฐ€์ค‘์น˜)๊ฐ€ b์ด๊ณ , ์ž…๋ ฅ์ด 1์ธ Neuron(๋‰ด๋Ÿฐ)์ด ์ถ”๊ฐ€๋˜์—ˆ์Šต๋‹ˆ๋‹ค.
  • ์ด Perceptron(ํผ์…‰ํŠธ๋ก )์˜ ๋™์ž‘์€ x1, x2, 1์ด๋ผ๋Š” 3๊ฐœ์˜ ์‹ ํ˜ธ๊ฐ€ Neuron(๋‰ด๋Ÿฐ)์— ์ž…๋ ฅ๋˜์–ด, ๊ฐ ์‹ ํ˜ธ์— Weight(๊ฐ€์ค‘์น˜)๋ฅผ ๊ณฑํ•œ ํ›„, ๋‹ค์Œ Neuron(๋‰ด๋Ÿฐ)์— ์ „๋‹ฌ๋ฉ๋‹ˆ๋‹ค.
  • ๋‹ค์Œ Neuron(๋‰ด๋Ÿฐ)์—์„œ๋Š” ์ด ์‹ ํ˜ธ๋“ค์˜ ๊ฐ’์„ ๋”ํ•˜์—ฌ, ๊ทธ ํ•ฉ์ด 0์„ ๋„˜์œผ๋ฉด 1์„ ์ถœ๋ ฅํ•˜๊ณ , ๊ทธ๋ ‡์ง€ ์•Š์œผ๋ฉด 0์„ ์ถœ๋ ฅํ•ฉ๋‹ˆ๋‹ค.
  • ์ฐธ๊ณ ๋กœ, bias(ํŽธํ–ฅ)์˜ input ์‹ ํ˜ธ๋Š” ํ•ญ์ƒ 1์ด๊ธฐ ๋•Œ๋ฌธ์— ๊ทธ๋ฆผ์—์„œ๋Š” ํ•ด๋‹น Neuron(๋‰ด๋Ÿฐ)์€ ๋‹ค๋ฅธ ๋‰ด๋Ÿฐ์˜ ์ƒ‰์„ ๋‹ค๋ฅด๊ฒŒ ํ•˜์—ฌ์„œ ๊ตฌ๋ณ„ํ–ˆ์Šต๋‹ˆ๋‹ค.
  • ์œ„์˜ Perceptron(ํผ์…‰ํŠธ๋ก )์„ ์ˆ˜์‹ํ™” ํ•˜๋ฉด ์•„๋ž˜์˜ ์‹๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

Perceptron(ํผ์…‰ํŠธ๋ก )์ˆ˜์‹ํ™” - Bias(ํŽธํ–ฅ) ์ถ”๊ฐ€

  • ์œ„์˜ ์™ผ์ชฝ์˜ ์‹์€ ์กฐ๊ฑด ๋ถ„๊ธฐ์˜ ๋™์ž‘ - 0์„ ๋„˜์œผ๋ฉด 1์„ ์ถœ๋ ฅ, ๊ทธ๋ ‡์ง€ ์•Š์œผ๋ฉด 0์„ ์ถœ๋ ฅํ•˜๋Š”๊ฒƒ์„ ํ•˜๋‚˜์˜ ํ•จ์ˆ˜๋กœ ๋‚˜ํƒ€๋‚ด์—ˆ์Šต๋‹ˆ๋‹ค.
  • ์˜ค๋ฅธ์ชฝ์˜ ์‹์€ ์™ผ์ชฝ์˜ ์ˆ˜์‹์„ ํ•˜๋‚˜์˜ ํ•จ์ˆ˜ h(x)๋ผ๊ณ  ํ•˜๊ณ , ์ž…๋ ฅ ์‹ ํ˜ธ์˜ ์ดํ•ฉ์ด ํ•จ์ˆ˜๋ฅผ ๊ฑฐ์ณ ๋ฐ˜ํ™˜๋œํ›„, ๊ทธ ๋ณ€ํ™˜๋œ ๊ฐ’์ด y์˜ ์ถœ๋ ฅ์ด ๋จ์„ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค.
  • ๊ทธ๋ฆฌ๊ณ  ํ•จ์ˆ˜์˜ ์ž…๋ ฅ์ด 0์„ ๋„˜์œผ๋ฉด 1์„ ์ถœ๋ ฅ, ๊ทธ๋ ‡์ง€ ์•Š์œผ๋ฉด 0์„ ์ถœ๋ ฅํ•ฉ๋‹ˆ๋‹ค. ๊ฒฐ๊ณผ์ ์œผ๋กœ ์™ผ์ชฝ, ์˜ค๋ฅธ์ชฝ์˜ ์ˆ˜์‹์ด ํ•˜๋Š”์ผ์€ ๊ฐ™์Šต๋‹ˆ๋‹ค.

Perceptron(ํผ์…‰ํŠธ๋ก ) ์—์„œ์˜ Activation Function(ํ™œ์„ฑํ™” ํ•จ์ˆ˜) ์ฒ˜๋ฆฌ๊ณผ์ • 

  • We just met the function h(x). A function like this, which converts the sum of the input signals into an output signal, is called an activation function.
  • The activation function decides whether the total input signal triggers the neuron's activation.

ํ™œ์„ฑํ™” ํ•จ์ˆ˜์˜ ์ˆ˜์‹

  • ์œ„์˜ ์ˆ˜์‹์€ Weight(๊ฐ€์ค‘์น˜)๊ฐ€ ๊ณฑํ•ด์ง„ ์ž…๋ ฅ ์‹ ํ˜ธ์˜ ์ดํ•ฉ์„ ๊ณ„์‚ฐํ•˜๊ณ , ๊ทธ ํ•ฉ์„ Activation Function(ํ™œ์„ฑํ™” ํ•จ์ˆ˜)์— ์ž…๋ ฅํ•ด ๊ฒฐ๊ณผ๋ฅผ ๋‚ด๋Š” 2๋‹จ๊ณ„๋กœ ์ฒ˜๋ฆฌ๋ฉ๋‹ˆ๋‹ค. ๊ทธ๋ž˜์„œ ์ด ์‹์€ 2๋‹จ๊ณ„์˜ ์‹์œผ๋กœ ๋‚˜๋ˆŒ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  • ์œ„์˜ ์ˆ˜์‹์ค‘์˜ ์ฒซ๋ฒˆ์งธ ์‹(์œ„์˜ ์‹)์€ Weight(๊ฐ€์ค‘์น˜)๊ฐ€ ๋‹ฌ๋ฆฐ ์ž…๋ ฅ ์‹ ํ˜ธ์™€ Bias(ํŽธํ–ฅ)์˜ ์ดํ•ฉ์„ ๊ณ„์‚ฐํ•œ ๊ฒฐ๊ณผ๋ฅผ 'a' ๋ผ๊ณ  ํ•ฉ๋‹ˆ๋‹ค.
  • ๊ทธ๋ฆฌ๊ณ  ๋‘๋ฒˆ์งธ ์‹(์•„๋ž˜์˜ ์‹)์€ Weight(๊ฐ€์ค‘์น˜)๊ฐ€ ๋‹ฌ๋ฆฐ ์ž…๋ ฅ ์‹ ํ˜ธ์™€ Bias(ํŽธํ–ฅ)์˜ ์ดํ•ฉ์„ ๊ณ„์‚ฐํ•œ ๊ฒฐ๊ณผ์ธ 'a'๋ฅผ ํ•จ์ˆ˜ h()์— ๋„ฃ์–ด y๋ฅผ ์ถœ๋ ฅํ•˜๋Š” ํ๋ฆ„์ž…๋‹ˆ๋‹ค.

  • Perceptron(ํผ์…‰ํŠธ๋ก )์—์„œ Neuron(๋‰ด๋Ÿฐ)์„ ํฐ ์›์œผ๋กœ ๋ณด๋ฉด ์˜ค๋ฅธ์ชฝ์˜ ์‹(ํ™œ์„ฑํ™” ํ•จ์ˆ˜์˜ ์ˆ˜์‹)์€ ํฐ Neuron(๋‰ด๋Ÿฐ)์•ˆ์—์„œ์˜ ํ™œ์„ฑํ™” ํ•จ์ˆ˜์˜ ์ˆ˜์‹์„ ํฐ Neuron(๋‰ด๋Ÿฐ)์•ˆ์— ์ฒ˜๋ฆฌ๊ณผ์ •์„ ์‹œ๊ฐํ™” ํ–ˆ์Šต๋‹ˆ๋‹ค.
  • ์ฆ‰, Weight(๊ฐ€์ค‘์น˜) ์‹ ํ˜ธ๋ฅผ ์กฐํ•ฉํ•œ ๊ฒฐ๊ณผ๊ฐ€ a๋ผ๋Š” Node(๋…ธ๋“œ), Activation Function(ํ™œ์„ฑํ™” ํ•จ์ˆ˜) h()๋ฅผ ํ†ต๊ณผํ•˜์—ฌ y๋ผ๋Š” Node(๋…ธ๋“œ)๋กœ ๋ณ€ํ™˜๋˜๋Š” ๊ณผ์ •์ด ๋‚˜ํƒ€๋‚˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์™ผ์ชฝ์€ ์ผ๋ฐ˜์ ์ธ ๋‰ด๋Ÿฐ, ์˜ค๋ฅธ์ชฝ์€ Activation(ํ™œ์„ฑํ™”) ์ฒ˜๋ฆฌ ๊ณผ์ •์„ ๋ช…์‹œํ•œ ๋‰ด๋Ÿฐ์ž…๋‹ˆ๋‹ค. (a๋Š” ์ž…๋ ฅ ์‹ ํ˜ธ์˜ ์ดํ•ฉ, h()๋Š” ํ™œ์„ฑํ™” ํ•จ์ˆ˜, y๋Š” ์ถœ๋ ฅ)


Activation Functions

  • An activation function that switches its output at a threshold is called a step function.
  • So we can say that the perceptron uses a step function as its activation function.
  • In other words, of the many functions that could serve as the activation function, the perceptron uses the step function. What happens if we use a different function instead? Let's find out.

Sigmoid Function(์‹œ๊ทธ๋ชจ์ด๋“œ ํ•จ์ˆ˜)

  • The sigmoid function is an activation function frequently used in neural networks.

Sigmoid Function(์‹œ๊ทธ๋ชจ์ด๋“œ ํ•จ์ˆ˜) ์ˆ˜์‹

  • exp(-x) means e raised to the power -x, where e is Euler's number, the real constant 2.7182....
  • A neural network uses the sigmoid as an activation function to transform signals and passes the transformed signal on to the next neuron.
  • In fact, the main difference between the perceptron we saw earlier and the neural network is this activation function.
  • Everything else - the structure of neurons connected across multiple layers and the way signals are passed - is the same as in the perceptron.

Implementing the Step Function

  • ์ด๋ฒˆ์—๋Š” ํ•œ๋ฒˆ Step Function(๊ณ„์‚ฐ ํ•จ์ˆ˜)์„ ํ•œ๋ฒˆ ๊ตฌํ˜„ํ•ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.
  • ์ž…๋ ฅ์ด 0์„ ๋„˜์œผ๋ฉด 1์„ ์ถœ๋ ฅํ•˜๊ณ , ๊ทธ ์™ธ์—๋Š” 0์„ ์ถœ๋ ฅํ•˜๋Š” ํ•จ์ˆ˜์ž…๋‹ˆ๋‹ค.
def step_function(x):
    if x > 0:
        return 1
    else:
        return 0
  • ์œ„์˜ ์ฝ”๋“œ์—์„œ๋Š” ์ธ์ˆ˜ x๋Š” Float(์‹ค์ˆ˜)ํ˜•๋งŒ ๋ฐ›์•„๋“ค์ž…๋‹ˆ๋‹ค. ๊ทธ๋ž˜์„œ numpy ๋ฐฐ์—ด์˜ ์ธ์ˆ˜๋Š” ๋„ฃ์„์ˆ˜๊ฐ€ ์—†์Šต๋‹ˆ๋‹ค.
  • step_function(np.array([1.0, 2.0]))๋Š” ์•ˆ๋ฉ๋‹ˆ๋‹ค. ๋งŒ์•ฝ์— numpy ๋ฐฐ์—ด๋„ ์ง€์›ํ•˜๊ธฐ ์œ„ํ•ด์„œ๋Š” ์•„๋ž˜์˜ ์ฝ”๋“œ์ฒ˜๋Ÿผ ์ˆ˜์ •ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
# step_function implemented with NumPy
def step_function(x):
    y = x > 0
    return y.astype(np.int64)
  • ์•„๋ž˜์˜ ์ฝ”๋“œ๋Š” x๋ผ๋Š” Numpy ๋ฐฐ์—ด์— ๋ถ€๋“ฑํ˜ธ ์—ฐ์‚ฐ์„ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค.
>>> import numpy as np
>>> x = np.array([-1.0, 1.0, 2.0]) # NumPy array
>>> x
array([-1., 1., 2.])
>>> y = x > 0 # elementwise comparison
>>> y
array([False, True, True], dtype=bool) # data type = bool
  • Applying a comparison operator to a NumPy array runs the comparison on each element and produces a bool array.
  • Here, each element of x greater than 0 becomes True and each element 0 or below becomes False, yielding the new bool array y.
  • But the step function we want outputs an int, 0 or 1, so we convert the elements of y from bool to int.
>>> y = y.astype(np.int64)
>>> y
array([0, 1, 1])
  • Numpy ๋ฐฐ์—ด์˜ ์ž๋ฃŒํ˜•์„ ๋ณ€ํ™˜ํ•  ๋•Œ๋Š” astype() Method๋ฅผ ์ด์šฉํ•ฉ๋‹ˆ๋‹ค. ์›ํ•˜๋Š” ์ž๋ฃŒํ˜•์„ ์ธ์ˆ˜๋กœ ์ง€์ •ํ•˜๋ฉด ๋ฉ๋‹ˆ๋‹ค.
  • ๋˜ํ•œ Python์—์„œ boolํ˜•์„ intํ˜•์œผ๋กœ ๋ณ€ํ™˜ํ•˜๋ฉด True๋Š” 1๋กœ, False๋Š” 0์œผ๋กœ ๋ณ€ํ™˜๋ฉ๋‹ˆ๋‹ค.

Step Function(๊ณ„๋‹จ ํ•จ์ˆ˜)์˜ ๊ทธ๋ž˜ํ”„

  • ์•ž์—์„œ ์ •์˜ํ•œ Step Function(๊ณ„์‚ฐ ํ•จ์ˆ˜)๋ฅผ ๊ทธ๋ž˜ํ”„๋กœ ๊ทธ๋ ค๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค. ์ด๋•Œ ์šฐ๋ฆฌ๊ฐ€ ์ง€๋‚œ๋ฒˆ์— ๋ดค์—ˆ๋˜ Matplotlib ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.
# coding: utf-8
import numpy as np
import matplotlib.pylab as plt


def step_function(x):
    return np.array(x > 0, dtype=np.int64)

X = np.arange(-5.0, 5.0, 0.1)
Y = step_function(X)
plt.plot(X, Y)
plt.ylim(-0.1, 1.1)  # set the y-axis range
plt.show()
  • np.arange(-5.0, 5.0, 0.1) creates a NumPy array from -5.0 up to (but not including) 5.0 in steps of 0.1, i.e. [-5.0, -4.9, ..., 4.9].
  • step_function() applies the step function to each element of the array it receives and returns the results as an array; plotting them produces the figure below.

Step Function(๊ณ„์‚ฐ ํ•จ์ˆ˜)์˜ ๊ทธ๋ž˜ํ”„

  • ์œ„์˜ ๊ทธ๋ฆผ์—์„œ ๋ณด๋ฉด Step Function - ๊ณ„๋‹จ ํ•จ์ˆ˜๋Š” 0์„ ๊ฒฝ๊ณ„๋กœ ์ถœ๋ ฅ์ด 0์—์„œ 1, ๋˜๋Š” 1์—์„œ 0์œผ๋กœ ๋ด๋€๋‹ˆ๋‹ค.
  • ์ด๋ ‡๊ฒŒ ๋ด๋€Œ๋Š” ํ˜•ํƒœ๊ฐ€ ๊ณ„๋‹จ์ฒ˜๋Ÿผ ๋ณด์—ฌ์„œ Step Function ์ด๋ผ๊ณ  ํ•ฉ๋‹ˆ๋‹ค.

Sigmoid Function(์‹œ๊ทธ๋ชจ์ด๋“œ ํ•จ์ˆ˜) ๊ตฌํ˜„ํ•˜๊ธฐ

  • Now let's implement the sigmoid function in Python.
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
  • Here np.exp(-x) corresponds to exp(-x) in the formula.
  • The thing to remember is that the function returns the correct result even when the argument x is a NumPy array.
>>> x = np.array([-1.0, 1.0, 2.0])
>>> sigmoid(x)
array([0.26894142, 0.73105858, 0.88079708])
  • The reason this function can also handle NumPy arrays lies in NumPy's broadcasting.
  • We explained it earlier, but let's go over it once more.

BroadCast (๋ธŒ๋กœ๋“œ์บ์ŠคํŠธ)

  • Numpy์—์„œ๋Š” ํ˜•์ƒ์ด ๋‹ค๋ฅธ ๋ฐฐ์—ด๋ผ๋ฆฌ๋„ ๊ณ„์‚ฐ์ด ๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค.
  • ์˜ˆ๋ฅผ ๋“ค์–ด์„œ 2 * 2 ํ–‰๋ ฌ A์— Scaler๊ฐ’ 10์„ ๊ณฑํ–ˆ์Šต๋‹ˆ๋‹ค. ์ด๋•Œ 10์ด๋ผ๋Š” Scaler๊ฐ’์ด 2 * 2 ํ–‰๋ ฌ๋กœ ํ™•๋Œ€๋œ ํ›„ ์—ฐ์‚ฐ์ด ์ด๋ค„์ง‘๋‹ˆ๋‹ค.
  • ์ด๋Ÿฌํ•œ ๊ธฐ๋Šฅ์„ Broadcast (๋ธŒ๋กœ๋“œ์บ์ŠคํŠธ)๋ผ๊ณ  ํ•ฉ๋‹ˆ๋‹ค.

# broadcasting example
>>> A = np.array([[1, 2], [3, 4]])
>>> B = np.array([10, 20])
>>> A * B
array([[10, 40],
       [30, 80]])
  • In the broadcasting example above, the 1-dimensional array B is stretched to the same shape as the 2-dimensional array A, and then the elementwise multiplication runs.

Broadcast Example


Numpy์˜ ์›์†Œ ์ ‘๊ทผ

  • ์›์†Œ์˜ Index๋Š” 0๋ถ€ํ„ฐ ์‹œ์ž‘ํ•ฉ๋‹ˆ๋‹ค. Index๋กœ ์›์†Œ์— ์ ‘๊ทผํ•˜๋Š” ์˜ˆ์‹œ๋Š” ์•„๋ž˜์˜ ์ฝ”๋“œ์™€ ๊ฐ™์Šต๋‹ˆ๋‹ค.
>>> X = np.array([[51, 55], [14, 19], [0, 4]])
>>> print(X)
[[51 55]
 [14 19]
 [ 0  4]]
>>> X[0] # row 0
array([51, 55])
>>> X[0][1] # element at position (0, 1)
55
  • for๋ฌธ์œผ๋กœ๋„ ๊ฐ ์›์†Œ์— ์ ‘๊ทผํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
>>> for row in X:
...	    print(row)
...
[51 55]
[14 19]
[0 4]
  • Alternatively, you can specify the indices as an array and access several elements at once.
  • This technique also lets you fetch only the elements that satisfy a given condition.
>>> X = X.flatten() # flatten X into a 1-dimensional array
>>> print(X)
[51 55 14 19  0  4]
>>> X[np.array([0, 2, 4])] # get the elements at indices 0, 2, and 4
array([51, 14, 0])
  • ์•„๋ž˜์˜ ์ฝ”๋“œ๋Š” ๋ฐฐ์—ด์—์„œ X์—์„œ 15 ์ด์ƒ์ธ ๊ฐ’๋งŒ ๊ตฌํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
>>> X > 15
array([True, True, False, True, False, False], dtype=bool)
>>> X[X>15]
array([51, 55, 19])
  • Numpy ๋ฐฐ์—ด์— ๋ถ€๋“ฑํ˜ธ ์—ฐ์‚ฐ์ž๋ฅผ ์‚ฌ์šฉํ•œ ๊ฒฐ๊ณผ๋Š” bool ๋ฐฐ์—ด์ž…๋‹ˆ๋‹ค. (์—ฌ๊ธฐ์„œ๋Š” X>15)
  • ์—ฌ๊ธฐ์„œ๋Š” ์ด bool ๋ฐฐ์—ด์„ ์‚ฌ์šฉํ•ด ๋ฐฐ์—ด X์—์„œ True์— ํ•ด๋‹นํ•˜๋Š” ์›์†Œ, ์ฆ‰ ๊ฐ’์ด 15๋ณด๋‹ค ํฐ ์›์†Œ๋งŒ ๊บผ๋‚ด๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.

Sigmoid Function(์‹œ๊ทธ๋ชจ์ด๋“œ ํ•จ์ˆ˜) ๊ทธ๋ž˜ํ”„ ๋งŒ๋“ค์–ด๋ณด๊ธฐ

  • The plotting code is nearly identical to the step-function plot from the previous section.
  • The only difference is that the function producing y is now the sigmoid.
X = np.arange(-5.0, 5.0, 0.1)
Y = sigmoid(X)
plt.plot(X, Y)
plt.ylim(-0.1, 1.1)
plt.show()

Sigmoid Function (์‹œ๊ทธ๋ชจ์ด๋“œ ํ•จ์ˆ˜) ๊ทธ๋ž˜ํ”„ ์‹œ๊ฐํ™”

  • Let's pause to compare the step function and the sigmoid function.
  • The step function returns only one of two values, 0 or 1, while the sigmoid returns a continuous real number.
Summary: in a perceptron, 0s and 1s flow between neurons; in a neural network, continuous real numbers flow.
  • As for what they share: in both, the output approaches 0 (or equals 0) as the input shrinks, and approaches 1 (or equals 1) as the input grows.
  • That is, both the step function and the sigmoid output a large value when the input is important and a small value when it is not, and however small or large the input, the output stays between 0 and 1; the overlay sketched below shows both curves together.
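To see the two shapes side by side, here is a small plotting sketch; it assumes the step_function and sigmoid definitions from above:

import numpy as np
import matplotlib.pylab as plt

def step_function(x):
    return np.array(x > 0, dtype=np.int64)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

X = np.arange(-5.0, 5.0, 0.1)
plt.plot(X, step_function(X), linestyle='--', label='step')
plt.plot(X, sigmoid(X), label='sigmoid')
plt.ylim(-0.1, 1.1)
plt.legend()
plt.show()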

Non-Linear Functions

  • The step function and the sigmoid function share one more property: both are non-linear functions.
  • A non-linear function is, literally, a function that is 'not linear' - one whose graph cannot be drawn as a single straight line.
  • A neural network must use a non-linear function as its activation function.
  • The reason is that with a linear activation function, making the network deeper has no benefit.
    • The problem with a linear function is that however many layers you stack, there is a 'network with no hidden layers' that does exactly the same job. To see why, take the linear function h(x) = cx as the activation function of a 3-layer network:
y(x) = h(h(h(x))) = c * c * c * x. The multiplication runs three times, yet this equals y(x) = ax with a = c³.
  • Because the network collapses to one without hidden layers, a linear activation function throws away the advantage of building multiple layers.
  • So if you want to stack layers, the activation function must be non-linear; the quick numeric check below illustrates the collapse.
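A minimal numeric check of the collapse argument (c = 2.0 is an arbitrary example constant):

c = 2.0

def h(x):
    return c * x  # a linear 'activation'

x = 5.0
print(h(h(h(x))))  # 40.0 - three stacked linear layers
print(c ** 3 * x)  # 40.0 - identical to one layer with a = c**3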

The ReLU Function

  • So far we have introduced the step function and the sigmoid function as activation functions.
  • Recently, however, the ReLU (Rectified Linear Unit) function has become the usual choice.
  • ReLU outputs its input unchanged when the input exceeds 0, and outputs 0 when the input is 0 or below.
  • As a formula it can be written as below.

ReLU formula: h(x) = x (x > 0); h(x) = 0 (x <= 0)
Plot of the ReLU function

# ReLUํ•จ์ˆ˜ ์‹œ๊ฐํ™” ์ฝ”๋“œ

import numpy as np
import matplotlib.pyplot as plt

def relu(x):
    return np.maximum(0, x)  # return the elementwise larger of the input x and 0

X = np.arange(-5.0, 5.0, 0.1)
Y = relu(X)
plt.plot(X, Y)
plt.ylim(-1.0, 6.0)  # adjust the y-axis range
plt.show()  # display the plot
  • ์œ„์˜ ์ฝ”๋“œ์—์„œ๋Š” Numpy์˜ Maximum ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ–ˆ์Šต๋‹ˆ๋‹ค. Maximum ํ•จ์ˆ˜๋Š” ๋‘ ์ž…๋ ฅ ์ค‘ ํฐ ๊ฐ’์„ ์„ ํƒํ•ด ๋ฐ˜ํ™˜ํ•˜๋Š” ํ•จ์ˆ˜์ž…๋‹ˆ๋‹ค.

๋‹ค์ฐจ์› ๋ฐฐ์—ด์˜ ๊ณ„์‚ฐ

  • Numpy์˜ ๋‹ค์ฐจ์› ๋ฐฐ์—ด์„ ์‚ฌ์šฉํ•œ ๊ณ„์‚ฐ๋ฒ•์„ ์ˆ™๋‹ฌํ•˜๋ฉด ์‹ ๊ฒฝ๋ง์„ ํšจ์œจ์ ์œผ๋กœ ๊ตฌํ˜„ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  • ๋‹ค์ฐจ์› ๋ฐฐ์—ด๋„ ๊ธฐ๋ณธ์€ '์ˆซ์ž์˜ ์ง‘ํ•ฉ'์ž…๋‹ˆ๋‹ค. ์ˆซ์ž๋ฅผ ํ•œ์ค„, ์ง์‚ฌ๊ฐํ˜•, 3์ฐจ์›์œผ๋กœ ๋Š˜์—ฌ๋†“์€๊ฒƒ๋“ค๋„ ๋‹ค ๋‹ค์ฐจ์› ๋ฐฐ์—ด์ด๋ผ๊ณ  ํ•ฉ๋‹ˆ๋‹ค.
  • ํ•œ๋ฒˆ 1์ฐจ์› ๋ฐฐ์—ด์„ ์˜ˆ์‹œ๋กœ ํ•œ๋ฒˆ ๋ณด๋„๋ก ํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค.

1์ฐจ์› ๋ฐฐ์—ด Example Code (by Python)

>>> import numpy as np
>>> A = np.array([1, 2, 3, 4])
>>> print(A)
[1 2 3 4]
>>> np.ndim(A) # number of dimensions
1
>>> A.shape
(4,)
>>> A.shape[0]
4
  • ๋ฐฐ์—ด์˜ ์ฐจ์›์ˆ˜๋Š” np.ndim() ํ•จ์ˆ˜๋กœ ํ™•์ธํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋˜ ๋ฐฐ์—ด์˜ ํ˜•์ƒ์€ Instance ๋ณ€์ˆ˜์ธ shape์œผ๋กœ ์•Œ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  • ์œ„์˜ ์˜ˆ์‹œ์—์„œ๋Š” A๋Š” 1์ฐจ์› ๋ฐฐ์—ด์ด๊ณ  ์›์†Œ 4๊ฐœ๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค. ๊ทผ๋ฐ, A.shape์ด ํŠœํ”Œ์„ ๋ฐ˜ํ™˜ํ•˜๋Š”๊ฒƒ์— ์ฃผ์˜ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.
  • ์ด์œ ๋Š” 1์ฐจ์› ๋ฐฐ์—ด์ด๋ผ๋„ ๋‹ค์ฐจ์› ๋ฐฐ์—ด์ผ๋•Œ์™€ ํ†ต์ผ๋œ ํ˜•ํƒœ๋กœ ๊ฒฐ๊ณผ๋ฅผ ๋ฐ˜ํ™˜ํ•˜๊ธฐ ๋•Œ๋ฌธ์ž…๋‹ˆ๋‹ค.
  • 2์ฐจ์› ๋ฐฐ์—ด์ผ๋•Œ์˜ (4, 3), 3์ฐจ์› ๋ฐฐ์—ด์ผ ๋•Œ๋Š” (4, 3, 2)๊ฐ™์€ ํŠœํ”Œ์„ ๋ฐ˜ํ™˜ํ•ฉ๋‹ˆ๋‹ค.

2์ฐจ์› ๋ฐฐ์—ด Example Code (by Python)

>>> B = np.array([[1,2], [3,4], [5,6]])
>>> print(B)
[[1 2]
 [3 4]
 [5 6]]
>>> np.ndim(B)
2
>>> B.shape
(3, 2)
  • ์ด๋ฒˆ์—๋Š” '3 X 2' ๋ฐฐ์—ด์ธ B๋ฅผ ์ž‘์„ฑํ–ˆ์Šต๋‹ˆ๋‹ค. 3 X 2 ๋ฐฐ์—ด์€ ์ฒ˜์Œ Dimension(์ฐจ์›)์—๋Š” ์›์†Œ๊ฐ€ 3๊ฐœ, ๋‹ค์Œ ์ฐจ์›์—๋Š” ์›์†Œ๊ฐ€ 2๊ฐœ ์žˆ์Šต๋‹ˆ๋‹ค.
  • ์ด๋•Œ ์ฒ˜์Œ ์ฐจ์›์€ 0๋ฒˆ์งธ ์ฐจ์›, ๋‹ค์Œ ์ฐจ์›์€ 1๋ฒˆ์งธ ์ฐจ์›์— ๋Œ€์‘ํ•ฉ๋‹ˆ๋‹ค. (Python index๋Š” 0๋ถ€ํ„ฐ ์‹œ์ž‘)
  • 2์ฐจ์› ๋ฐฐ์—ด์€ ํŠนํžˆ ํ–‰๋ ฌ Matrix์ด๋ผ๊ณ  ๋ถ€๋ฅด๊ณ  ์•„๋ž˜์˜ ๊ทธ๋ฆผ๊ณผ ๊ฐ™์ด ๋ฐฐ์—ด์˜ ๊ฐ€๋กœ ๋ฐฉํ–ฅ์„ Row(ํ–‰), ์„ธ๋กœ ๋ฐฉํ–ฅ์„ Column(์—ด) ์ด๋ผ๊ณ  ํ•ฉ๋‹ˆ๋‹ค.

2์ฐจ์› ๋ฐฐ์—ด(ํ–‰๋ ฌ)์˜ ํ–‰(๊ฐ€๋กœ)์™€ ์—ด(์„ธ๋กœ)

ํ–‰๋ ฌ์˜ ๊ณฑ ๊ณ„์‚ฐ

  • ํ–‰๋ ฌ์˜ ๊ณฑ์„ ๊ณ„์‚ฐํ•˜๋Š” ๋ฐฉ๋ฒ•์€ ์•„๋ž˜ ๊ธ€์— ์˜ฌ๋ผ์™€ ์žˆ์œผ๋‹ˆ๊นŒ ์ฐธ๊ณ  ๋ถ€ํƒ๋“œ๋ฆฝ๋‹ˆ๋‹ค!
 

[DL] Numpy & ํ–‰๋ ฌ์— ๋ฐํ•˜์—ฌ ์•Œ์•„๋ณด๊ธฐ - daehyun-bigbread.tistory.com

์‹ ๊ฒฝ๋ง์—์„œ์˜ ํ–‰๋ ฌ ๊ณฑ

  • Numpy ํ–‰๋ ฌ์„ ์จ์„œ ํ–‰๋ ฌ์˜ ๊ณฑ์œผ๋กœ ์‹ ๊ฒฝ๋ง์„ ํ•œ๋ฒˆ ๊ตฌํ˜„ํ•ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค. ์•„๋ž˜์˜ ๊ทธ๋ฆผ์ฒ˜๋Ÿผ ๊ฐ„๋‹จํ•œ ์‹ ๊ฒฝ๋ง์„ ๊ตฌ์„ฑํ•ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

ํ–‰๋ ฌ์˜ ๊ณฒ์œผ๋กœ ์‹ ๊ฒฝ๋ง์˜ ๊ณ„์‚ฐ์„ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค.

  • ์ด ๊ตฌํ˜„์—์„œ๋„ X, W, Y์˜ ํ˜•์ƒ์„ ์ฃผ์˜ํ•ด์„œ ๋ด์•ผํ•ฉ๋‹ˆ๋‹ค.
  • ํŠนํžˆ X, W์˜ ๋Œ€์‘ํ•˜๋Š” ์ฐจ์›์˜ ์›์†Œ ์ˆ˜๊ฐ€ ๊ฐ™์•„์•ผ ํ•œ๋‹ค๋Š” ๊ฑธ ์žŠ์ง€ ๋ง์•„์•ผ ํ•ฉ๋‹ˆ๋‹ค.
>>> X = np.array([1, 2])
>>> X.shape
(2,)
>>> W = np.array([[1, 3, 5], [2, 4, 6]])
>>> print(W)
[[1 3 5]
 [2 4 6]]
>>> W.shape
(2, 3)
>>> Y = np.dot(X, W)
>>> print(Y)
[5 11 17]
  • ๋‹ค์ฐจ์› ๋ฐฐ์—ด์˜ Scaler๊ณฑ์„ ๊ณฑํ•ด์ฃผ๋Š” np.dot ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜๋ฉด ์ด์ฒ˜๋Ÿผ ๋‹จ๋ฒˆ์— ๊ฒฐ๊ณผ Y๋ฅผ ๊ณ„์‚ฐ ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. Y์˜ ๊ฐœ์ˆ˜๊ฐ€ 100๊ฐœ๋“ , 1000๊ฐœ๋“ ...
  • ๋งŒ์•ฝ์— np.dot์„ ์‚ฌ์šฉํ•˜์ง€ ์•Š์œผ๋ฉด Y์˜ ์›์†Œ๋ฅผ ํ•˜๋‚˜์”ฉ ๋”ฐ์ ธ๋ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ์•„๋‹ˆ๋ฉด for๋ฌธ์„ ์‚ฌ์šฉํ•ด์„œ ๊ณ„์‚ฐํ•ด์•ผ ํ•˜๋Š”๋ฐ ๊ท€์ฐฎ์Šต๋‹ˆ๋‹ค..
  • ๊ทธ๋ž˜์„œ ํ–‰๋ ฌ์˜ ๊ณฑ์œผ๋กœ ํ•œ๊บผ๋ฒˆ์— ๊ณ„์‚ฐํ•ด์ฃผ๋Š” ๊ธฐ๋Šฅ์€ ์‹ ๊ฒฝ๋ง์„ ๊ตฌํ˜„ํ•  ๋•Œ ๋งค์šฐ ์ค‘์š”ํ•˜๋‹ค๊ณ  ๋งํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

Implementing a 3-Layer Neural Network

  • ์ด๋ฒˆ์—๋Š” 3์ธต ์‹ ๊ฒฝ๋ง์—์„œ ์ˆ˜ํ–‰๋˜๋Š” ์ž…๋ ฅ๋ถ€ํ„ฐ ์ถœ๋ ฅ๊นŒ์ง€์˜ ์ฒ˜๋ฆฌ(์ˆœ๋ฐฉํ–ฅ ์ฒ˜๋ฆฌ)๋ฅผ ๊ตฌํ˜„ํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค.
  • ๊ทธ๋Ÿด๋ ค๋ฉด Numpy & ๋‹ค์ฐจ์› ๋ฐฐ์—ด์„ ์‚ฌ์šฉํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ์—ฌ๊ธฐ์„œ Numpy ๋ฐฐ์—ด์„ ์ž˜ ์“ฐ๋ฉด ์ ์€ ์ฝ”๋“œ๋กœ ์‹ ๊ฒฝ๋ง์˜ ์ˆœ๋ฐฉํ–ฅ ์ฒ˜๋ฆฌ๋ฅผ ์™„์„ฑํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

3์ธต ์‹ ๊ฒฝ๋ง: ์ž…๋ ฅ์ธต(0์ธต)์€ 2๊ฐœ, ์ฒซ๋ฒˆ์งธ ์€๋‹‰์ธต(1์ธต)์€ 3๊ฐœ, ๋‘๋ฒˆ์งธ ์€๋‹‰์ธต(2์ธต)์€ 2๊ฐœ, ์ถœ๋ ฅ์ธต(3์ธต)์€ 2๊ฐœ์˜ ๋‰ด๋Ÿฐ์œผ๋กœ ๊ตฌ์„ฑ๋œ๋‹ค.

์‹ ๊ฒฝ๋ง์—์„œ์˜ ํ‘œ๊ธฐ๋ฒ• 

์‹ ๊ฒฝ๋ง์—์„œ์˜ ํ‘œ๊ธฐ๋ฒ•

  • In the figure, the weights and the hidden-layer neurons carry a '(1)' at the upper right.
  • This superscript marks them as the weights and neurons of layer 1.
  • The two numbers at a weight's lower right are indices: the first is the number of the neuron in the next layer, the second the number of the neuron in the preceding layer.

๊ฐ Layer(์ธต)์˜ ์‹ ํ˜ธ ์ „๋‹ฌ ๊ตฌํ˜„ํ•˜๊ธฐ

Input Layer -> Layer 1

์ž…๋ ฅ์ธต์—์„œ 1์ธต์œผ๋กœ ์‹ ํ˜ธ ์ „๋‹ฌ

  • ์œ„์˜ ๊ทธ๋ฆผ์—๋Š” Bias(ํŽธํ–ฅ)์„ ์˜๋ฏธํ•˜๋Š” Neuron(๋‰ด๋Ÿฐ)์ธ 1์ด ์ถ”๊ฐ€๋˜์—ˆ์Šต๋‹ˆ๋‹ค. ๊ทธ๋ฆฌ๊ณ  Bias(ํŽธํ–ฅ)์€ ์˜ค๋ฅธ์ชฝ ์•„๋ž˜ Index๊ฐ€ ํ•˜๋‚˜๋ฐ–์— ์—…๋‹ค๋Š”๊ฒƒ์— ์ฃผ์˜ํ•˜์„ธ์š”. ์ด๋Š” ์•ž ์ธต์˜ ํŽธํ–ฅ ๋‰ด๋Ÿฐ(1)์ด ํ•˜๋‚˜๋ฟ์ด๊ธฐ ๋•Œ๋ฌธ์ž…๋‹ˆ๋‹ค.
  • ๊ทธ๋Ÿฌ๋ฉด ์ด์ œ ์•Œ๊ณ ์žˆ๋Š” ์‚ฌ์‹ค๋“ค์„ ๋ฐ˜์˜ํ•˜์—ฌ ์ˆ˜์‹์œผ๋กœ ๋‚˜ํƒ€๋‚ด๊ณ , ๊ฐ„์†Œํ™” ํ•ด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

์™ผ: ์ˆ˜์‹, ์˜ค: ํ–‰๋ ฌ์˜ ๊ณฑ์„ ์ด์šฉํ•˜์—ฌ 1์ธต์˜ 'Weight(๊ฐ€์ค‘์น˜)'๋ถ€๋ถ„ ๊ฐ„์†Œํ™”

  • Weight(๊ฐ€์ค‘์น˜)๋ถ€๋ถ„์„ ๊ฐ„์†Œํ™” ํ–ˆ์„๋•Œ, ํ–‰๋ ฌ A, X, B, W๋Š” ๊ฐ๊ฐ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

  • ๊ทธ๋Ÿฌ๋ฉด ์ด์ œ Numpy์˜ ๋‹ค์ฐจ์› ๋ฐฐ์—ด์„ ์‚ฌ์šฉํ•ด์„œ ์ฝ”๋“œ๋กœ ๊ตฌํ˜„ํ•ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.
# set the input signals, weights, and biases to arbitrary values
X = np.array([1.0, 0.5])
W1 = np.array([[0.1, 0.3, 0.5], [0.2, 0.4, 0.6]])
B1 = np.array([0.1, 0.2, 0.3])

print(W1.shape) # (2,3)
print(X.shape) # (2,)
print(B1.shape) # (3,)

A1 = np.dot(X, W1) + B1
  • ์ด ๊ณ„์‚ฐ์€ W1์€ 2 * 3 ํ–‰๋ ฌ, X๋Š” ์›์†Œ๊ฐ€ 2๊ฐœ์ธ 1์ฐจ์› ๋ฐฐ์—ด์ž…๋‹ˆ๋‹ค. ์—ฌ๊ธฐ์„œ๋„ W1๊ณผ X์— ๋Œ€์‘ํ•˜๋Š” ์ฐจ์›์˜ ์›์†Œ ์ˆ˜๊ฐ€ ์ผ์น˜ํ•˜๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.

  • Hidden Layer์—์„œ์˜ Weight(๊ฐ€์ค‘์น˜)์˜ ํ•ฉ(๊ฐ€์ค‘ ์‹ ํ˜ธ + ํŽธํ–ฅ)์„ a๋กœ ํ‘œ๊ธฐํ•˜๊ณ  Activation Function(ํ™œ์„ฑํ™” ํ•จ์ˆ˜) h()๋กœ ๋ณ€ํ™˜๋œ ์‹ ๋กœ๋ฅผ z๋กœ ํ‘œ๊ธฐํ•ฉ๋‹ˆ๋‹ค.
  • ์—ฌ๊ธฐ์„  Activation Function(ํ™œ์„ฑํ™” ํ•จ์ˆ˜)์„ Sigmoid ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

Input Layer -> Layer 1: Python Code Example

Z1 = sigmoid(A1)

print(A1) # [0.3, 0.7, 1.1]
print(Z1) # [0.57444252, 0.66818777, 0.75026011]

Layer 1 (Hidden) -> Layer 2 (Hidden)

1์ธต์—์„œ 2์ธต์œผ๋กœ์˜ ์‹ ํ˜ธ ์ „๋‹ฌ

  • ์—ฌ๊ธฐ์„œ๋Š” 1์ธต์˜ Hidden Layer์˜ ์ถœ๋ ฅ Z1์ด 2์ธต์˜ Hidden Layer์˜ Input์ด ๋œ๋‹ค๋Š”์ ๋งŒ ์ œ์™ธํ•˜๋ฉด ์•ž์˜์„œ์˜ ๊ตฌํ˜„๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.
  • ์ด๋ ‡๊ฒŒ Numpy ๋ฐฐ์—ด์„ ์‚ฌ์šฉํ•˜๋ฉด Layer(์ธต) ์‚ฌ์ด์˜ ์‹ ํ˜ธ ์ „๋‹ฌ์„ ์‰ฝ๊ฒŒ ๊ตฌํ˜„ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

Layer 1 (Hidden) -> Layer 2 (Hidden): Python Code Example

W2 = np.array([[0.1, 0.4], [0.2, 0.5], [0.3, 0.6]])
B2 = np.array([0.1, 0.2])

print(Z1.shape) # (3,)
print(W2.shape) # (3,2)
print(B2.shape) # (2,)

A2 = np.dot(Z1, W2) + B2
Z2 = sigmoid(A2)

Layer 2 (Hidden) -> Output Layer

2์ธต์—์„œ ์ถœ๋ ฅ์ธต์œผ๋กœ์˜ ์‹ ํ˜ธ ์ „๋‹ฌ

  • Here we define the identity function identity_function() and use it as the output layer's activation function.
  • The identity function returns its input unchanged, so we don't strictly need to define identity_function() in this example, but doing so keeps the flow consistent with the earlier layers.
  • The output layer's activation function is written σ() (sigma) to make explicit that it differs from the hidden layers' activation function h().

Layer 2 (Hidden) -> Output Layer: Python Code Example

def identity_function(x):
    return x
    
W3 = np.array([[0.1, 0.3], [0.2, 0.4]])
B3 = np.array([0.1, 0.2])

A3 = np.dot(Z2, W3) + B3
Y = identity_function(A3)

Putting the Implementation Together

  • That completes the walkthrough of the 3-layer network; now let's gather everything implemented so far.
  • Following neural-network convention, only the weights are written in upper case (W1, W2, ...); the biases and intermediate results are all lower case.
def init_network():
    network = {}
    network['W1'] = np.array([[0.1, 0.3, 0.5], [0.2, 0.4, 0.6]])
    network['b1'] = np.array([0.1, 0.2, 0.3])
    network['W2'] = np.array([[0.1, 0.4], [0.2, 0.5], [0.3, 0.6]])
    network['b2'] = np.array([0.1, 0.2])
    network['W3'] = np.array([[0.1, 0.3], [0.2, 0.4]])
    network['b3'] = np.array([0.1, 0.2])

    return network

def forward(network, x):
    W1, W2, W3 = network['W1'], network['W2'], network['W3']
    b1, b2, b3 = network['b1'], network['b2'], network['b3']

    a1 = np.dot(x, W1) + b1
    z1 = sigmoid(a1)
    a2 = np.dot(z1, W2) + b2  # layer 1's output z1 feeds layer 2
    z2 = sigmoid(a2)
    a3 = np.dot(z2, W3) + b3
    y = identity_function(a3)

    return y

network = init_network()
x = np.array([1.0, 0.5])
y = forward(network, x)
print(y)
  • Here we defined two functions, init_network() and forward().
  • init_network() initializes the weights and biases and stores them in the dictionary variable network.
  • This dictionary variable network holds every parameter (weights and biases) each layer needs.
  • forward() implements the whole process of transforming an input signal into an output.
  • The name forward() signals that the signal travels in the forward direction, from input to output (forward propagation).

Designing the Output Layer

Neural Network(์‹ ๊ฒฝ๋ง)์€ Classification(๋ถ„๋ฅ˜), Regression(ํšŒ๊ท€)์— ๋ชจ๋‘ ์ด์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  • ๋‹ค๋งŒ, ๋‘˜ ์ค‘ ์–ด๋–ค ๋ฌธ์ œ๋ƒ์— ๋”ฐ๋ผ์„œ Output Layer(์ถœ๋ ฅ์ธต)์—์„œ ์‚ฌ์šฉํ•˜๋Š” Activation Function(ํ™œ์„ฑํ™” ํ•จ์ˆ˜)์ด ๋‹ฌ๋ผ์ง‘๋‹ˆ๋‹ค.
  • ์ผ๋ฐ˜์ ์œผ๋กœ Regression(ํšŒ๊ท€)์—๋Š” ํ•ญ๋“ฑํ•จ์ˆ˜๋ฅผ, Classification(๋ถ„๋ฅ˜)์—๋Š” Softmax ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

Implementing the Identity Function & the Softmax Function

  • The identity function outputs its input as-is.
  • 'Identity' means input and output are always the same, so using the identity function in the output layer passes the input signal through unchanged.
  • In a network diagram, the identity function's transformation looks like the figure below; as with the hidden layers' activation functions, it is drawn as an arrow.

Identity function

  • Classification(๋ถ„๋ฅ˜)์—์„œ ์‚ฌ์šฉํ•˜๋Š” Softmax Function(์†Œํ”„ํŠธ๋งฅ์Šค ํ•จ์ˆ˜)์˜ ์‹์€ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

Softmax function formula: yk = exp(ak) / Σ(i=1..n) exp(ai)

  • Here exp(x) is the exponential function, e to the power x (e is Euler's number).
  • n is the number of neurons in the output layer, and yk is the k-th output.
  • The numerator of the softmax is the exponential of the input signal ak, and the denominator is the sum of the exponentials of all the input signals.
  • Every softmax output receives an arrow from every input signal: as the denominator of the formula shows, each output neuron is affected by all of the inputs.

Softmax ํ•จ์ˆ˜ ์‹œ๊ฐํ™”

  • Now let's implement the softmax function.
>>> a = np.array([0.3, 2.9, 4.0])
>>> exp_a = np.exp(a) # exponentials
>>> print(exp_a)
[ 1.34985881 18.17414537 54.59815003]

>>> sum_exp_a = np.sum(exp_a) # sum of the exponentials
>>> print(sum_exp_a)
74.1221542102

>>> y = exp_a / sum_exp_a
>>> print(y)
[ 0.01821127 0.24519181 0.73659691]
  • ์ด๋ฒˆ์—๋Š” ์ด ํ๋ฆ„์„ Python ํ•จ์ˆ˜๋กœ ์ •์˜ํ•ด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.
def softmax(a):
    exp_a = np.exp(a)
    sum_exp_a = np.sum(exp_a)
    y = exp_a / sum_exp_a

    return y

 

A Caution When Implementing Softmax

  • The softmax() function above has a flaw when computed on an actual machine: overflow.
  • Softmax uses the exponential function, which grows enormous: e^10 exceeds 20,000, e^100 is a number more than 40 digits long, and e^1000 overflows to inf (infinity).
  • Dividing such huge values by one another makes the result numerically 'unstable'.
  • To fix this, let's improve the softmax implementation.

Improved softmax formula: yk = exp(ak) / Σ exp(ai) = C·exp(ak) / (C·Σ exp(ai)) = exp(ak + log C) / Σ exp(ai + log C) = exp(ak + C') / Σ exp(ai + C')

  • C๋ผ๋Š” ์˜๋ฏธ์˜ ์ •์ˆ˜๋ฅผ ๋ถ„์ž์™€ ๋ถ„๋ชจ ์–‘์ชฝ์— ๊ณฑํ–ˆ์Šต๋‹ˆ๋‹ค. ๊ทธ ๋‹ค์Œ์œผ๋กœ C๋ฅผ ์ง€์ˆ˜ ํ•จ์ˆ˜ exp()์•ˆ์œผ๋กœ ์˜ฎ๊ฒจ์„œ logC๋กœ ๋งŒ๋“ญ๋‹ˆ๋‹ค.
  • ๊ทธ๋ฆฌ๊ณ  logC๋ฅผ ์ƒˆ๋กœ์šด ๊ธฐํ˜ธ C๋กœ ๋ด๊ฟ‰๋‹ˆ๋‹ค.
  • ์—ฌ๊ธฐ์„œ ๋งํ•˜๋Š” ๊ฒƒ์€ Softmax์˜ ์ง€์ˆ˜ ํ•จ์ˆ˜๋ฅผ ๊ณ„์‚ฐํ•  ๋•Œ ์–ด๋–ค ์ •์ˆ˜๋ฅผ ๋”ํ•ด๋„ ๊ฒฐ๊ณผ๋Š” ๋ด๋€Œ์ง€ ์•Š์Šต๋‹ˆ๋‹ค.
  • ๊ทธ๋ฆฌ๊ณ  ์ผ๋ฐ˜์ ์œผ๋กœ ์ž…๋ ฅ ์‹ ํ˜ธ์ค‘ ์ตœ๋Œ€๊ฐ’์„ ์ด์šฉํ•˜๋Š”๊ฒƒ์ด ์ผ๋ฐ˜์ ์ž…๋‹ˆ๋‹ค.
>>> a = np.array([1010, 1000, 990])
>>> np.exp(a) / np.sum(np.exp(a)) # computing softmax naively
array([ nan, nan, nan]) # not computed correctly
>>>
>>> c = np.max(a) # c = 1010, the maximum
>>> a - c
array([  0, -10, -20])
>>>
>>> np.exp(a - c) / np.sum(np.exp(a - c))
array([ 9.99954600e-01, 4.53978686e-05, 2.06106005e-09])
  • As shown here, computing softmax with no countermeasure yields nan (short for 'not a number').
  • But subtracting the maximum input signal (c in this example) gives the correct result. Re-implementing softmax along these lines produces the function below.
def softmax(a):
    c = np.max(a)
    exp_a = np.exp(a - c)  # guard against overflow
    sum_exp_a = np.sum(exp_a)
    y = exp_a / sum_exp_a

    return y

Softmax ํ•จ์ˆ˜์˜ ํŠน์ง•

  • Using softmax(), the neural network's output can be computed as follows.
>>> a = np.array([0.3, 2.9, 4.0])
>>> y = softmax(a)
>>> print(y)
[0.01821127, 0.24519181, 0.73659691]
>>> np.sum(y)
1.0
  • Softmax ํ•จ์ˆ˜์˜ ์ถœ๋ ฅ์€ 0 ~ 1.0 ์‚ฌ์ด์˜ ์‹ค์ˆ˜์ž…๋‹ˆ๋‹ค. ๊ทธ๋ฆฌ๊ณ  ํ•จ์ˆ˜ ์ถœ๋ ฅ์˜ ์ดํ•ฉ์€ 1์ž…๋‹ˆ๋‹ค.
  • ์ถœ๋ ฅ ์ดํ•ฉ์ด 1์ด ๋œ๋‹ค๋Š”๊ฑด Softmax ํ•จ์ˆ˜์˜ ์ค‘์š”ํ•œ ์„ฑ์งˆ์ด๋ฉฐ, ์ด ์„ฑ์งˆ ๋•๋ถ„์˜ Softmax ํ•จ์ˆ˜์˜ ์ถœ๋ ฅ์€ 'ํ™•๋ฅ '๋กœ ํ•ด์„ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  • ํ™•๋ฅ ๋กœ ํ•ด์„ํ•  ์ˆ˜๋„ ์žˆ๋Š”๋ฐ y[0]์˜ ํ™•๋ฅ ์€ 1.8%, y[1]์˜ ํ™•๋ฅ ์€ 24.5%, y[2]์˜ ํ™•๋ฅ ์€ 73.7%๋กœ ํ•ด์„ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  • ์—ฌ๊ธฐ์„œ 2๋ฒˆ์งธ ์›์†Œ์˜ ํ™•๋ฅ ์ด ๊ฐ€์žฅ ๋†’์œผ๋‹ˆ ๋‹ต์€ 2๋ฒˆ์งธ ํด๋ž˜์Šค๋ผ๊ณ  ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  • ์ฆ‰, Softmax ํ•จ์ˆ˜๋ฅผ ์ด์šฉํ•ด์„œ ๋ฌธ์ œ๋ฅผ ํ™•๋ฅ (ํ†ต๊ณ„)์ ์œผ๋กœ ๋Œ€์‘ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
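A minimal check of picking the class, assuming the softmax() defined above:

>>> y = softmax(np.array([0.3, 2.9, 4.0]))
>>> print(np.argmax(y)) # index of the most probable class
2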

Output Layer(์ถœ๋ ฅ์ธต)์˜ Neuron(๋‰ด๋Ÿฐ) ์ˆ˜ ์ •ํ•˜๊ธฐ

  • Output Layer(์ถœ๋ ฅ์ธต)์˜ Neuron(๋‰ด๋Ÿฐ)์ˆ˜๋Š” ํ’€๋ ค๋Š” ๋ฌธ์ œ์— ๋งž๊ฒŒ ์ ์ ˆํžˆ ์ •ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.
  • ๋ณดํ†ต ํด๋ž˜์Šค์— ์ˆ˜์— ๋งž๊ฒŒ ์„ค์ •ํ•˜๋Š” ๊ฒƒ์ด ์ผ๋ฐ˜์ ์ž…๋‹ˆ๋‹ค. ์ž…๋ ฅ์ด๋ฏธ์ง€๋ฅผ ์ˆซ์ž 0~9์ค‘ ํ•˜๋‚˜๋กœ ๋ถ„๋ฅ˜ํ•˜๋Š” ๋ฌธ์ œ๋ฉด Output Layer(์ถœ๋ ฅ์ธต)์˜ ๋‰ด๋Ÿฐ์„ 10๊ฐœ๋กœ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค.

์ถœ๋ ฅ์ธ ์˜ ๋‰ด๋Ÿฐ์€ ๊ฐ ์ˆซ์ž์— ๋Œ€์‘ํ•ฉ๋‹ˆ๋‹ค.

  • ์œ„์—์„œ ๋ถ€ํ„ฐ ์ถœ๋ ฅ์ธต ๋‰ด๋Ÿฐ์€ ์ฐจ๋ก€๋กœ ์ˆซ์ž 0~9 ๊นŒ์ง€ ๋Œ€์‘ํ•˜๋ฉฐ, ๋‰ด๋Ÿฐ์˜ ํšŒ์ƒ‰ ๋†๋„๊ฐ€ ํ•ด๋‹น ๋‰ด๋Ÿฐ์˜ ์ถœ๋ ฅ ๊ฐ’์˜ ํฌ๊ธฐ๋ฅผ ์˜๋ฏธํ•ฉ๋‹ˆ๋‹ค.
  • ์—ฌ๊ธฐ์„œ๋Š” ์ƒ‰์ด ๊ฐ€์žฅ ๊น‰์€ y2 ๋‰ด๋Ÿฐ์ด ๊ฐ€์žฅ ํฐ ๊ฐ’์„ ์ถœ๋ ฅํ•ฉ๋‹ˆ๋‹ค. ๊ทธ๋ž˜์„œ ์ด ์‹ ๊ฒฝ๋ง์—์„œ๋Š” y2, ์ž…๋ ฅ ์ด๋ฏธ์ง€์˜ ์ˆซ์ž '2'๋กœ ํŒ๋‹จํ–ˆ์Šต๋‹ˆ๋‹ค.

Example - the MNIST Dataset

Let's apply the neural network we've built up to an example: classifying the handwritten digits of the MNIST dataset.
  • Here we use already-trained parameters, skip the learning step, and implement only the inference step.
  • This inference step is called the network's forward propagation.
  • The MNIST dataset consists of digit images from 0 to 9, with 60,000 training images and 10,000 test images.
  • The training images are used to train a model, and the trained model is then evaluated on how accurately it classifies the test images.

Sample images from the MNIST dataset

  • Mnist์˜ ์ด๋ฏธ์ง€ ๋ฐ์ดํ„ฐ๋Š” 28 * 28์˜ ํšŒ์ƒ‰์กฐ ์ด๋ฏธ์ง€ (1์ฑ„๋„)์ด๋ฉฐ, ๊ฐ ํ”ฝ์…€์€ 0์—์„œ 255๊นŒ์ง€์˜ ๊ฐ’์„ ์ทจํ•ฉ๋‹ˆ๋‹ค.
  • ๋˜ํ•œ ๊ฐ ์ด๋ฏธ์ง€์—๋Š” 7, 2, 1, 3, 5์™€ ๊ฐ™์ด ๊ทธ ์ด๋ฏธ์ง€๊ฐ€ ์…€์ œ ์˜๋ฏธํ•˜๋Š” ์ˆซ์ž๊ฐ€ label๋กœ ๋ถ™์–ด ์žˆ์Šต๋‹ˆ๋‹ค.
  • load_minst ํ•จ์ˆ˜๋Š” ์ฝ์€ MNIST ๋ฐ์ดํ„ฐ๋ฅผ "(ํ›ˆ๋ จ์ด๋ฏธ์ง€, ํ›ˆ๋ จ๋ ˆ์ด๋ธ”), (์‹คํ—˜์ด๋ฏธ์ง€, ์‹คํ—˜๋ ˆ์ด๋ธ”)" ํ˜•์‹์œผ๋กœ ๋ฐ˜ํ™˜ํ•ฉ๋‹ˆ๋‹ค.
  • ์ธ์ˆ˜๋กœ๋Š” normalize, flatten, one_hot_label 3๊ฐœ๊ฐ€ ์žˆ๋Š”๋ฐ ์„ธ์ธ์ˆ˜ ๋ชจ๋‘ bool๊ฐ’์ž…๋‹ˆ๋‹ค.
    • normalize๋Š” ์ด๋ฏธ์ง€์˜ ํ”ฝ์…€ ๊ฐ’์„ 0.0~1.0 ์‚ฌ์ด์˜ ๊ฐ’์œผ๋กœ ์ •๊ทœํ™” ํ• ์ง€๋ฅผ ์ •ํ•ฉ๋‹ˆ๋‹ค. False๋ฉด ์ž…๋ ฅ ์ด๋ฏธ์ง€์˜ ํ”ฝ์…€ ๊ฐ’์€ 0~255์‚ฌ์ด์˜ ์›๋ž˜๊ฐ’์„ ์œ ์ง€ํ•ฉ๋‹ˆ๋‹ค.
    • flatten์€ ์ž…๋ ฅ ์ด๋ฏธ์ง€๋ฅผ 1์ฐจ์› ๋ฐฐ์—ด๋กœ ๋งŒ๋“ค์ง€๋ฅผ ์ •ํ•˜๋ฉฐ, False๋กœ ํ•˜๋ฉด 1 * 28 * 28, 3์ฐจ์› ๋ฐฐ์—ด, True๋กœ ํ•˜๋ฉด 784๊ฐœ์˜ ์›์†Œ๋กœ ์ด๋ฃจ์–ด์ง„ 1์ฐจ์› ๋ฐฐ์—ด๋กœ ์ €์žฅํ•ฉ๋‹ˆ๋‹ค. (28 * 28 = 784)
    • one_hot_label์€ label์„ one_hot_encoding ํ˜•ํƒœ๋กœ ์ €์žฅํ• ์ง€๋ฅผ ์ •ํ•ฉ๋‹ˆ๋‹ค.
      • ์˜ˆ์‹œ๋ฅผ ๊ฐ„๋‹จํ•˜๊ฒŒ ๋“ค์–ด๋ณด๋ฉด [0,0,0,0,1,0,0] ์ด๋Ÿฌ๋ฉด ์ •๋‹ต์„ ๋œปํ•˜๋Š” ์›์†Œ๋Š” '1'์ด๊ณ  ๋‚˜๋จธ์ง€๋Š” '0'์ธ ๋ฐฐ์—ด์ž…๋‹ˆ๋‹ค.
      • ๋งŒ์•ฝ label์ด False๋ฉด '7', '2'๊ฐ™์ด ์ˆซ์ž ํ˜•ํƒœ์˜ label์„ ์ €์žฅํ•˜๊ณ , True์ด๋ฉด label์„ one_hot_encodingํ•˜์—ฌ ์ €์žฅํ•ฉ๋‹ˆ๋‹ค.
import sys
import os
import pickle
import numpy as np
sys.path.append(os.pardir)  # make files in the parent directory importable
from dataset.mnist import load_mnist
from PIL import Image

def img_show(img):
    pil_img = Image.fromarray(np.uint8(img))
    pil_img.show()
    
(x_train, t_train), (x_test, t_test) = \
	load_mnist(flatten=True, normalize=False)

img = x_train[0]
label = t_train[0]
print(label) # 5

print(img.shape) # (784,)
img = img.reshape(28, 28) # reshape to the original image dimensions
print(img.shape) # (28, 28)

img_show(img)
  • ์ด ์ฝ”๋“œ์—์„œ ์•Œ์•„์•ผํ•˜๋Š”๊ฑด flatten=True๋กœ ์„ค์ •ํ•ด ์ฝ์–ด ๋“ค์ธ ์ด๋ฏธ์ง€๋Š” 1์ฐจ์› ๋„˜ํŒŒ์ด ๋ฐฐ์—ด๋กœ ์ €์žฅ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค.
  • ๊ทธ๋ž˜์„œ ์ด๋ฏธ์ง€๋ฅผ ํ‘œ์‹œํ•  ๋•Œ๋Š” ์›๋ž˜ ํ˜•์ƒ์ธ 28 * 28 ํฌ๊ธฐ๋กœ ๋ณ€ํ˜•ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ์ด๋•Œ, reshape() Method์— ์›ํ•˜๋Š” ํ˜•์ƒ์„ ์ธ์ˆ˜๋กœ ์ง€์ •ํ•˜๋ฉด ๋„˜ํŒŒ์ด ๋ฐฐ์—ด์˜ ํ˜•์ƒ์„ ๋ด๊ฟ€์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  • ๊ทธ๋ฆฌ๊ณ  Numpy๋กœ ์ €์žฅ๋œ ์ด๋ฏธ์ง€ ๋ฐ์ดํ„ฐ๋ฅผ PIL์šฉ ๋ฐ์ดํ„ฐ ๊ฐ์ฒด๋กœ ๋ณ€ํ™˜ํ•ด์•ผ ํ•˜๋ฉฐ, ์ด๋ฏธ์ง€ ๋ณ€ํ™˜์€ Image.fromarray() ํ•จ์ˆ˜๊ฐ€ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค.

์‹ ๊ฒฝ๋ง์˜ ์ถ”๋ก  ์ฒ˜๋ฆฌ

  • The network here has 784 input-layer neurons and 10 output-layer neurons.
  • There are 784 input neurons because the image size is 28 * 28 = 784, and 10 output neurons because the problem distinguishes the digits 0 through 9.
  • There are 2 hidden layers: the first holds 50 neurons and the second 100.
def get_data():
    (x_train, t_train), (x_test, t_test) = load_mnist(flatten=True, normalize=True, one_hot_label=False)
    return x_test, t_test


def init_network():
    with open("sample_weight.pkl", 'rb') as f:
        # ํ•™์Šต๋œ ๊ฐ€์ค‘์น˜ ๋งค๊ฐœ๋ณ€์ˆ˜๊ฐ€ ๋‹ด๊ธด ํŒŒ์ผ
        # ํ•™์Šต ์—†์ด ๋ฐ”๋กœ ์ถ”๋ก ์„ ์ˆ˜ํ–‰
        network = pickle.load(f)

    return network


def predict(network, x):
    W1, W2, W3 = network['W1'], network['W2'], network['W3']
    b1, b2, b3 = network['b1'], network['b2'], network['b3']
    a1 = np.dot(x, W1) + b1
    z1 = sigmoid(a1)
    a2 = np.dot(z1, W2) + b2
    z2 = sigmoid(a2)
    a3 = np.dot(z2, W3) + b3
    y = softmax(a3)

    return y
  • init_network() ํ•จ์ˆ˜์—์„œ๋Š” pickle ํŒŒ์ผ์ธ sample_weight.pkl์— ์ €์žฅ๋œ 'ํ•™์Šต๋œ ๊ฐ€์ค‘์น˜ ๋งค๊ฐœ๋ณ€์ˆ˜'ํŒŒ์ผ์„ ์ฝ์Šต๋‹ˆ๋‹ค.
  • ์ด ํŒŒ์ผ ์•ˆ์—๋Š” Weight(๊ฐ€์ค‘์น˜), Bias(ํŽธํ–ฅ) ๋งค๊ฐœ๋ณ€์ˆ˜๊ฐ€ Dictionary ๋ณ€์ˆ˜๋กœ ์ €์žฅ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค.
x, t = get_data()
network = init_network()
accuracy_cnt = 0

for i in range(len(x)):
    y = predict(network, x[i])
    p = np.argmax(y)  # get the index of the element with the highest probability
    if p == t[i]:
        accuracy_cnt += 1

print("Accuracy:" + str(float(accuracy_cnt) / len(x)))  # Accuracy: 0.9352
  • ์œ„์˜ ์ฝ”๋“œ๋Š” ์ •ํ™•๋„๋ฅผ ํŒ๋‹จํ•˜๋Š” ์ฝ”๋“œ์ž…๋‹ˆ๋‹ค.
  • ๋ฐ˜๋ณต๋ฌธ์„ ๋Œ๋ฉด์„œ x์— ์ €์žฅ๋œ ์ด๋ฏธ์ง€ ๋ฐ์ดํ„ฐ๋ฅผ 1์žฅ์”ฉ ๊บผ๋‚ด์„œ predict() ํ•จ์ˆ˜๋กœ ๋ถ„๋ฅ˜ํ•˜๊ณ , ๊ฐ label์˜ ํ™•๋ฅ ์„ numpy ๋ฐฐ์—ด๋กœ ๋ณ€ํ™˜ํ•ฉ๋‹ˆ๋‹ค.
  • ๊ทธ๋ฆฌ๊ณ  np.argmax() ํ•จ์ˆ˜๋กœ ์ด ๋ฐฐ์—ด์—์„œ ๊ฐ€์žฅ ํฐ(ํ™•๋ฅ ๊ฐ’์ด ์ œ์ผ ๋†’์€) ์›์†Œ์˜ index๋ฅผ ๊ตฌํ•ฉ๋‹ˆ๋‹ค.
  • ๊ทธ๋ฆฌ๊ณ  ์‹ ๊ฒฝ๋ง์ด predict(์˜ˆ์ธก)ํ•œ ๋‹ต๋ณ€๊ณผ ์ •๋‹ต label์„ ๋น„๊ตํ•˜์—ฌ ๋งžํžŒ ์ˆซ์ž(accuracy_cnt)๋ฅผ ์„ธ๊ณ , ์ด๋ฅผ ์ „์ฒด ์ด๋ฏธ์ง€ ์ˆซ์ž๋กœ ๋‚˜๋ˆ  ์ •ํ™•๋„๋ฅผ ๊ตฌํ•ฉ๋‹ˆ๋‹ค.

Batch Processing

์‹ ๊ฒฝ๋ง ๊ฐ ์ธต์˜ ๋ฐฐ์—ด ํ˜•์ƒ์˜ ์ถ”์ด

  • ์œ„์˜ ๊ทธ๋ฆผ์„ ๋ณด๋ฉฐ๋Š ์›์†Œ 784๊ฐœ๋กœ ๊ตฌ์„ฑ๋œ 1์ฐจ์› ๋ฐฐ์—ด (์›๋ž˜๋Š” 28 * 28์ธ 2์ฐจ์› ๋ฐฐ์—ด)์ด ์ž…๋ ฅ๋˜์–ด ๋งˆ์ง€๋ง‰์—๋Š” ์›์†Œ๊ฐ€ 10๊ฐœ์ธ 1์ฐจ์› ๋ฐฐ์—ด์ด ์ถœ๋ ฅ๋˜๋Š” ํ๋ฆ„์ž…๋‹ˆ๋‹ค. ์ด๋Š” ์ด๋ฏธ์ง€ ๋ฐ์ดํ„ฐ๋ฅผ 1์žฅ๋งŒ ์ž…๋ ฅํ–ˆ์„ ๋•Œ์˜ ํ๋ฆ„์ž…๋‹ˆ๋‹ค.
  • ๊ทธ๋Ÿฌ๋ฉด ๋งŒ์•ฝ์— ์ด๋ฏธ์ง€๋ฅผ ์—ฌ๋Ÿฌ๊ฐœ๋ฅผ ํ•œ๊บผ๋ฒˆ์— ์ž…๋ ฅํ•˜๋Š” ๊ฒฝ์šฐ๋Š” ์–ด๋–ป๊ฒŒ ๋ ๊นŒ์š”? ์ด๋ฏธ์ง€ 100๊ฐœ๋ฅผ ๋ฌถ์–ด์„œ predict() ํ•จ์ˆ˜์— ํ•œ๋ฒˆ์— ๋„˜๊น๋‹ˆ๋‹ค. x์˜ ํ˜•์ƒ์„ 100 * 784๋กœ ๋ด๊ฟ”์„œ 100์žฅ ๋ถ„๋Ÿ‰์˜ ๋ฐ์ดํ„ฐ๋ฅผ ํ•˜๋‚˜์˜ ์ž…๋ ฅ ๋ฐ์ดํ„ฐ๋กœ ํ‘œํ˜„ํ•˜๋ฉด ๋ฉ๋‹ˆ๋‹ค. ์•„๋ž˜์˜ ๊ทธ๋ฆผ์ฒ˜๋Ÿผ ๋ฉ๋‹ˆ๋‹ค.

Array shapes during batch processing

  • ์œ„์˜ ๊ทธ๋ฆผ๊ณผ ๊ฐ™์ด ์ž…๋ ฅ ๋ฐ์ดํ„ฐ์˜ ํ˜•์ƒ์€ 100 * 784, ์ถœ๋ ฅ ๋ฐ์ดํ„ฐ์˜ ํ˜•์ƒ์€ 100 * 10 ์ž…๋‹ˆ๋‹ค.
  • ์ด๋Š” 100์žฅ ๋ถ„๋Ÿ‰ ์ž…๋ ฅ ๋ฐ์ดํ„ฐ์˜ ๊ฒฐ๊ณผ๊ฐ€ ํ•œ ๋ฒˆ์— ์ถœ๋ ฅ๋จ์„ ๋‚˜ํƒ€๋ƒ…๋‹ˆ๋‹ค. x[0], y[0]์— ์žˆ๋Š” 0๋ฒˆ์งธ ์ด๋ฏธ์ง€์™€ ๊ทธ ์ถ”๋ก ๊ฒฐ๊ณผ๊ฐ€ x[1]๊ณผ y[1]์—๋Š” 1๋ฒˆ์งธ์˜ ์ด๋ฏธ์ง€์™€ ๊ทธ ๊ฒฐ๊ณผ๊ฐ€ ์ €์žฅ๋˜๋Š” ์‹์ž…๋‹ˆ๋‹ค.
  • ์ด๋ ‡๊ฒŒ ํ•˜๋‚˜๋กœ ๋ฌถ์€ ์ž…๋ ฅ ๋ฐ์ดํ„ฐ๋ฅผ Batch(๋ฐฐ์น˜)๋ผ๊ณ  ํ•ฉ๋‹ˆ๋‹ค.
x, t = get_data()
network = init_network()

batch_size = 100 # batch size
accuracy_cnt = 0

for i in range(0, len(x), batch_size):
    x_batch = x[i:i+batch_size]
    y_batch = predict(network, x_batch)
    p = np.argmax(y_batch, axis=1)
    accuracy_cnt += np.sum(p == t[i:i+batch_size])

print("Accuracy:" + str(float(accuracy_cnt) / len(x)))  # Accuracy:0.9352
  • ์œ„์—์„œ range() ํ•จ์ˆ˜๋Š” (start, end)์ฒ˜๋Ÿผ ์ธ์ˆ˜๋ฅผ 2๊ฐœ ์ง€์ •ํ•ด์„œ ํ˜ธ์ถœํ•˜๋ฉด start์—์„œ end-1 ๊นŒ์ง€์˜ ์ •์ˆ˜๋ฅผ ์ฐจ๋ก€๋กœ ๋ฐ˜ํ™˜ํ•˜๋Š” Iterator(๋ฐ˜๋ณต์ž)๋ฅผ ๋Œ๋ ค์ค๋‹ˆ๋‹ค.
  • ๋˜ range(start, end, step)์ฒ˜๋Ÿผ ์ธ์ˆ˜๋ฅผ 3๊ฐœ ์ง€์ •ํ•˜๋ฉด start์—์„œ end-1 ๊นŒ์ง€ step ๊ฐ„๊ฒฉ์œผ๋กœ ์ฆ๊ฐ€ํ•˜๋Š” ์ •์ˆ˜๋ฅผ ๋ฐ˜ํ™˜ํ•˜๋Š” ๋ฐ˜๋ณต์ž๋ฅผ ๋Œ๋ ค์ค๋‹ˆ๋‹ค.
  • ์ด range() ํ•จ์ˆ˜๊ฐ€ ๋ฐ˜๋ณตํ•˜๋Š” ๋ฐ˜๋ณต์ž๋ฅผ ๋ฐ”ํƒ•์œผ๋กœ x[i:i+batch_size]์—์„œ ์ž…๋ ฅ ๋ฐ์ดํ„ฐ๋ฅผ ๋ฌถ์Šต๋‹ˆ๋‹ค.
  • x[i:i+batch_size]์€ ์ž…๋ ฅ ๋ฐ์ดํ„ฐ์˜ i๋ฒˆ์งธ ๋ถ€ํ„ฐ i+batch_size๋ฒˆ์งธ์˜ ๋ฐ์ดํ„ฐ๋ฅผ ๋ฌถ๋Š”๋‹ค๋Š” ์˜๋ฏธ์ž…๋‹ˆ๋‹ค.
  • ์—ฌ๊ธฐ์„œ๋Š” batch_size๊ฐ€ 100์ด๋ฏ€๋กœ x[0:100], x[100:200] ์ฒ˜๋Ÿผ ์•ž์—์„œ ๋ถ€ํ„ฐ 100์žฅ์”ฉ ๋ฌถ์–ด ๊บผ๋‚ด๊ฒŒ ๋ฉ๋‹ˆ๋‹ค.
  • ๊ทธ๋ฆฌ๊ณ  argmax()๋Š” ์ตœ๋Œ€๊ฐ’์˜ index๋ฅผ ๊ฐ€์ ธ์˜ต๋‹ˆ๋‹ค. 
  • ๋งˆ์ง€๋ง‰์œผ๋กœ '==' ์—ฐ์‚ฐ์ž๋ฅผ ์ด์šฉํ•ด์„œ Numpy ๋ฐฐ์—ด๊ณผ ๋น„๊ตํ•˜์—ฌ True, False๋กœ ๊ตฌ์„ฑ๋œ Bool ๋ฐฐ์—ด๋กœ ๋งŒ๋“ค๊ณ , ์ด ๊ฒฐ๊ณผ ๋ฐฐ์—ด์—์„œ True๊ฐ€ ๋ช‡๊ฐœ์ธ์ง€ ์ƒ™๋‹ˆ๋‹ค.
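A minimal sketch of the two NumPy idioms used above, with made-up toy values:

>>> y_batch = np.array([[0.1, 0.8, 0.1], [0.3, 0.1, 0.6]])
>>> np.argmax(y_batch, axis=1) # row-wise index of the maximum
array([1, 2])
>>> t = np.array([1, 0]) # toy labels
>>> np.sum(np.argmax(y_batch, axis=1) == t) # count the matches
1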

Summary

- ์‹ ๊ฒฝ๋ง์—์„œ ํ™œ์„ฑํ™” ํ•จ์ˆ˜๋กœ ์‹œ๊ทธ๋ชจ์ด๋“œ ํ•จ์ˆ˜, ReLU ํ•จ์ˆ˜ ๊ฐ™์ด ๋งค๋„๋Ÿฝ๊ฒŒ ๋ณ€ํ™”ํ•˜๋Š” ํ•จ์ˆ˜๋ฅผ ์ด์šฉํ•œ๋‹ค.
- Numpy์˜ ๋‹ค์ฐจ์› ๋ฐฐ์—ด์„ ์ž˜ ์‚ฌ์šฉํ•˜๋ฉด ์‹ ๊ฒฝ๋ง์„ ํšจ์œจ์ ์œผ๋กœ ๊ตฌํ˜„ํ•  ์ˆ˜ ์žˆ๋‹ค.
- Maching Learning ๋ฌธ์ œ๋Š” ํฌ๊ฒŒ ํšŒ๊ท€์™€ ๋ถ„๋ฅ˜๋กœ ๋‚˜๋ˆŒ์ˆ˜ ์žˆ๋‹ค.
- ์ถœ๋ ฅ์ธต์˜ ํ™œ์„ฑํ™” ํ•จ์ˆ˜ - ํšŒ๊ท€์—์„œ๋Š” ์ฃผ๋กœ ํ•ญ๋“ฑ ํ•จ์ˆ˜, ๋ถ„๋ฅ˜์—์„œ๋Š” Softmax ํ•จ์ˆ˜๋ฅผ ์ด์šฉํ•œ๋‹ค.
- ๋ถ„๋ฅ˜์—์„œ๋Š” ์ถœ๋ ฅ์ธ ์œผ์ด ๋‰ด๋Ÿฐ ์ˆ˜๋ฅผ ๋ถ„๋ฅ˜ํ•˜๋ ค๋Š” ํด๋ž˜์Šค ์ˆ˜์™€ ๊ฐ™๊ฒŒ ์„ค์ •ํ•œ๋‹ค.
- ์ž…๋ ฅ ๋ฐ์ดํ„ฐ๋ฅผ ๋ฌถ์€ ๊ฒƒ์„ ๋ฐฐ์น˜๋ผ๊ณ  ํ•˜๋ฉฐ, ์ถ”๋ก  ์ฒ˜๋ฆฌ๋ฅผ ์ด ๋ฐฐ์น˜ ๋‹จ์œ„๋กœ ์ง„ํ–‰ํ•˜๋ฉด ๊ฒฐ๊ณผ๋ฅผ ํ›จ์”ฌ ๋น ๋ฅด๊ฒŒ ์–ป์„ ์ˆ˜ ์žˆ๋‹ค.