[NLP] Word2Vec, CBOW, Skip-Gram - Concepts & Model

1. What is Word2Vec?

Word2Vec is a popular algorithm for converting words into vectors. Here, a "word" usually means a token.
  • It is an unsupervised learning algorithm designed to learn vectors that represent the semantic relationships between words (tokens) well in a vector space.
  • It works by predicting each word from its surrounding words (the context) or, conversely, by predicting the surrounding words from each word.
    • As an analogy, just as a model learns from images as arrays of numbers, it learns from words as vectors.

  • In this way, Word2Vec captures the semantic relationships between words.

Word2Vec captures semantic relationships between words

  • To train a model on the sentence in the figure above, each word (token) first has to be converted into a form that can be computed on, that is, into a vector.

 

While studying Word2Vec, I came across the claim that it captures semantic relationships by "inferring" the surrounding words.
  • Since the main goal of Word2Vec is to learn vectors that represent word meaning as well as possible, it is closer to "representation learning" than to "inference".
    • Inference comes in afterwards: once words are vectorized with Word2Vec, the vectors can be used to classify sentences or documents, or to compute the similarity between words.

 

As an example, Word2Vec vectors support addition and subtraction.
  • With Word2Vec, 'king' - 'man' + 'woman' = 'queen'
  • That is, evaluating the left-hand side gives a vector that is close to the vector of the word on the right-hand side.
  • This arithmetic works because, as explained above, the vectors are learned so that they reflect the similarity between words.
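To make the arithmetic concrete, here is a minimal sketch. The 2-D vectors are hand-picked, hypothetical values (one axis loosely standing for "royalty", the other for "gender"), not vectors learned by Word2Vec, and the `cosine` helper is my own. It only shows how the nearest word to 'king' - 'man' + 'woman' would be looked up.

```python
import numpy as np

# Hypothetical 2-D embeddings chosen by hand for illustration only.
# Axis 0 ~ "royalty", axis 1 ~ "gender". Real Word2Vec vectors are learned.
vectors = {
    "king":  np.array([1.0,  1.0]),
    "queen": np.array([1.0, -1.0]),
    "man":   np.array([0.0,  1.0]),
    "woman": np.array([0.0, -1.0]),
    "apple": np.array([-1.0, 0.0]),
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Left-hand side of the analogy.
query = vectors["king"] - vectors["man"] + vectors["woman"]

# Rank every word that was not part of the query by similarity to it.
candidates = {w: v for w, v in vectors.items() if w not in {"king", "man", "woman"}}
nearest = max(candidates, key=lambda w: cosine(query, candidates[w]))
print(nearest)
```

With learned embeddings the result is only close to 'queen', not exactly equal to it, which is why a nearest-neighbor search is used.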

2. Sparse Representation

  • Word2Vec takes words (tokens) as one-hot vectors at its input.
  • The one-hot vectors are built from the collection of words, that is, the vocabulary extracted from the corpus.

One-hot vector example figure

  • Representing a vector so that most of its values are 0 is called a sparse representation.
Let's look at an example.
  • With the three words 'cat', 'dog', and 'apple', 'cat' is represented as [1, 0, 0], 'dog' as [0, 1, 0], and 'apple' as [0, 0, 1].
  • The drawback of a sparse representation is that it cannot express any meaningful similarity between word vectors.
  • To address this drawback, words are instead vectorized in a multi-dimensional space; this is called a distributed representation.
    • Vectorizing word meaning in a multi-dimensional space with a distributed representation is called word embedding, and the resulting vectors are called embedding vectors.
  • I will cover word embedding in the next post.
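Going back to the 'cat', 'dog', 'apple' example, the sketch below builds the one-hot vectors and shows why they carry no similarity information: the dot product of any two different one-hot vectors is 0. The helper name `one_hot` is my own choice for illustration.

```python
import numpy as np

vocab = ["cat", "dog", "apple"]
word_to_index = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    """Return a vector of zeros with a single 1 at the word's index."""
    vec = np.zeros(len(vocab), dtype=int)
    vec[word_to_index[word]] = 1
    return vec

cat, dog, apple = (one_hot(w) for w in vocab)
print(cat, dog, apple)

# Any two different one-hot vectors are orthogonal, so 'cat' looks
# exactly as unrelated to 'dog' as it does to 'apple'.
sim_cat_dog = int(cat @ dog)
sim_cat_apple = int(cat @ apple)
print(sim_cat_dog, sim_cat_apple)
```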

3. Distributed Representation

A distributed representation also represents a word (token) as a vector, but the meaning of each word is spread across several dimensions.
  • This is because the dimensions of a distributed representation are meant to capture particular semantic properties.
  • A distributed representation is learned from text based on the distributional hypothesis (words that appear in similar contexts have similar meanings), and it spreads the meaning of a word (token) over several dimensions of the vector.
  • Representing vectors this way reduces their dimensionality.
    • The dimension of such a vector does not have to equal the vocabulary size, as it does for a one-hot vector, because each dimension carries part of the word's meaning.
Let's look at an example.
  • As an idealized illustration, one dimension might represent 'living' vs 'non-living', and another 'positive' vs 'negative'.
  • Because each word is expressed across several dimensions, the representation captures its meaning better.
  • As a result, a distributed representation expresses word (token) meaning more efficiently and in fewer dimensions, capturing semantic relationships while improving computational efficiency.
  • Word2Vec is a representative method for learning it.

4. CBOW(Continuous Bag of Words)

The CBOW model takes the surrounding words (the context) as input and predicts the word in the middle.
  • Let's look at the mechanism.
  • Example sentence: "The fat cat sat on the mat"
  • Suppose our corpus contains these words.
    • CBOW predicts the word 'sat' from ['The', 'fat', 'cat', 'on', 'the', 'mat'].
    • The word to be predicted, 'sat', is called the center word, and the words used for the prediction are called context words.
  • To predict the center word 'sat', we need to decide how many words to look at before and after it; this range is called the window.

CBOW Example

Let's walk through the example sentence "The fat cat sat on the mat".
  • With a window size of 2, predicting the center word 'sat' uses the two preceding words, fat and cat, and the two following words, on and the, as input.
  • If the window size is n, up to 2n context words are used to predict the center word (fewer near the start or end of the sentence).

Visualization of how CBOW (Continuous Bag of Words) makes predictions. Source: https://wikidocs.net/22660

  • Once the window size is fixed, the window is moved sideways over the sentence, changing which words are the context words and the center word, to build the training dataset. This is called the sliding window method.
  • Every input to Word2Vec is a one-hot vector. The table on the right of the figure above shows the one-hot vectors for each choice of center and context words, that is, the full dataset used for CBOW.
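The sliding window can be sketched in a few lines. This is only an illustration under simple assumptions: tokens are split on whitespace, the function name `cbow_pairs` is my own, and words near the sentence boundary simply get fewer than 2n context words.

```python
def cbow_pairs(tokens, window):
    """Build (context words, center word) training pairs with a sliding window."""
    pairs = []
    for i, center in enumerate(tokens):
        left = tokens[max(0, i - window):i]       # up to `window` words before
        right = tokens[i + 1:i + 1 + window]      # up to `window` words after
        pairs.append((left + right, center))
    return pairs

tokens = "The fat cat sat on the mat".split()
pairs = cbow_pairs(tokens, window=2)

for context, center in pairs:
    print(context, "->", center)
```

Each pair is one training sample: the context words are the input and the center word is the label.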

The CBOW Neural Network

Let's look at the CBOW neural network. The figure below visualizes it.

Visualization of the CBOW neural network. Source: https://wikidocs.net/22660

  • The input layer receives the one-hot vectors of the context words that fall within the window before and after the center word.
  • The output layer needs the one-hot vector of the center word to be predicted as the label.
  • The figure also shows that Word2Vec is a shallow neural network with a single hidden layer.
  • Another characteristic of Word2Vec is that its hidden layer has no activation function. It only performs a lookup-table operation, so it is called the projection layer.

How CBOW Works

A closer look at the CBOW neural network. Source: https://wikidocs.net/22660

Let's look at how CBOW works.

  • The first thing to note is that the projection layer has size M.
  • In CBOW, the size M of the projection layer becomes the dimension of the embedding vectors.
  • In the figure above the projection layer has size 5, so each word's embedding vector obtained from CBOW has 5 dimensions.
+ Embedding: a way of converting categorical data into numerical data so that similar objects end up close together in the vector space.
There is one more point to note.
  • The weight matrix W between the input layer and the projection layer is a V × M matrix.
  • The weight matrix W' between the projection layer and the output layer is an M × V matrix.
    • V is the size of the vocabulary.
    • M, as explained above, is the size of the projection layer.
  • In the figure above, the one-hot vectors have 7 dimensions and M (the projection layer size) is 5.
  • So W is a 7 × 5 matrix and W' is a 5 × 7 matrix. The two are separate matrices; W' is not the transpose of W.
  • Before training, W and W' are initialized with random values.
  • CBOW learns W and W' so that the context words predict the center word well.
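The shapes above can be checked with a short sketch. The sizes V = 7 and M = 5 come from the figure. The Gaussian initialization, its 0.1 scale, and the random seed are my own assumptions for illustration.

```python
import numpy as np

V, M = 7, 5                      # vocabulary size and projection size from the figure
rng = np.random.default_rng(0)   # seed chosen arbitrarily

# Random initial weights; the 0.1 scale is an assumption, not from the post.
W = rng.normal(scale=0.1, size=(V, M))        # input layer -> projection layer
W_prime = rng.normal(scale=0.1, size=(M, V))  # projection layer -> output layer

x = np.zeros(V)
x[3] = 1.0                       # one-hot vector for the word at index 3

hidden = x @ W                   # shape (M,)
scores = hidden @ W_prime        # shape (V,)
print(W.shape, W_prime.shape, hidden.shape, scores.shape)
```

A V-dimensional one-hot vector goes in, an M-dimensional vector sits in the projection layer, and a V-dimensional vector comes out.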

Lookup Table

Product of a context word's one-hot input vector and the weight matrix W. Source: https://wikidocs.net/22660

Let's see how the one-hot vector of an input context word is multiplied by the weight matrix W.
  • The input vector, the one-hot vector of a context word, is written as x.
  • Multiplying an input vector that has 1 at index i and 0 everywhere else by the weight matrix W is the same as reading out the i-th row of W. What does that mean?
    • For example, if the one-hot vector is [0, 0, 1, 0] and we multiply it by W, we get just the third row of W.
    • Each element of the vector multiplies the corresponding row of the matrix. Every element of a one-hot vector other than the 1 is 0, so the rows multiplied by 0 vanish and only the row multiplied by 1 remains.
    • So multiplying a one-hot encoded vector by the weight matrix is the same as selecting the row of the matrix at the position of the 1.
  • This operation is called a table lookup. After training with Word2Vec, each row of W is treated as the M-dimensional embedding vector of the corresponding word.
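The [0, 0, 1, 0] example can be verified directly. In this sketch W is a random 4 × 3 matrix; the sizes and values are hypothetical and only serve to compare the matrix product with plain row indexing.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))      # hypothetical sizes: V = 4 words, M = 3

x = np.array([0, 0, 1, 0])       # one-hot vector with the 1 at index 2

by_matmul = x @ W                # full matrix product
by_lookup = W[2]                 # direct access to the third row

print(by_matmul)
print(by_lookup)
print(np.allclose(by_matmul, by_lookup))
```

This is why practical implementations tend to skip the multiplication and index the row directly.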

Source: https://wikidocs.net/22660

  • The vectors obtained by multiplying each context word's one-hot vector by W are averaged in the projection layer to give a single mean vector.
    • For example, with a window size of 2 (n = 2), there are 2n input vectors, so 4 input vectors are used to predict the center word.
    • The mean is then taken over those 4 vectors.
  • This averaging in the projection layer is where CBOW and Skip-Gram differ.
    • Skip-Gram has only one input, the center word, so it does not average vectors in the projection layer.
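Here is a small sketch of the averaging step for the center word 'sat' with n = 2. The weights are random placeholders. Treating 'The' and 'the' as different tokens is an assumption I make so that the vocabulary has 7 entries, as in the figure.

```python
import numpy as np

tokens = "The fat cat sat on the mat".split()
vocab = list(dict.fromkeys(tokens))          # case-sensitive, so V = 7 (assumption)
word_to_index = {w: i for i, w in enumerate(vocab)}

V, M, n = len(vocab), 5, 2
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(V, M))       # placeholder weights, not trained

context = ["fat", "cat", "on", "the"]        # 2n = 4 context words around 'sat'
context_ids = [word_to_index[w] for w in context]

projected = W[context_ids]                   # lookup: one row per context word, shape (2n, M)
v = projected.mean(axis=0)                   # mean vector of the projection layer, shape (M,)
print(projected.shape, v.shape)
```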

Source: https://wikidocs.net/22660

The mean vector is then multiplied by the second weight matrix W'.
  • The product is a vector of dimension V, the same dimension as the one-hot vectors.
  • If the input vectors have 7 dimensions, the output vector also has 7 dimensions.
  • In CBOW this vector is passed through the softmax function, so each element becomes a real number between 0 and 1 and the elements sum to 1. It plays the role of a score vector for multi-class classification.
Softmax function: used to choose one of several options from a probabilistic point of view. It computes a probability for each input value so that the values sum to 1, so the outputs can be interpreted as probabilities.
  • The value between 0 and 1 at index j of the score vector is the probability that the j-th word is the center word.
  • The score vector should get close to the one-hot vector of the center word, which is the label. What does that mean? Let me explain.
    • Call the score vector ŷ and the one-hot vector of the center word y. To reduce the error between these two vectors, CBOW uses the cross-entropy function as its loss function.
Cross-entropy function: a loss function widely used in deep learning that measures the difference between the true distribution and the predicted distribution. The larger the difference between the two distributions, the larger the cross-entropy; the smaller the difference, the smaller the cross-entropy.
  • The one-hot vector of the center word and the score vector are the inputs to the cross-entropy function.
  • This is expressed by the formula below, where V is the size of the vocabulary.

cost(ŷ, y) = -Σ_{j=1}^{V} y_j log(ŷ_j)

Source: https://wikidocs.net/22660
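The output step can be sketched as follows, using the standard cross-entropy form cost(ŷ, y) = -Σ_j y_j log(ŷ_j). The mean vector and W' here are random stand-ins rather than trained values, and index 3 is assumed to be the position of 'sat' in the vocabulary.

```python
import numpy as np

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(z - z.max())      # subtract the max for numerical stability
    return e / e.sum()

def cross_entropy(y_hat, y):
    """Cross-entropy between the prediction y_hat and the one-hot label y."""
    return float(-np.sum(y * np.log(y_hat)))

rng = np.random.default_rng(0)
V, M = 7, 5
v = rng.normal(size=M)               # stand-in for the mean vector of the projection layer
W_prime = rng.normal(size=(M, V))    # stand-in for the second weight matrix

y_hat = softmax(v @ W_prime)         # score vector, shape (V,)

y = np.zeros(V)
y[3] = 1.0                           # one-hot label; index 3 assumed to be 'sat'

loss = cross_entropy(y_hat, y)
print(y_hat.sum(), loss)
```

Because y is one-hot, the sum collapses to a single term, -log(ŷ_j) at the index of the center word.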

  • Backpropagation trains the weight matrices W and W'. Once training is done, either the rows of W, each of dimension M, are used as the word embedding vectors, or the embedding vectors are built from both W and W'.
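To show what one update looks like, here is a minimal sketch of a single gradient-descent step for one training sample (context 'fat', 'cat', 'on', 'the' and center 'sat'). It uses a full softmax, a learning rate of 0.1, and random initial weights. All of these are my assumptions for illustration, not a description of how Word2Vec libraries are actually implemented.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

tokens = "The fat cat sat on the mat".split()
vocab = list(dict.fromkeys(tokens))              # case-sensitive, V = 7 (assumption)
word_to_index = {w: i for i, w in enumerate(vocab)}
V, M, lr = len(vocab), 5, 0.1                    # lr is a hypothetical learning rate

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(V, M))
W_prime = rng.normal(scale=0.1, size=(M, V))

context_ids = [word_to_index[w] for w in ["fat", "cat", "on", "the"]]
center_id = word_to_index["sat"]

def loss_and_grads(W, W_prime):
    v = W[context_ids].mean(axis=0)              # projection layer: mean of looked-up rows
    y_hat = softmax(v @ W_prime)                 # predicted distribution over the vocabulary
    loss = -np.log(y_hat[center_id])             # cross-entropy with a one-hot label
    dz = y_hat.copy()
    dz[center_id] -= 1.0                         # gradient w.r.t. the scores: y_hat - y
    grad_W_prime = np.outer(v, dz)               # shape (M, V)
    grad_v = W_prime @ dz                        # shape (M,)
    grad_W = np.zeros_like(W)
    # Each context row receives an equal share of the gradient of the mean.
    np.add.at(grad_W, context_ids, grad_v / len(context_ids))
    return loss, grad_W, grad_W_prime

loss_before, gW, gWp = loss_and_grads(W, W_prime)
W -= lr * gW
W_prime -= lr * gWp
loss_after, _, _ = loss_and_grads(W, W_prime)
print(loss_before, loss_after)

embedding_sat = W[center_id]                     # after training, a row of W is the embedding
```

Repeating this step over every sample produced by the sliding window is what gradually shapes the rows of W into useful embeddings.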

5. Skip-Gram

Where CBOW predicts the center word from the context words, Skip-Gram predicts the context words from the center word.
  • For example, take the Korean sentence "고양이가 쥐를 쫓는다" ("The cat chases the mouse"), whose tokens in order are "cat", "mouse", "chases". The model takes the middle word "mouse" and tries to predict the surrounding words "cat" and "chases".
  • In this way the Skip-Gram model learns the associations between words, and those associations are expressed as word embeddings.
    • An embedding represents the meaning of a word as a numeric vector, which can be used to compute the similarity between words.
  • With a window size of 2, the dataset is built as follows.

Skip-Gram dataset example. Source: https://wikidocs.net/22660
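A dataset like the one described above can be generated with a short sketch. The function name `skipgram_pairs` is my own, and tokens are split on whitespace. Each sample is a single (center word, context word) pair.

```python
def skipgram_pairs(tokens, window):
    """Build (center word, context word) pairs with a sliding window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:                       # skip the center word itself
                pairs.append((center, tokens[j]))
    return pairs

tokens = "The fat cat sat on the mat".split()
pairs = skipgram_pairs(tokens, window=2)

for center, context in pairs:
    print(center, "->", context)
```

Compared with CBOW, the same sentence yields more but smaller samples, since every context word becomes its own training pair.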

  • The neural network can be visualized as in the figure below.

Skip-Gram neural network. Source: https://wikidocs.net/22660

  • Because the context words are predicted from a single center word, the projection layer does not average any vectors.
  • There is only one input vector to project, and the Skip-Gram model predicts each context word independently.
  • That is, when predicting the context words from the center word, each context word is predicted on its own, so the prediction of one word does not affect the prediction of another.
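The Skip-Gram forward pass can be sketched like this. It assumes a full softmax and random, untrained weights. The projection is a plain row lookup with no mean, and each context word contributes its own separate loss term.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

tokens = "The fat cat sat on the mat".split()
vocab = list(dict.fromkeys(tokens))              # case-sensitive, V = 7 (assumption)
word_to_index = {w: i for i, w in enumerate(vocab)}
V, M = len(vocab), 5

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(V, M))           # placeholder weights, not trained
W_prime = rng.normal(scale=0.1, size=(M, V))

center_id = word_to_index["sat"]
context_ids = [word_to_index[w] for w in ["fat", "cat", "on", "the"]]

v = W[center_id]                                 # projection layer: one row, no averaging
y_hat = softmax(v @ W_prime)                     # predicted distribution over the vocabulary

# One cross-entropy term per context word, each computed on its own.
losses = [float(-np.log(y_hat[c])) for c in context_ids]
total_loss = sum(losses)
print(losses, total_loss)
```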