๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] ํ•ฉ์„ฑ๊ณฑ, ์ˆœํ™˜์‹ ๊ฒฝ๋ง, Encoder, Decoder์—์„œ ์ˆ˜ํ–‰ํ•˜๋Š” Self-Attention

์ „์— ์ผ๋˜ ๋‚ด์šฉ์— ์ด์–ด์„œ ์จ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค. ํ•ฉ์„ฑ๊ณฑ ์‹ ๊ฒฝ๋ง (CNN Model)๊ณผ ๋น„๊ตํ•œ Self-Attention CNN์€ *Convolution filter(ํ•ฉ์„ฑ๊ณฑ ํ•„ํ„ฐ)๋ผ๋Š” ํŠน์ˆ˜ํ•œ ์žฅ์น˜๋ฅผ ์ด์šฉํ•ด์„œ Sequence์˜ ์ง€์—ญ์ ์ธ ํŠน์ง•์„ ์žก์•„๋‚ด๋Š” ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค. ์—ฌ๊ธฐ์„œ Convolution filter(ํ•ฉ์„ฑ๊ณฑ ํ•„ํ„ฐ)๋Š” ํ•ฉ์„ฑ๊ณฑ ์‹ ๊ฒฝ๋ง์„ ๊ตฌ์„ฑํ•˜๋Š” ํ•˜๋‚˜์˜ ์š”์†Œ-ํ•„ํ„ฐ๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ์ „์ฒด์ ์œผ๋กœ ํ›‘์œผ๋ฉด์„œ ์ธ์ ‘ํ•œ ์ •๋ณด๋ฅผ ์ถ”์ถœํ•˜๋Š” ์—ญํ• ์„ ํ•ฉ๋‹ˆ๋‹ค. ์ž์—ฐ์–ด๋Š” ๊ธฐ๋ณธ์ ์œผ๋กœ Sequence(๋‹จ์–ด ํ˜น์€ ํ˜•ํƒœ์†Œ์˜ ๋‚˜์—ด)์ด๊ณ  ํŠน์ • ๋‹จ์–ด ๊ธฐ์ค€ ์ฃผ๋ณ€ ๋ฌธ๋งฅ์ด ์˜๋ฏธ ํ˜•์„ฑ์— ์ค‘์š”ํ•œ ์—ญํ• ์„ ํ•˜๊ณ  ์žˆ์œผ๋ฏ€๋กœ, CNN์ด ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ์— ๋„๋ฆฌ ์“ฐ์ด๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค. ์œ„์˜ ๊ทธ๋ฆผ์€ CNN ๋ฌธ์žฅ์˜ Encoding ๋ฐฉ์‹์ž…๋‹ˆ๋‹ค. Convolution filter(ํ•ฉ์„ฑ๊ณฑ ํ•„ํ„ฐ)๊ฐ€ ..

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] Attention

1. Attention. Attention is considered one of the important concepts in computer science and machine learning. The attention mechanism is mainly used in models that process or generate sequence data -> a kind of machine-learning method for handling sequence inputs. The idea of attention is that at every time step where the decoder predicts an output, it consults the entire input sentence in the encoder once more. However, rather than attending to the whole input sentence in equal proportion, the model attends to (focuses on) the parts of the input that are relevant to the element it must predict at that time step. This is the key method for grasping context, and applying this approach to deep learning (DL) models is what 'Attent..
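A minimal sketch of one decoder time step of this idea, assuming simple dot-product scoring (the encoder states and decoder state below are random placeholders, not values from the post):

```python
import numpy as np

# At one decoder time step: score every encoder state against the decoder
# state, softmax the scores into weights, and build a context vector as the
# weighted sum of the encoder states.
np.random.seed(1)
enc_states = np.random.randn(5, 8)   # 5 input tokens, hidden size 8
dec_state = np.random.randn(8)       # decoder state at this time step

scores = enc_states @ dec_state                  # dot-product scores
weights = np.exp(scores) / np.exp(scores).sum()  # softmax: attention weights
context = weights @ enc_states                   # weighted sum of inputs

print(weights)   # how strongly each input token is attended to
print(context)   # context vector used to predict this output
```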

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] Word Embedding

1. Word Embedding? What is word embedding? It is a method of converting text data into numeric vectors. Put differently, it converts the words in a text into a vector form that a computer can understand. That is, it maps words from a high-dimensional space down to low-dimensional dense vectors. A vector that has gone through word embedding can numerically express a word's meaning, context, similarity, and so on. Broadly speaking, word embedding is done in one of two ways. 2. Word Embedding Methods. As stated above, there are broadly two approaches to word embedding: one is the count-based method, and the other is the prediction-ba..
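As a tiny illustration (the vocabulary, dimension, and random vectors below are hypothetical; a real embedding matrix would be learned, not random), embedding a word is just a row lookup, and similarity becomes geometry:

```python
import numpy as np

# Each word in the vocabulary is one row of an embedding matrix, so
# "embedding" a word is an index lookup into that matrix.
vocab = {"king": 0, "queen": 1, "apple": 2}
emb_dim = 4
np.random.seed(2)
embedding_matrix = np.random.randn(len(vocab), emb_dim)

def embed(word):
    return embedding_matrix[vocab[word]]

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Similarity between words becomes a distance/angle in the vector space.
print(cosine(embed("king"), embed("queen")))
```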

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] Word2Vec, CBOW, Skip-Gram - Concepts & Models

1. What is Word2Vec? Word2Vec is a popular algorithm used to convert words into vectors. Here, a word is usually a 'token'. The algorithm is designed as an unsupervised-learning method that learns to represent the semantic relationships between words (tokens) well in a vector space. It works by predicting each word from its surrounding words (its context), or conversely by looking at each word and predicting the words around it. By analogy, just as a model learns from images, it treats each word as a vector and learns from that. In this way Word2Vec captures the semantic relationships between words. And to train the model on the sentence in the figure above, each word (token..
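A small sketch of how skip-gram training pairs could be generated (the sentence and window size are illustrative; this is not the post's code):

```python
# Skip-gram pairs: each center word is trained to predict the words in its
# context window. CBOW reverses the direction: the context words jointly
# predict the center word.
sentence = "i want to have an apple".split()
window = 2

pairs = []
for i, center in enumerate(sentence):
    for j in range(max(0, i - window), min(len(sentence), i + window + 1)):
        if j != i:
            pairs.append((center, sentence[j]))

print(pairs[:6])  # e.g. ('i', 'want'), ('i', 'to'), ('want', 'i'), ...
```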

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] GRU Model - A Lighter Version of the LSTM Model

1. GRU Model์€ ๋ฌด์—‡์ผ๊นŒ? GRU (Gated Recurrent Unit)๋Š” ์ˆœํ™˜ ์‹ ๊ฒฝ๋ง(RNN)์˜ ํ•œ ์ข…๋ฅ˜๋กœ, ์•ž์—์„œ ์„ค๋ช…ํ•œ LSTM(Long Short-Term Memory)๋ชจ๋ธ์˜ ๋‹จ์ˆœํ™”๋œ ํ˜•ํƒœ๋กœ ๋ณผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. GRU Model์€ LSTM Model๊ณผ ๋น„์Šทํ•œ ๋ฐฉ์‹์œผ๋กœ ์ž‘๋™ํ•˜์ง€๋งŒ, ๋” ๊ฐ„๋‹จํ•œ ๊ตฌ์กฐ๋ฅผ ๊ฐ€์ง€๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค. LSTM Model์˜ ์žฅ์ ์„ ์œ ์ง€ํ•˜๋˜, Gate(๊ฒŒ์ดํŠธ)์˜ ๊ตฌ์กฐ๋ฅผ ๋‹จ์ˆœํ•˜๊ฒŒ ๋งŒ๋“  ๋ชจ๋ธ์ด GRU Model ์ž…๋‹ˆ๋‹ค. ๋˜ํ•œ GRU, LSTM Model์€ ๋‘˜๋‹ค Long-Term Dependency(์žฅ๊ธฐ ์˜์กด์„ฑ) ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•˜์—ฌ ๋งŒ๋“ค์–ด ์กŒ์Šต๋‹ˆ๋‹ค. LSTM Model์„ ์„ค๋ช…ํ•œ ๊ธ€์—์„œ ์„ค๋ช…ํ–ˆ์ง€๋งŒ LSTM Model์€ "Cell State(์…€ ์ƒํƒœ)"์™€ "Hidden state(์ˆจ..

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] LSTM - Long Short-Term Memory Model

1. LSTM Model์€ ๋ฌด์—‡์ผ๊นŒ?LSTM์€ Long Short-Term Memory์˜ ์•ฝ์ž์ž…๋‹ˆ๋‹ค. RNN - Recurrent Neural Network (์ˆœํ™˜ ์‹ ๊ฒฝ๋ง)์˜ ๋ฌธ์ œ์ธ Long-Term Dependency (์žฅ๊ธฐ ์˜์กด์„ฑ) ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•ด ์ œ์•ˆ๋œ ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค.๊ธฐ์กด์˜ RNN(์ˆœํ™˜ ์‹ ๊ฒฝ๋ง)๋ชจ๋ธ์€ ์‹œ๊ฐ„ & ๊ณต๊ฐ„์  ํŒจํ„ด์„ ํ•™์Šตํ•˜๊ณ  ์˜ˆ์ธกํ•˜๋Š”๋ฐ ์œ ์šฉํ•ฉ๋‹ˆ๋‹ค. ๊ทธ๋ž˜์„œ ์ˆœ์ฐจ์ ์ธ ๋ฐ์ดํ„ฐ๋ฅผ ์ฒ˜๋ฆฌํ•˜๋Š”๋ฐ์—๋Š” ๊ฐ•์ ์ด ์žˆ๋Š” ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค.๋‹ค๋งŒ Long-Term Dependency(์žฅ๊ธฐ ์˜์กด์„ฑ) ๋ฌธ์ œ๊ฐ€ ์žˆ์–ด์„œ ๊ธด Sequence์˜ ๋ฐ์ดํ„ฐ๋ฅผ ์ฒ˜๋ฆฌํ•˜๋Š”๋ฐ ์–ด๋ ค์›€์ด ์žˆ์Šต๋‹ˆ๋‹ค.Long-Term Dependency(์žฅ๊ธฐ ์˜์กด์„ฑ)์— ๋Œ€ํ•œ ์„ค๋ช…์€ ์•„๋ž˜์˜ ๊ธ€์— ์ ํ˜€์žˆ์œผ๋‹ˆ๊นŒ ์ฐธ๊ณ ํ•ด์ฃผ์„ธ์š”. [NLP] Vanilla RNN Model, Lo..

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] Vanilla RNN Model and the Long-Term Dependency Problem

1. ๊ธฐ๋ณธ RNN ๋ชจ๋ธ (Vanilla RNN Model)์˜ ํ•œ๊ณ„RNN๋ถ€๋ถ„์„ ์„ค๋ช…ํ•œ ๊ธ€์—์„œ ๊ธฐ๋ณธ RNN Model์„ ์•Œ์•„๋ณด๊ณ  ๊ตฌํ˜„ํ•ด ๋ณด์•˜์Šต๋‹ˆ๋‹ค.๋ณดํ†ต RNN Model์„ ๊ฐ€์žฅ ๋‹จ์ˆœํ•œ ํ˜•ํƒœ์˜ RNN ์ด๋ผ๊ณ  ํ•˜๋ฉฐ ๋ฐ”๋‹๋ผ RNN (Vanilla RNN)์ด๋ผ๊ณ  ํ•ฉ๋‹ˆ๋‹ค.๊ทผ๋ฐ, Vanilla RNN ๋ชจ๋ธ์— ๋‹จ์ ์œผ๋กœ ์ธํ•˜์—ฌ, ๊ทธ ๋‹จ์ ๋“ค์„ ๊ทน๋ณตํ•˜๊ธฐ ์œ„ํ•œ ๋‹ค์–‘ํ•œ RNN ๋ณ€ํ˜• Model์ด ๋‚˜์™”์Šต๋‹ˆ๋‹ค.๋Œ€ํ‘œ์ ์œผ๋กœ LSTM, GRU ๋ชจ๋ธ์ด ์žˆ๋Š”๋ฐ, ์ผ๋‹จ ์ด๋ฒˆ๊ธ€์—์„œ๋Š” LSTM Model์— ๋Œ€ํ•œ ์„ค๋ช…์„ ํ•˜๊ณ , ๋‹ค์Œ ๊ธ€์—์„œ๋Š” GRU Model์— ๋Œ€ํ•˜์—ฌ ์„ค๋ช…์„ ํ•˜๋„๋ก ํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค.Vanilla RNN์€ ์ด์ „์˜ ๊ณ„์‚ฐ ๊ฒฐ๊ณผ์— ์˜์กดํ•˜์—ฌ ์ถœ๋ ฅ ๊ฒฐ๊ณผ๋ฅผ ๋งŒ๋“ค์–ด ๋ƒ…๋‹ˆ๋‹ค.์ด๋Ÿฌํ•œ ๋ฐฉ์‹์€ Vanilla RNN์€ ์งง์€ Sequence์—๋Š” ํšจ๊ณผ๊ฐ€ ์žˆ์ง€๋งŒ, ๊ธด..

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] RNN (Recurrent Neural Network)

1. RNN ์ด๋ž€?RNN์€ Sequence data๋ฅผ ์ฒ˜๋ฆฌ ํ•˜๊ธฐ ์œ„ํ•œ ์‹ ๊ฒฝ๋ง ๊ตฌ์กฐ ์ž…๋‹ˆ๋‹ค.์ฃผ๋กœ ์ž์—ฐ์–ด์ฒ˜๋ฆฌ(NLP)๋ฅผ ํฌํ•จํ•œ ์—ฌ๋Ÿฌ Sequence Modeling ์ž‘์—…์—์„œ ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค.ํŠน์ง•์œผ๋กœ๋Š” ์‹œ๊ฐ„์ , ๊ณต๊ฐ„์  ์ˆœ์„œ ๊ด€๊ณ„์— ์˜ํ•˜์—ฌ Context๋ฅผ ๊ฐ€์ง€๋Š” ํŠน์„ฑ์ด ์žˆ์Šต๋‹ˆ๋‹ค.๐Ÿ’ก exampleI want to have an apple์ด 'apple'์— ํ•œ๋ฒˆ ์ฃผ๋ชฉํ•ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.์ด apple์ด๋ผ๋Š” ๋‹จ์–ด๋Š” ๋ฌธ๋งฅ์ด ํ˜•์„ฑํ•˜๋Š” ์ฃผ๋ณ€์˜ ๋‹จ์–ด๋“ค์„ ํ•จ๊ป˜ ์‚ดํŽด๋ด์•ผ ํŒ๋‹จํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.2. RNN์— ๋Œ€ํ•˜์—ฌRNN์˜ ํŠน์ง•์€ ์–ด๋–ค๊ฒƒ์ด ์žˆ์„๊นŒ์š”?RNN์€ ์€๋‹‰์ธต(hidden layer)์˜ node์—์„œ ํ™œ์„ฑํ™” ํ•จ์ˆ˜(activation function)์„ ํ†ตํ•ด ๋‚˜์˜จ ๊ฒฐ๊ณผ๊ฐ’์„ ์ถœ๋ ฅ์ธต ๋ฐฉํ–ฅ์œผ๋กœ ๋ณด๋‚ด๋ฉด์„œ, hidden layer node์˜ ๋‹ค์Œ ๊ณ„์‚ฐ..

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] Seq2Seq, Encoder & Decoder

1. sequence-to-sequence 💡 The Transformer model is a model for performing sequence-to-sequence tasks such as machine translation. sequence: an ordered arrangement of things such as words. So sequence-to-sequence is the task of transforming a sequence with certain properties into a sequence with different properties. sequence-to-sequence uses the many-to-many RNN setup; RNNs will be explained in a later post. 💡 example. Machine translation: turning the word sequence of one language (the source language) into the word sequence of another language (the target la..
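A schematic sketch of the encoder-decoder split (heavily simplified: random placeholder weights, a fixed number of decoder steps, and no projection onto an output vocabulary):

```python
import numpy as np

# Encoder: compress the source sequence into one context vector h.
# Decoder: unroll from that vector to produce the target sequence states.
np.random.seed(6)
d = 3
W_in, W_h, W_dec = (np.random.randn(d, d) for _ in range(3))

source = np.random.randn(5, d)            # source-language token vectors
h = np.zeros(d)
for x in source:                          # --- encoder ---
    h = np.tanh(W_in @ x + W_h @ h)       # fold each token into the state

outputs = []
for _ in range(4):                        # --- decoder, fixed 4 steps ---
    h = np.tanh(W_dec @ h)                # unroll from the context vector
    outputs.append(h)                     # would be mapped to target words
print(np.array(outputs))
```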

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] Pre-Trained Language Model

Pre-Trained Language Model 💡 Language Model → a model that assigns probability to a word sequence (a model that takes a word sequence as input and outputs the probability of how plausible that sequence is). If we write the i-th word appearing in a sentence as $w_i$, the probability of that word appearing under the language model is (Equation 1). ex) What is the probability that the word '운전' (driving) appears after the word '난폭' (reckless)? → this is called a conditional probability. When writing a conditional probability, the resulting event (driving) comes first and the conditioning event (reckless) comes after. The conditioning event makes up part of the numerator on the right-hand side and also its denominator = this encodes the idea that the resulting event (driving) varies under the influence of the conditioning event (reck..
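A toy illustration of estimating such a conditional probability from counts (the four-sentence corpus is made up; this is just the bigram special case, not the post's Equation 1):

```python
from collections import Counter

# Estimate P(next | prev) from bigram counts, e.g. how likely '운전'
# (driving) is right after '난폭' (reckless) in a tiny invented corpus.
corpus = [["난폭", "운전"], ["난폭", "운전"], ["난폭", "했다"], ["안전", "운전"]]

bigrams = Counter((s[i], s[i + 1]) for s in corpus for i in range(len(s) - 1))
unigrams = Counter(w for s in corpus for w in s[:-1])

def cond_prob(nxt, prev):
    # count(prev, nxt) / count(prev): the conditioning event appears in both
    # the numerator and the denominator, as the excerpt describes.
    return bigrams[(prev, nxt)] / unigrams[prev]

print(cond_prob("운전", "난폭"))  # P(운전 | 난폭) = 2/3
```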
