๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

This time, I'll write up what I've studied about the BART model. What is BART? BART (Bidirectional and Auto-Regressive Transformers) is a sequence-to-sequence model introduced by Facebook AI (now Meta AI) in 2019. BART combines the strengths of BERT and GPT: like BERT's bidirectional encoder, it feeds every token of the language sequence into the attention mechanism to encode the text, and like GPT's generative decoder, it is a generative model that produces new output from the input seen so far. To summarize, the basic Sequence-to-Se..
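To make that encoder-decoder pairing concrete, here is a minimal sketch, assuming the Hugging Face `transformers` library and the public `facebook/bart-large-cnn` checkpoint (neither is named in the post itself):

```python
# Minimal sketch (assumption: Hugging Face transformers is installed).
# BART's bidirectional encoder reads the whole input at once; its
# autoregressive decoder then generates the output token by token.
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

text = "BART combines a bidirectional encoder with an autoregressive decoder."
inputs = tokenizer(text, return_tensors="pt")             # encoded bidirectionally
ids = model.generate(inputs["input_ids"], max_length=20)  # generated left to right
print(tokenizer.decode(ids[0], skip_special_tokens=True))
```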

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] BERT (Bidirectional Encoder Representations from Transformers)

I'm in an LLM study group, and it was time to study the BERT model, so I'll take the opportunity to write the material up as well. To understand the BERT model, you first need some understanding of the Transformer model. I'll leave a reference post below, so please read it before this one!! [NLP] Transformer Model - A Look at the Transformer Model: In this post, we'll look at the overall architecture and structure of the Transformer model. Transformer: Attention is All You Need. The Transformer model was introduced in 2017 through the paper "Attention is All You Need" daehyun-bigbread.tistory.c..

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] Generative Adversarial Networks (GAN)

์ƒ์„ฑ์  ์ ๋Œ€ ์‹ ๊ฒฝ๋ง (GAN)?์ƒ์„ฑ์  ์ ๋Œ€ ์‹ ๊ฒฝ๋ง(GAN)์€ ๋‘ ๊ฐœ์˜ ์‹ ๊ฒฝ๋ง์ธ ์ƒ์„ฑ์ž(Generator)์™€ ํŒ๋ณ„์ž(Discriminator)๊ฐ€ ๊ฒฝ์Ÿ์ ์œผ๋กœ ํ•™์Šตํ•˜๋ฉด์„œ, ํ˜„์‹ค๊ณผ ์œ ์‚ฌํ•œ ๋ฐ์ดํ„ฐ๋ฅผ ์ƒ์„ฑํ•˜๋Š” ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค. GAN์€ ํŠนํžˆ ์ด๋ฏธ์ง€ ์ƒ์„ฑ, ๋ฐ์ดํ„ฐ ์ฆ๊ฐ•, ๋น„๋””์˜ค ์ƒ์„ฑ ๋“ฑ ๋‹ค์–‘ํ•œ ์‘์šฉ ๋ถ„์•ผ์—์„œ ๋งค์šฐ ์œ ์šฉํ•˜๊ฒŒ ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค.์•„๋ž˜์—์„œ GAN์˜ ๊ตฌ์„ฑ ์š”์†Œ, ํ•™์Šต ์ ˆ์ฐจ, ๋ณ€ํ˜• ๋ชจ๋ธ, ์‘์šฉ ๋ถ„์•ผ, ์žฅ์ ๊ณผ ๋‹จ์ ์— ๋Œ€ํ•ด ์ž์„ธํžˆ ์„ค๋ช…ํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค. GAN์˜ ๊ตฌ์„ฑ ์š”์†Œ์ƒ์„ฑ์ž(Generator):์—ญํ• : ์ƒ์„ฑ์ž๋Š” ์ž„์˜์˜ ๋…ธ์ด์ฆˆ ๋ฒกํ„ฐ๋ฅผ ๋ฐ›์•„๋“ค์—ฌ ํ˜„์‹ค๊ฐ ์žˆ๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ์ƒ์„ฑํ•˜๋Š” ์—ญํ• ์„ ํ•ฉ๋‹ˆ๋‹ค. ์ด ๊ณผ์ •์—์„œ ์ƒ์„ฑ์ž๋Š” ์ž…๋ ฅ๋œ ๋…ธ์ด์ฆˆ ๋ฒกํ„ฐ๋ฅผ ์ ์ฐจ ๊ณ ์ฐจ์› ๋ฐ์ดํ„ฐ, ์˜ˆ๋ฅผ ๋“ค์–ด ์ด๋ฏธ์ง€๋ฅผ ์ƒ์„ฑํ•˜๋Š” ๋ฐ ํ•„์š”ํ•œ ๊ณ ์ฐจ์› ๋ฒกํ„ฐ๋กœ ๋ณ€ํ™˜ํ•ฉ๋‹ˆ๋‹ค.๋ชฉํ‘œ: ์ƒ์„ฑ์ž์˜ ์ฃผ..

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] AutoEncoder

Coming back to NLP after a while, I noticed there was nothing here about the AutoEncoder, so let me explain it. What is an Autoencoder? An AutoEncoder is an unsupervised learning model based on artificial neural networks that focuses on learning an efficient representation of the given data. It is used in a variety of applications, such as compressing data and reducing its dimensionality, removing noise, and anomaly detection. How an Autoencoder works: an autoencoder learns by compressing the input data into a low-dimensional representation called the latent space and then reconstructing the original data from it. This process consists of two main components. Encoder: maps the input data to a low-dimension..
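Here is a minimal PyTorch sketch of that compress-then-reconstruct loop; the dimensions are illustrative assumptions:

```python
# Minimal autoencoder sketch: the encoder compresses the input into a
# latent space, and the decoder reconstructs the input from that code.
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU(),
                                     nn.Linear(128, latent_dim))   # latent code
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                     nn.Linear(128, input_dim))    # reconstruction

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z)

model = AutoEncoder()
x = torch.rand(8, 784)                    # a batch of inputs
x_hat = model(x)
loss = nn.functional.mse_loss(x_hat, x)   # reconstruction error drives learning
print(loss.item())
```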

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] RNNLM - A Language Model Using an RNN

RNNLM (a language model using an RNN). This time, let's implement a language model using an RNN. Before that, let's first look at the neural network it uses. On the left is the RNNLM's layer composition, and on the right is the same neural network unrolled along the time axis. The Embedding layer in the figure converts word IDs into distributed representations (word vectors). Those distributed representations are then fed into the RNN layer. The RNN layer outputs its hidden state to the next layer above and, at the same time, to the RNN layer at the next time step (to the right). The hidden state that the RNN layer outputs upward (..
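A minimal PyTorch sketch of that layer stack (Embedding, then RNN, then a layer producing vocabulary scores); vocabulary and layer sizes are illustrative assumptions:

```python
# Minimal RNNLM sketch: word IDs -> Embedding -> RNN -> next-word scores.
import torch
import torch.nn as nn

class RNNLM(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=100, hidden_dim=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)  # ID -> distributed representation
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)       # hidden state -> vocabulary scores

    def forward(self, word_ids, h0=None):
        x = self.embed(word_ids)   # (batch, time, embed_dim)
        h, hn = self.rnn(x, h0)    # hidden state flows upward and to the next time step
        return self.fc(h), hn

model = RNNLM()
word_ids = torch.randint(0, 1000, (4, 10))  # batch of 4 sequences, 10 time steps
scores, _ = model(word_ids)
print(scores.shape)                          # torch.Size([4, 10, 1000])
```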

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] BPTT (Backpropagation Through Time)

BPTT (Backpropagation Through Time). BPTT (Backpropagation Through Time) is an extended version of the backpropagation algorithm used to train Recurrent Neural Networks (RNNs). What does backpropagation mean here? It means 'the error backpropagation method applied to a neural network unrolled along the time axis', hence the name BPTT (Backpropagation Through Time). Using BPTT, an RNN can be trained. I've covered the concepts behind RNNs in the post below, so please refer to it. [DL] RNN (Recurrent Neural Network) 1. What is an RNN? An RNN is a Sequ..
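A minimal sketch of BPTT as PyTorch autograd performs it, assuming illustrative sizes and random data: running the RNN over a whole sequence and calling backward() sends gradients through every unrolled time step.

```python
# BPTT sketch: the loss over all time steps is backpropagated through
# the time-unrolled network in a single backward() call.
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)
head = nn.Linear(16, 1)

x = torch.randn(2, 5, 8)        # batch of 2 sequences, 5 time steps
target = torch.randn(2, 5, 1)

out, _ = rnn(x)                 # forward pass unrolled over the 5 time steps
loss = nn.functional.mse_loss(head(out), target)
loss.backward()                 # backpropagation *through time*: gradients flow
                                # back across every time step of the unrolled net
print(rnn.weight_hh_l0.grad.shape)  # the recurrent weights received gradients
```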

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] Inference-Based Methods & Neural Networks

In this post, let's take a look at inference-based methods and neural networks. Problems with statistics-based methods: methods for representing words as vectors can these days be divided broadly into two families, 'statistics-based methods' and 'inference-based methods'. The two methods obtain word meanings in different ways, but behind both lies the distributional hypothesis. Statistics-based methods represent a word based on the frequencies of the words around it. Concretely, you build a word co-occurrence matrix and apply Singular Value Decomposition (SVD) to that matrix to obtain dense vectors. However, this approach runs into problems when handling a large corpus. First of all, statistics-based me..
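To make the statistics-based pipeline concrete, here is a minimal NumPy sketch on a toy corpus (the corpus and window size are illustrative assumptions); since full SVD on an n x n matrix costs O(n^3), you can see why this becomes a problem as the vocabulary grows:

```python
# Statistics-based pipeline sketch: co-occurrence counts, then SVD.
import numpy as np

corpus = "you say goodbye and i say hello".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

# Co-occurrence counts within a window of 1 word on each side.
C = np.zeros((len(vocab), len(vocab)))
for i, w in enumerate(corpus):
    for j in (i - 1, i + 1):
        if 0 <= j < len(corpus):
            C[idx[w], idx[corpus[j]]] += 1

# SVD compresses the sparse count vectors into dense word vectors.
U, S, Vt = np.linalg.svd(C)
dense = U[:, :2]              # keep the 2 strongest singular directions
print(dense[idx["say"]])      # a 2-dimensional dense vector for "say"
```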

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] Improving Statistics-Based Methods

์•ž์— ๊ธ€, Thesaurus(์‹œ์†Œ๋Ÿฌ์Šค), Co-occurence Matrix(๋™์‹œ๋ฐœ์ƒ ํ–‰๋ ฌ)๋ถ€๋ถ„์—์„œ ํ†ต๊ณ„ ๊ธฐ๋ฐ˜ ๊ธฐ๋ฒ•์— ๋ฐํ•˜์—ฌ ์„ค๋ช…ํ–ˆ์Šต๋‹ˆ๋‹ค.Thesaurus(์‹œ์†Œ๋Ÿฌ์Šค), Co-occurence Matrix(๋™์‹œ๋ฐœ์ƒ ํ–‰๋ ฌ) ๊ธ€์ž…๋‹ˆ๋‹ค. ์ง€๊ธˆ ๋‚ด์šฉ๊ณผ ์—ฐ๊ฒฐ๋˜๋Š” ๊ธ€์ด๋‹ˆ๊นŒ ํ•œ๋ฒˆ ์ฝ์–ด๋ณด์„ธ์š”. [NLP] Thesaurus(์‹œ์†Œ๋Ÿฌ์Šค), Co-occurence Matrix(๋™์‹œ๋ฐœ์ƒ ํ–‰๋ ฌ)์˜ค๋žœ๋งŒ์— NLP ๊ด€๋ จ ๊ธ€์„ ์“ฐ๋„ค์š”.. ์‹œ๊ฐ„ ๋‚˜๋Š”๋Œ€๋กœ ์—ด์‹ฌํžˆ ์“ฐ๊ณ  ์˜ฌ๋ ค ๋ณด๋„๋ก ํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค. Thesaursus - ์‹œ์†Œ๋Ÿฌ์Šค์‹œ์†Œ๋Ÿฌ์Šค(Thesaurus)๋Š” ๋‹จ์–ด์™€ ๊ทธ ์˜๋ฏธ๋ฅผ ์—ฐ๊ฒฐ์‹œ์ผœ์ฃผ๋Š” ๋„๊ตฌ์ž…๋‹ˆ๋‹ค.์ฃผ๋กœ ํŠน์ • ๋‹จ์–ด์™€ ์˜๋ฏธdaehyun-bigbread.tistory.com Pointwise Mutual Information (PMI) - ์ ๋ณ„ ์ƒํ˜ธ์ •..

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] Thesaurus, Co-occurrence Matrix

It's been a while since I wrote an NLP post.. I'll keep writing and posting as time allows. Thesaurus: a thesaurus is a tool that links words with their meanings. It mainly provides words semantically similar to a given word (synonyms) and words with the opposite meaning (antonyms), helping you use varied expressions when writing or speaking. Put differently, it is a synonym dictionary in which 'words with the same meaning (synonyms)' and 'words with similar meanings (near-synonyms)' are grouped together. Thesauruses used in NLP sometimes also define finer-grained relations between words, such as 'hypernym/hyponym' or 'whole/part'. For example, relations are defined as in the graph below. Building such a synonym set for every wor..
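As one concrete example of such an NLP thesaurus, here is a minimal sketch using WordNet through NLTK; WordNet is the classic thesaurus with exactly these synonym, hypernym/hyponym, and whole/part relations, though the post itself does not name it:

```python
# Thesaurus lookups with WordNet via NLTK (assumes nltk is installed).
import nltk
nltk.download("wordnet", quiet=True)   # fetch the WordNet data once
from nltk.corpus import wordnet as wn

car = wn.synsets("car")[0]             # the first synonym set (synset) for "car"
print(car.lemma_names())               # words grouped together as synonyms
print(car.hypernyms())                 # 'is-a' relation: car -> motor_vehicle
print(car.part_meronyms()[:3])         # 'whole/part' relation: parts of a car
```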

๐Ÿ“ NLP (์ž์—ฐ์–ด์ฒ˜๋ฆฌ)/๐Ÿ“• Natural Language Processing

[NLP] Transformer Model - A Look at the Transformer Model

In this post, we'll look at the overall architecture and structure of the Transformer model. Transformer: Attention is All You Need. The Transformer model was introduced in 2017 through the paper "Attention is All You Need". Its key idea is that, based on a mechanism called "Self-Attention", it can capture the relationships between all the words in a sentence at once. Its distinguishing feature is that it overcame the sequential-processing limits of sequential models such as the RNN (Recurrent Neural Network) and LSTM (Long Short-Term Memory) described earlier. And today, the Transformer model..
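Here is a minimal NumPy sketch of scaled dot-product self-attention, the mechanism named above, showing how every word attends to every other word in one step; the dimensions and random weights are illustrative assumptions:

```python
# Scaled dot-product self-attention sketch.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # Project the word vectors into queries, keys, and values.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])        # pairwise relations, all words at once
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = e / e.sum(axis=-1, keepdims=True)    # softmax over each row
    return weights @ V                             # each position mixes every position

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))                       # 5 "words", 16-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(16, 16)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)         # (5, 16)
```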
