🗨️ Linguistic_Engineering

[Words] Word Tokenization - Morphemes

Word Tokenization - Morphemes. Word-based tokenization: requires a large dictionary of the meanings of the words people use. A word that is not in the dictionary cannot be processed, so the dictionary would have to be enormous to avoid this, and unseen or rare words are still handled poorly. The solution → subword tokenization. Subword tokenization: the vocabulary is usually the set of words that occur frequently in the corpus, so low-frequency words may be missing from it. Subword tokenization splits text more finely than words; the units are neither words nor characters but fall somewhere in between. It grew out of the desire to split low-frequency items as far as possible, which covers unseen and uncommon words. Conventional NLP works with a fixed vocabulary → every token outside it is reduced to UNK (unknown) ..
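
The contrast in this excerpt between a fixed word vocabulary and subword splitting can be sketched roughly as follows. This is a minimal illustration, not the method from the post: the greedy longest-match rule (in the style of WordPiece) is an assumed choice, and both vocabularies, the `[UNK]` spelling, and the function names are made up for the example.

```python
UNK = "[UNK]"  # assumed spelling of the unknown token


def word_tokenize(words, vocab):
    """Word-level lookup: anything outside the fixed vocabulary becomes UNK."""
    return [w if w in vocab else UNK for w in words]


def subword_tokenize(word, subwords):
    """Greedy longest-match split; returns [UNK] if some position has no known piece."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate piece until it is in the subword vocabulary.
        while end > start and word[start:end] not in subwords:
            end -= 1
        if end == start:  # no known piece starts at this position
            return [UNK]
        pieces.append(word[start:end])
        start = end
    return pieces


# Hypothetical toy vocabularies.
word_vocab = {"the", "token", "low"}
subword_vocab = {"token", "low", "er", "est", "iz", "ation", "s"}

print(word_tokenize(["the", "tokenization"], word_vocab))
print(subword_tokenize("tokenization", subword_vocab))
print(subword_tokenize("lowest", subword_vocab))
```

With the word-level vocabulary the rare word collapses to the unknown token, while the subword vocabulary can still cover it with smaller pieces.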

[Words] Korean Morphemes & Other Morphological Processes

Korean morphemes. Korean is an agglutinative language: grammatical function is determined by affixes attached to a root. Root (어근): the central part of a word that carries its substantive meaning when the word is analyzed, ex) 어른스럽다 -> 어른. Affix (접사): attaches to a root to form a new word; prefixes (접두사) and suffixes (접미사), ex) 맨손 (prefix 맨-), 선생님 (suffix -님). Particle (조사): a word that expresses grammatical or relational meaning, ex) 철수가 밥을 (particles 가, 을). Ending (어미): the part that changes under conjugation; pre-final endings (선어말 어미) and final endings (어말 어미), ex) 먹는 다, 분석하겠 습니다. ex) 어머니가 책을 읽으셨겠네요 → 어머니 가 책 을 읽 으시 었 겠 네요. Sentence to analyze: 몇 개의 문장을 통해 형태소 분석을 해 보겠습니다. 몇[determiner] / 개[noun] / 의[particle] / 문장[noun] / 을[particle] / 통[root] / 해[하[suffix] / 여[final ending]] / 겠[pre-final ending] / 습니다[final ending] 형태[noun] / 소[noun] / 분석[noun] / 을[particle..
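
One way to hold the result of such an analysis in code is a list of (morpheme, category) pairs. The sketch below writes out the split of 어머니가 책을 읽으셨겠네요 from the excerpt by hand; the category labels and the helper name `group_by_category` are assumptions added for illustration, and no real morphological analyzer is involved.

```python
from collections import defaultdict

# Hand-written analysis of 어머니가 책을 읽으셨겠네요, following the split in the excerpt.
# The category labels are illustrative assumptions, not analyzer output.
ANALYSIS = [
    ("어머니", "noun"),
    ("가", "particle"),
    ("책", "noun"),
    ("을", "particle"),
    ("읽", "verb root"),
    ("으시", "pre-final ending"),
    ("었", "pre-final ending"),
    ("겠", "pre-final ending"),
    ("네요", "final ending"),
]


def group_by_category(analysis):
    """Collect morphemes under their category, keeping sentence order."""
    groups = defaultdict(list)
    for morpheme, category in analysis:
        groups[category].append(morpheme)
    return dict(groups)


groups = group_by_category(ANALYSIS)
print(groups["particle"])
print(groups["pre-final ending"])
```

Grouping by category makes the agglutinative pattern visible: one verb root followed by a chain of endings, with particles attached to the nouns.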

[Semantics & Pragmatics] Thematic Roles

Thematic Roles. Thematic [Ɵ] roles express the relation between a verb's arguments and the situation the verb describes. Agent: the ‘doer’ of the action. Theme: the ‘undergoer’ of the action. Goal: the endpoint of a change in location or possession. Source: where the action originates. Instrument: the means used to accomplish an action, something like a key. Experience..

[Semantics & Pragmatics] Lexical Semantics

Lexical Semantics: Reference & Sense. Referent: the actual thing designated by a word; that is, the word has some entity it points to. 💡 Example: Jack, the happy swimmer, my friend, and that guy can all have the same referent in the sentence Jack swims. -> Jack = the happy swimmer = my friend = that guy. It looks simple, but working out which expressions point to the same thing is not easy; they can only be found by grasping the meaning. Even when the referent is the same, expressions whose sense differs cannot be called the same. 💡 Example: Superman, born Kal-El and legally n..

[Semantics & Pragmatics] The meaning of language - Semantics, Pragmatics

Semantics (& Pragmatics) - The meaning of language. When Compositionality Goes Awry: Anomaly. Sentential Semantics: what speakers know about sentence meaning. 💡 Example. Truth; Entailment and Related Notions; Ambiguity. Compositional Semantics. When Compositionality Goes Awry. 💡 Example. Anomaly; Metaphor; Idioms. Lexical Semantics (Word Meanings). 💡 Example. Theories of Wor..

[Syntax] Syntactic analysis in NLP

Syntactic analysis in NLP. Parsing: repetition of PP & NP.. Dependency Parsing makes up for the shortcomings of Constituency Parsing. Constituency Parsing Structure Tree. Dependency Parsing Structure Tree. Dependency Grammar: a head can itself be a dependent, and the roles can also be reversed. It is based on dependency: a Dependency Structure is determined by the relation between a word (head) and its dependents. Only items that are semantically related are connected. Because items can be grouped whenever they are semantically connected, the formalism is relatively flexible, which makes it very well suited to analyzing languages with free word order. P..
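
A dependency structure of the kind described here is often stored as one head index per word. The sketch below is only an illustration under stated assumptions: the sentence, the hand-picked arcs, the `-1` root marker, and the helper names are not from the post.

```python
# Each word points at the index of its head; -1 marks the root.
# The arcs are chosen by hand for illustration.
words = ["The", "child", "found", "a", "puppy"]
heads = [1, 2, -1, 4, 2]


def root(heads):
    """Return the index of the single root, or raise if there is not exactly one."""
    roots = [i for i, h in enumerate(heads) if h == -1]
    if len(roots) != 1:
        raise ValueError("a dependency tree needs exactly one root")
    return roots[0]


def dependents(head_index, heads):
    """Return the indices of the words whose head is head_index."""
    return [i for i, h in enumerate(heads) if h == head_index]


r = root(heads)
print(words[r])
print([words[i] for i in dependents(r, heads)])
print([words[i] for i in dependents(1, heads)])
```

Note that "child" is a dependent of "found" and at the same time the head of "The", which is the sense in which a head can itself be a dependent.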

[Syntax] Sentence Structure

Sentence Structure. One could say that the sentence “The child found a puppy” is based on the template Det-N-V-Det-N, which would mean that a sentence is nothing more than a string of words with no internal structure. But a sentence is not a flat, single-layer structure; it has a hierarchical structure. The sentence can in fact be divided into several groups, and its meaning changes depending on how the words combine. 💡 example: [the child] [found a puppy]; [the child] [found [a puppy]]; [[the] [child]] [[found] [[a] [puppy]]]. Tree diagrams are used to show the hierarchical structure of a sentence. Syntactic Ca..
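
The bracket notation in this excerpt maps directly onto nested lists, which is one simple way to represent a tree. The following is a minimal sketch; the function names `parse_brackets` and `depth` are invented for the example and the parser handles only square brackets and whitespace-separated words.

```python
def parse_brackets(text):
    """Turn a bracketed string into nested lists of words."""
    stack = [[]]   # stack[-1] is the group currently being filled
    token = ""

    def flush():
        # Push the word collected so far (if any) into the current group.
        nonlocal token
        if token:
            stack[-1].append(token)
            token = ""

    for ch in text:
        if ch == "[":
            flush()
            stack.append([])
        elif ch == "]":
            flush()
            if len(stack) == 1:
                raise ValueError("unbalanced ']'")
            group = stack.pop()
            stack[-1].append(group)
        elif ch.isspace():
            flush()
        else:
            token += ch
    flush()
    if len(stack) != 1:
        raise ValueError("unbalanced '['")
    return stack[0]


def depth(node):
    """Nesting depth of a parsed structure; a bare word has depth 0."""
    if isinstance(node, str):
        return 0
    return 1 + max((depth(child) for child in node), default=0)


flat = parse_brackets("[the child] [found a puppy]")
deep = parse_brackets("[[the] [child]] [[found] [[a] [puppy]]]")
print(flat)
print(deep)
print(depth(flat), depth(deep))
```

The same five words yield a shallow structure in the first bracketing and a deeper one in the last, which is the hierarchy a tree diagram would draw.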

[Syntax] Syntax Intro

Syntax: the study of sentence patterns (grammar). Speakers of any human language can produce and understand an unlimited number of possible sentences. But we cannot hold a mental dictionary of every possible sentence. Instead, we have rules for forming sentences stored in our brains. What Grammaticality Is Not Based On: grammaticality is not based on meaning or truth. 💡 example: Enormous crickets in pink socks danced at the prom. Crickets dancing → makes no sense. The grammar is correct, yet the sentence makes no sense. → Nevertheless Di..

[Words] Words

The Words of Language. Words are an important part of linguistic knowledge and one of the components of grammar. Every word we know has an entry in our mental dictionary that records its Pronunciation, Meaning, Orthography (spelling), and Grammatical Category. Morphology: words are viewed in terms of smaller units called morphemes, which are built from finite data according to a finite set of rules. example) 나는 학교에 간다 ("I go to school"), 하늘을 나는 새 ("a bird flying in the sky"). Both contain the same surface form 나는, but under actual Morphological Parsing only the reading 나 → pronoun + particle comes to mind. Normalization: splitting sentences relies on punctuation, yet text has to be split even when the punctuation is missing ..

[Intro] Introduction to Language Engineering

1. Introduction to Language Engineering: Introduction. 1) The importance of context. Korean has its own characteristics, and English has its own. Translating Korean into English is not done by mapping words one-to-one and translating each word on its own. The importance of context: just as a truth value (T, F) can change when the situation changes, it is hard to call any proposition simply true; such propositions usually have to be read in a context like ‘what everyone in this era commonly thinks’. Points to consider: what humans find natural and obvious when understanding language is hard for a computer. Human rules that are hard to apply in deep learning can sometimes, by using deep learning, be exploited more easily instead. How do we understand language? (How this differs from computers.) The direction in which computers understand human language..

Bigbread1129
Post list for the '🗨️ Linguistic_Engineering' category