[Paper Review] GPT Understands, Too
·
Artificial_Intelligence🤖/Natural Language Processing
Liu, Xiao, et al. "GPT understands, too." arXiv preprint arXiv:2103.10385 (2021). This paper covers P-Tuning, the method with which GPT, which had been weak on NLU tasks, outperformed the BERT family. To make it easier to follow, the post also explains how natural language understanding research has developed so far.
[Paper Review] FEDNLP, Federated Distillation of Natural Language Understanding with Confident Sinkhorns
·
Artificial_Intelligence🤖/Natural Language Processing
Lin, Bill Yuchen, et al. "Fednlp: Benchmarking federated learning methods for natural language processing tasks." arXiv preprint arXiv:2104.08815 (2021). Abstract: Research is needed on privacy-preserving, decentralized learning methods for natural language processing (NLP) tasks. Despite the interest in studying FL methods for NLP tasks, the literature lacks systematic comparison and analysis. The paper presents FedNLP, a benchmarking framework for evaluating federated learning methods on four task formulations: text classification, sequence tagging, question answering, and seq2seq. Motivation: Despite progress in the FL field, research on and application to NLP..
[Paper Review] Federated split bert for heterogeneous text classification
·
Artificial_Intelligence🤖/Natural Language Processing
Li, Zhengyang, et al. "Federated split bert for heterogeneous text classification." 2022 International Joint Conference on Neural Networks (IJCNN). IEEE, 2022. Abstract: Several studies have examined BERT in federated learning settings, but the performance loss caused by heterogeneous (e.g., non-IID) data across clients has not yet been properly investigated. To address this, a framework that splits the BERT encoder layers into a local part and a global part in order to handle heterogeneous data and reduce communication cost..
To be uploaded Papers list
·
Artificial_Intelligence🤖/Natural Language Processing
Liu, Xiao, et al. "P-Tuning: Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks." Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2022. A paper on P-Tuning v2. It proposes a prompt structure that is optimized regardless of model scale or NLU task. It becomes applicable to classification as well as masked language modeling. It adopts a Deep Prompt Tuning structure. On every layer, Continuous promp..
Recent Natural Language Paper Flows
·
Artificial_Intelligence🤖/Natural Language Processing
A flowchart of natural language papers. Natural language processing gradually moved from classical machine learning into deep learning with the arrival of RNNs. Because recurrent neural networks (RNNs) suffer from the vanishing gradient problem, LSTM, which adds a memory cell, was introduced, and GRU came into use after it. Once the Transformer appeared it beat everything that came before, so most NLP research since then has been built on Transformer models. Attention Is All You Need: Vaswani, Ashish, et al. "Attention is all you need." Advances in neural information processing systems 30 (2017). Introduces the attention technique → the Transformer model arrives. The start of a legend in the NLP field ..
[Paper Implementation] Self-Explaining Structures Improve NLP Models
·
Artificial_Intelligence🤖/Natural Language Processing
Sun, Zijun, et al. "Self-explaining structures improve nlp models." arXiv preprint arXiv:2012.01786 (2020). This paper currently ranks near the top for natural language inference. I should have sensed something was off when I first saw that it was a pre-print, but many competition entries used its technique and scored well, so I assumed it was a good method. I should not have ignored the fact that it was never accepted at a conference.. When I implemented it exactly as the paper describes, it performed worse than the base model. I do not know how many weeks I lost to this technique.. Fuming.. I cannot tell whether I made a mistake or the paper overstated its results, but more than a few things look suspicious. That is enough introduction for now. Abstract 1...
[Paper Review] Are Prompt-Based Models Clueless?
·
Artificial_Intelligence🤖/Natural Language Processing
Kavumba, Pride, Ryo Takahashi, and Yusuke Oda. "Are Prompt-based Models Clueless?." arXiv preprint arXiv:2205.09295 (2022). A paper presented at ACL in May 2022. It analyzes whether prompt-based models also rely on superficial cues and, if they do, whether exploiting those cues degrades generalization performance. (Reference) DiceLab SangHun Im
[DACON] DACON Shopping Mall Review Rating Classification Competition
·
Artificial_Intelligence🤖/Natural Language Processing
While I was researching natural language inference, an ongoing experiment stalled, so I looked for NLP competitions to clear my head. I found two, the XNLI competition running on Kaggle and the rating classification competition hosted by DACON in Korea, and tried them. When I found this competition the deadline was one day away (D-1), so I could not try other techniques and entered with only a base model that I could submit quickly. About 549 people took part, and the first-place score was 0.71312. My score, submitted after a single day of work, was 0.68888, an accuracy gap of 0.02424 (about 2.4 percentage points) from first place. It was a basic-level competition and I did not even reach the top 10, which was a reality check, but I still found it worthwhile that I could submit so quickly. At first, I built a custom Korean tokenizer and used..
Studying natural language models from the beginning
·
Artificial_Intelligence🤖/Natural Language Processing
Study notes written to check whether I understand natural language models. 1. Simple neural network 2. RNN 3. LSTM 4. GRU 5. Seq2Seq (Sequence to Sequence) 6. Attention Mechanism 7. Teacher Forcing 8. Beam Search Algorithm 9. Transformer (Encoder, Decoder) 10. BERT 11. RoBERTa 12. ALBERT 13. Embedding / Encoding 14. Knowledge Distillation 15. Self-Explaining 16. Sentence BERT
Marking My Undergraduate Graduation: Revisiting the Concepts
·
Artificial_Intelligence🤖/Natural Language Processing
While browsing Instagram, I saw that a company called VoyagerX had posted technical questions on AI and programming, so I worked through them as a refresher, relying on what I remember from my studies so far. Programming intern questions (programming questions: 15). What is the difference between PNG and JPG? PNG → lossless compression (original preserved). JPG → lossy compression (original degraded). JPEG → degrades the original only to a degree people do not notice (an algorithm that maximizes the compression effect). What is Dynamic Programming? Solving a large problem by dividing it into smaller problems. When a complex problem comes up, it is split into several subproblems; the approach is used for problems that have overlapping subproblems and optimal substructure. Overlapping subproblems → a problem that can be broken into several subproblems (e.g., finding the Nth Fibonacci number → the (N-1)th / N-2..
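The Fibonacci example mentioned in the excerpt above can be sketched in Python as follows. This is only an illustration of overlapping subproblems, not code from the original post; the function names `fib_memo` and `fib_bottom_up` are hypothetical, and the sequence is assumed to start at F(0) = 0, F(1) = 1.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib_memo(n: int) -> int:
    """Top-down DP: each subproblem F(k) is computed once and then cached."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < 2:
        return n  # base cases: F(0) = 0, F(1) = 1 (assumed convention)
    # F(n) reuses the answers to the overlapping subproblems F(n-1) and F(n-2).
    return fib_memo(n - 1) + fib_memo(n - 2)


def fib_bottom_up(n: int) -> int:
    """Bottom-up DP: build from the smallest subproblems, keeping two values."""
    if n < 0:
        raise ValueError("n must be non-negative")
    prev, curr = 0, 1  # F(0), F(1)
    for _ in range(n):
        prev, curr = curr, prev + curr
    return prev


if __name__ == "__main__":
    print(fib_memo(10), fib_bottom_up(10))
```

Without the cache, the recursive version would recompute the same F(k) many times; storing each result once is what makes it dynamic programming rather than plain recursion.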
Liky
Posts in the 'Artificial_Intelligence🤖/Natural Language Processing' category (Page 2)