Studying natural language models from the beginning
·
Artificial_Intelligence🤖/Natural Language Processing
Study notes for learning natural language models and checking that I understand them. 1. Simple neural network 2. RNN 3. LSTM 4. GRU 5. Seq2Seq (Sequence to Sequence) 6. Attention Mechanism 7. Teacher Forcing 8. Beam Search Algorithm 9. Transformer (Encoder, Decoder) 10. BERT 11. RoBERTa 12. ALBERT 13. Embedding / Encoding 14. Knowledge Distillation 15. Self-Explaining 16. Sentence BERT
Revisiting the Concepts to Mark My Undergraduate Graduation
·
Artificial_Intelligence🤖/Natural Language Processing
While browsing Instagram, I saw that a company called VoyagerX had posted technical questions on AI and programming, so I worked through them from memory as a refresher. Programming intern questions (15 programming questions). What is the difference between PNG and JPG? PNG → lossless compression (the original is not degraded). JPG (JPEG) → lossy compression (the original is degraded), but only to a degree people barely notice, which is how the algorithm maximizes compression. What is Dynamic Programming? It solves a large problem by splitting it into smaller ones. A complex problem is broken into several subproblems, and the technique applies when the problem has overlapping subproblems and optimal substructure. Overlapping subproblems → the problem splits into subproblems that recur (e.g. finding the N-th Fibonacci number → the (N-1)-th / (N-2)..
[Paper Review] It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
·
Artificial_Intelligence🤖/Natural Language Processing
Schick, Timo, and Hinrich Schütze. "Exploiting Cloze-Questions for Few-Shot Text Classification and Natural Language Inference." Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. 2021. Schick, Timo, and Hinrich Schütze. "It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners." Proceedings of the 20..
Text Similarity, Semantic Similarity
·
Artificial_Intelligence🤖/Natural Language Processing
Text similarity: Cosine Similarity -> the cosine of the angle between two vectors; Euclidean Similarity -> the distance between two points = L2 distance; Manhattan Similarity -> the shortest path along a rectangular grid = L1 distance; Jaccard Similarity -> computed from the sizes of the intersection and the union. These techniques indicate how similar two given sentences are. Below, the Sentences input has the form ["Hello World", "Hello Word"]. ### Cosine similarity ### def cos_performance(sentences) : tfidf_vectorizer = TfidfVecto..
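The four measures listed above can be sketched with the standard library alone. This is an illustration under stated assumptions: it uses raw word counts rather than the TF-IDF vectors the truncated snippet appears to build, and every function name below is hypothetical.

```python
import math
from collections import Counter


def to_vectors(sentences):
    """Turn two sentences into word-count vectors over a shared vocabulary."""
    a, b = (Counter(s.lower().split()) for s in sentences)
    vocab = sorted(set(a) | set(b))
    return [a[w] for w in vocab], [b[w] for w in vocab]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (0.0 if either vector is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def euclidean_distance(u, v):
    """L2 distance: straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def manhattan_distance(u, v):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(u, v))


def jaccard_similarity(sentences):
    """Size of the word-set intersection divided by the size of the union."""
    a, b = (set(s.lower().split()) for s in sentences)
    return len(a & b) / len(a | b) if a | b else 0.0


sentences = ["Hello World", "Hello Word"]
u, v = to_vectors(sentences)
```

For this input the two sentences share one of three distinct words, so the cosine similarity is about 0.5 and the Jaccard similarity is 1/3. Note that the Euclidean and Manhattan values are distances, so smaller means more similar.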
Learning Rate Scheduler
·
Artificial_Intelligence🤖/etc
The learning rate is the step size by which the model's weights are updated along the gradient during backpropagation. During backpropagation, the model's weights are updated to reduce the error estimated by the loss function. learning rate * estimated weight error (the gradient with respect to the weight, or the change in total error) >>>> weight update. The learning rate controls the size of the change the optimizer makes as it moves toward the minimum of the loss function. Because the learning rate affects performance, setting it badly can prevent the model from learning at all, so choosing it is a very important part of model training. With a large learning rate the algorithm learns quickly. ..
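The update rule described above (new weight = weight minus learning rate times gradient) can be illustrated with a toy example. The loss f(w) = (w - 3)^2, the function name `gradient_descent`, and the specific learning rates are assumptions chosen for this sketch, not values from the post.

```python
def gradient_descent(lr: float, steps: int = 50, w: float = 0.0) -> float:
    """Minimize f(w) = (w - 3)^2 with plain gradient descent and return w."""
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)   # derivative of the loss with respect to w
        w = w - lr * grad        # weight update: step size is lr * gradient
    return w


w_small = gradient_descent(lr=0.001)  # too small: still far from the minimum
w_good = gradient_descent(lr=0.1)     # reasonable: ends close to w = 3
w_large = gradient_descent(lr=1.1)    # too large: overshoots and diverges
```

With the same number of steps, the small rate barely moves away from the starting point, the moderate rate lands near the minimum at w = 3, and the large rate moves further from it on every step.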
A Sentence Similarity Transfer Learning Method for Inference of Semantic Relation
·
Artificial_Intelligence🤖/Research
A Sentence Similarity Transfer Learning Method for Inference of Semantic Relation. 2022 Korean Institute of Next Generation Computing conference. https://www.earticle.net/Article/A412361 As demand for text data analysis grows, so does the need for inference techniques that understand the semantic relations between sentences and provide the analysis information users ask for. To this end, ..
AI-Based Artwork Creation and Augmentation for Art Therapy Analysis
·
Artificial_Intelligence🤖/Research
AI-Based Artwork Creation and Augmentation for Art Therapy Analysis. 13th Workshop on Convergent and Smart Media System (CSMS), 2021.12.27
Disease Orange Dataset Classification
·
Artificial_Intelligence🤖/Computer Vision
Materials from a classification model I built previously.
Implementing a Confusion Matrix
·
Artificial_Intelligence🤖/etc
Accuracy is not the only metric for judging a model's performance. A confusion matrix is a table devised to evaluate the diagnostic, classification, discrimination, and prediction ability of an algorithm or model, and it is also called an error matrix. Performance metrics such as accuracy, precision, sensitivity (= recall), and F1 score can all be expressed through the confusion matrix. # confusion matrix import matplotlib.pyplot as plt my_data = [] y_pred_list = [] for data in prediction_list : for data2 in data : my_data.append(data2.item()) for data in la..
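The metrics named above follow directly from the four cells of a binary confusion matrix. The sketch below is a minimal illustration: the names `confusion_counts` and `metrics` and the sample labels are hypothetical and are not taken from the post's truncated code.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration only.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
result = metrics(y_true, y_pred)
```

Here the counts are TP = 3, FP = 1, FN = 1, TN = 3, so all four metrics come out to 0.75. On imbalanced data the metrics diverge, which is why accuracy alone can be misleading.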
How to use "Hugging Face" for NLP Task
·
Artificial_Intelligence🤖/etc
Hugging Face offers functionality similar to TensorFlow Hub. It hosts a wide range of Transformer-based models, including models already fine-tuned for individual tasks. Tokenizers are also implemented for every model, so data can be preprocessed conveniently to suit each one. On top of that, the datasets needed for training are stored there too, so users only have to fetch and use them. In short, with Hugging Face you don't have to implement the parts repeated in existing training scripts one by one, and you can code everything from data preparation and preprocessing to model training and producing results conveniently and efficiently. Hugging Face is a module that provides a variety of Transformer models and training scripts, ..
Liky
Posts in the 'Artificial_Intelligence🤖' category (Page 3)